'Lack of data' is a phrase littered throughout the scientific literature across most disciplines. Testing a concept, controlling a disease or mapping the niche of a species are all considerably more difficult, and considerably more uncertain, where data are sparse.
Although a number of mosquito species of the genus Anopheles have been recognised as the transmitters of human malaria since the late 19th century, and despite a great deal of effort from numerous researchers ever since, there is still a dearth of data regarding the ecology, behaviour and sometimes even the identity of many of the world's most prominent malaria vectors. Indeed, previous attempts to map the location of some of these species using niche models have had to rely on as few as 12 occurrence points across a whole continent.
Mosquito vector data are difficult and expensive to collect. The most effective way to find out if an area has human-biting mosquitoes capable of transmitting malaria (or indeed any other mosquito-borne disease) is to allow them to attempt to bite a human. Other methods of capture, such as light traps or CO~2~-baited traps, do not always capture the same number or range of species. Some mosquitoes simply prefer to bite people (and this is what makes them so dangerous when they bite people who have malaria).
Species niche models aim to identify the niche of a species using information such as local climate, environment or vegetation from sites with known presence or absence of the species. Once the preferred niche has been identified, predictions can be made about where else the species may be present by identifying other locations where these niche conditions are found. In this way species niche maps can be created. The availability of satellite-derived covariate data, such as those describing vegetation cover (e.g. NDVI, the Normalised Difference Vegetation Index; EVI, the Enhanced Vegetation Index; and MIR, Middle Infra-red Radiation, which discriminates between different types of land cover) or other pertinent variables such as LST (Land Surface Temperature) or a DEM (Digital Elevation Model, i.e. altitude), together with global weather and land cover datasets, has provided a huge opportunity for producing wide-ranging niche maps. Yet the limiting factor is, as ever, a lack of occurrence data.
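To make the idea concrete, here is a minimal sketch of one of the simplest forms of niche model, a presence-only "environmental envelope". It is an illustration only, not the method used to produce any published map: the site names, covariate names (`lst_c`, `ndvi`, `elevation_m`) and all values are invented, and real niche models also weigh absence data, sampling bias and uncertainty.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """A sampled location with its covariate values (all values here are invented)."""
    name: str
    covariates: dict


def fit_envelope(presence_sites):
    """Return {covariate: (min, max)} across the sites where the species was found."""
    envelope = {}
    for site in presence_sites:
        for key, value in site.covariates.items():
            low, high = envelope.get(key, (value, value))
            envelope[key] = (min(low, value), max(high, value))
    return envelope


def is_suitable(envelope, covariates):
    """True if every covariate falls inside the range seen at presence sites."""
    return all(low <= covariates[key] <= high
               for key, (low, high) in envelope.items())


# Hypothetical presence records: land surface temperature, NDVI and elevation.
presence = [
    Site("A", {"lst_c": 26.0, "ndvi": 0.55, "elevation_m": 40.0}),
    Site("B", {"lst_c": 29.5, "ndvi": 0.70, "elevation_m": 310.0}),
    Site("C", {"lst_c": 27.8, "ndvi": 0.62, "elevation_m": 150.0}),
]
envelope = fit_envelope(presence)

# Hypothetical unsampled locations, described only by satellite-style covariates.
candidates = {
    "lowland_forest": {"lst_c": 28.0, "ndvi": 0.65, "elevation_m": 200.0},
    "highland": {"lst_c": 17.0, "ndvi": 0.60, "elevation_m": 1800.0},
}
predictions = {name: is_suitable(envelope, cov)
               for name, cov in candidates.items()}
print(predictions)
```

With only three presence points the envelope is very narrow, so a single extra record at a cooler or higher site would change the predicted range considerably. This is the practical consequence of sparse occurrence data.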
In an ideal world, the input data would be a comprehensive and widespread set of geo-referenced occurrences of fully identified species collected using comparable sampling methods and effort, all conducted during the same time period. In reality we mostly make do with small, flawed datasets that more accurately "[...] reflect the distribution of entomologists rather than mosquito species" and consequently give a high level of uncertainty in the maps produced.
Accurate location data, combined with direct measurements of the surrounding conditions (humidity, temperature, etc.) and of the surrounding vegetation, will allow niche models to distinguish specific niche characteristics more accurately. In doing so, these models will provide a greater understanding of the ecology of these insects and allow a more accurate estimate of the extent of their fundamental niche. Facilitating the collection of such data is the ultimate aim of HumBug.
Alongside the development of our sensor, we are updating our existing database of Asia-Pacific vector occurrence data. Although we collated the most comprehensive and complete dataset of published occurrence data for the 19 prime vector species and species complexes of the Asia-Pacific region, time constraints prevented us from disaggregating the species complex data into sibling species.
Sibling species are morphologically identical, closely related species whose behaviour can differ enough that one sibling is a prominent and dangerous disease vector whereas a second sibling, for example by maintaining a strong preference for feeding only on animals, remains a non-vector. Distinguishing between siblings when considering occurrence data is highly desirable. Yet doing so often requires the development of molecular markers and techniques, with a time commitment and cost not easily met by those trying to monitor local vector populations in disease risk areas.
We aim to update our existing database to include all available data on sibling species in the Asia-Pacific region, and to add the more recently published occurrences. In addition, we will be turning our attention to finding unpublished data sources to add to our dataset. We would therefore be very grateful to receive any data to include in our database, all of which is, and will remain, freely available.
 Sinka ME, Bangs MJ, Manguin S, Chareonviriyaphap T, Patil AP, Temperley WH, Gething PW, Elyazar IR, Kabaria CW, Harbach RE, Hay SI: The dominant Anopheles vectors of human malaria in the Asia-Pacific region: occurrence data, distribution maps and bionomic précis. Parasit Vectors 2011, 4:89.
 Coetzee M: Distribution of the African malaria vectors of the Anopheles gambiae complex. Am J Trop Med Hyg 2004, 70:103-104.