Abstract

Within conservation and ecology, volunteer participation has always been an important component of research. Within the past two decades, this use of volunteers in research has proliferated and evolved into “citizen science.” Technologies are evolving rapidly. Mobile phone technologies and the emergence and uptake of high-speed Web-capable smart phones with GPS and data upload capabilities can allow instant collection and transmission of data. This is frequently used within everyday life particularly on social networking sites. Embedded sensors allow researchers to validate GPS and image data and are now affordable and regularly used by citizens. With the “perfect storm” of technology, data upload, and social networks, citizen science represents a powerful tool. This paper establishes the current state of citizen science within scientific literature, examines underlying themes, explores further possibilities for utilising citizen science within ecology, biodiversity, and biology, and identifies possible directions for further research. The paper highlights (1) lack of trust in the scientific community about the reliability of citizen science data, (2) the move from standardised data collection methods to data mining available datasets, and (3) the blurring of the line between citizen science and citizen sensors and the need to further explore online social networks for data collection.

1. Introduction

Within conservation and ecology, volunteer participation has always been an important component of research [15]. Within the past two decades, the use of volunteers in research has begun to proliferate and evolve into the current form of “citizen science” [6, 7]. Citizen science, a term first coined by Irwin [7], describes a form of research collaboration or data gathering that is performed by untrained or “nonexpert” individuals, often members of the public, and is frequently thought of as a form of crowd-sourcing [1, 8–12].

Citizen science will usually incorporate an element of public education [2, 6, 13–15]. Silvertown [5] differentiated modern from historical forms of citizen science by the potential for it to be “available to all, not just a privileged few.” This has recently been demonstrated by the rapid development of mobile phone technologies, in particular the emergence and uptake of high-speed Web-capable smart phones with GPS data collection facilities and data upload capabilities [16]. This allows almost instant collection, transmission, and submission of data and provides researchers with a way to validate data (e.g., to verify the identification of an organism or the location through GPS locators) [10]. The availability of new technologies containing sensors could be argued to move citizen science into a new era whereby citizen scientists also become citizen “sensors.” High-quality data can be collected through the sensing capabilities of personal computing and communication technologies, making the user part of a more passive framework for data collection [17–19]. Some of the key strengths of citizen science projects lie in the ease and speed with which data can be gathered by a large number of individuals in a short time. Ordinarily, constraints such as money and time would make such studies unfeasible or impossible for an individual organisation [10, 15, 20]. Indeed, citizen science programmes are often more resilient to variations in financial support than other programmes [19, 21, 22].

With technological connectivity peaking, the ability to select virtual “field assistants” to help gather data is within easy reach; indeed, Irwin [7] said that citizen scientists can be considered the “world’s largest research team.” A further step is the potential for mining data, for ecological or biological research, from the huge quantities of data which are voluntarily uploaded onto personal social media accounts for the primary purpose of storage or sharing with friends. For example, there are over 26,000 images tagged with “manta ray” on Flickr (as of December 12th 2011), a species with stable patterning that can be individually identified [23]. Custom Application Programming Interfaces (APIs) could theoretically be used to identify these individuals and collect GPS data (where it is available). This would create vast quantities of ecological and spatial data that could be utilised in research which tracks individuals. However, despite this, citizen science projects are often limited to (1) informal education activities or outreach to promote understanding [1, 6, 14, 24, 25]; (2) natural resource monitoring to promote stewardship [26–28]; (3) the promotion of social activities and action [29, 30]; or (4) purely virtual projects that are entirely ICT-mediated with no physical attribute (e.g., classifying photographs) ([31, 32], see Table 1). Table 2 provides examples of citizen science projects alongside their primary goals.

Few scientific investigative projects in ecology or biology use these new technologies for data collection, and those that do often encounter difficulties in gaining robust data [5, 6]. Even fewer take advantage of the rapidly increasing and evolving capabilities of Web 2.0 and social networks such as Facebook (http://www.facebook.com/), Twitter (http://www.twitter.com/), and Flickr (http://www.flickr.com/), through which millions of people upload and share photographs and location data. Many citizen science studies also concentrate on data collected within a very rigid framework, closely resembling previous volunteer data collection but with paper forms replaced by online submission forms (examples include eBird, Project Budburst, What’s Invasive, and Neighbourhood Nestwatch Program). The possibility of using Web 2.0 and less rigid data collection techniques is relatively underexplored within scientific literature, and even more so for biological and ecological applications.

This paper will predominantly cover the uses of citizen science for ecology, biodiversity, and biological insights. However, it may touch on various interdisciplinary citizen science programs or concepts where it is felt that it will be beneficial and may bring together other approaches which may add value. The aim is to establish the current state of citizen science within scientific literature, examine main underlying themes, and explore the possibility of utilising an untapped resource and the benefits that this can hold for the scientific community. It will also attempt to identify possible directions for further research.

2. The Citizen Science Landscape

The ability for intense monitoring by expert individuals on any subject, ranging from individual or species distributions to tracking of invasive species, is severely limited by both logistical and financial constraints. There are simply not enough resources, whether in the form of time, personnel, or money, to establish large-scale datasets [6, 10, 33, 34]. Citizen science circumvents many of these problems and has proven effective in a number of research areas that can have difficulty gathering large datasets. The area in which it has had, and most probably will continue to have, the greatest impact and potential is the monitoring of ecology or biodiversity at large geographic scales (see Table 3 for examples). This is particularly so given the recent proliferation of built-in GPS technology and Web-capable features that many handheld devices, such as mobile phones and increasingly cameras, now offer in an affordable and widely available format [10, 11].

When monitoring for rare, unusual, or declining phenomena, the scale of a large workforce over a large area will increase rates of detection in comparison to a lone researcher on a strict rotation despite having greater expert knowledge [35]. Indeed in early 2006, the rare nine-spotted ladybird (Coccinella novemnotata) was rediscovered during a citizen science programme designed to educate the public in biodiversity and conservation. This nine-spotted ladybird was the first discovered in eastern North America in over fourteen years, and only the sixth in the whole of North America within 10 years [36].

Traditional citizen science or volunteer programs have resulted in some of the longest ecological temporal datasets that we can access, particularly in the field of ornithology. The Christmas Bird Count (CBC, http://birds.audubon.org/christmas-bird-count/) was launched in 1900 by the Audubon Society (in the US and Canada) and provides comprehensive long-term data trends for many species spanning over 100 years. The British Trust for Ornithology, founded in 1932, also regularly uses data collected by amateur birdwatchers; these data make up a very substantial part of the National Biodiversity Network (http://www.nbn.org.uk/), which contains over 31 million records. The data from these programmes have helped to inform conservation actions, for example, by providing information that allows environmental organisations to target conservation management at particular sites [37].

Citizen science programmes conducted in the last 10 years have successfully followed the spread of invasive species or diseases and the impacts of land use or climate change and have been instrumental in understanding distributions, ranges, and migration pathways (e.g., [38, 39]). Researchers at Cornell University, USA, have performed a large range of citizen science projects centred around avian species. Some of these projects have resulted in datasets that track the spread of conjunctivitis (Mycoplasma gallisepticum) in wild house finches (Carpodacus mexicanus) [40] and the impact of forest fragmentation on tanager populations and nesting success [41]. These efforts have led to a large database called eBird, where amateur birdwatchers can upload sightings. These citizen science data have become the basis of trends discovered through data mining and modelling techniques, which have led to further, more focussed studies (visit http://ebird.org/content/ebird/about/ebird-publications for more information and a full list of publications).

Datasets gathered for a specific purpose will often reveal unexpected phenomena or patterns, which then prompt further, more focussed studies. Many studies are available in scientific literature where data mining and model construction have resulted in the discovery of new patterns and processes in ecological systems (e.g., [42–44]). Howard and Davis [45, 46] have published a number of peer-reviewed papers based on data predominantly collected by citizen scientists, gathering useable scientific data on autumn migration flyways of monarch butterflies (Danaus plexippus). Citizen scientists record overnight roosts and report their first spring sightings to assess spring recolonisation rates.

One of the common features of traditional and many current projects is the formal submission process, which occurs on a stand-alone Website or through one-to-one communication between researcher and citizen. The submitted data are often closed or inaccessible until a result is published, and even in citizen science programmes where data are shared, it is very difficult for the ordinary citizen to visualise them; this has been shown to have an impact on participation [6, 47]. eBird has gone to some lengths to overcome this. By creating an online database system which has many portals and visualisation techniques, it allows citizen scientists and researchers alike to explore the eBird database [12]. In April 2006, when the Website was upgraded to allow participants to explore their own and others’ data, the number of individuals submitting data nearly tripled [47]. Resources such as this require the citizen scientist to make an active effort to discover the project, find the Website, and input and retrieve data. By integrating data collection into social media and fully exploiting Web 2.0, the quality, geographical range, and quantity of data collected could potentially be significantly increased, and this is something that requires further research. However, despite the lack of financial cost associated with social media and Web 2.0, it is possible that the time and effort required might not make the process worthwhile given the amount of additional data gained.

3. Social Networks and Web 2.0

Web 2.0 is an ambiguous term with almost as many facets and conflicting opinions and definitions as the term citizen science; some even argue against the existence of Web 2.0 as a concept. However, for the purposes of this paper, Web 2.0 can be regarded as the socially connected and interactive internet which facilitates participatory data sharing and encourages user-generated content. This medium consists of blogs, podcasts, social networking sites, wikis, crowd-sourcing tools, and “cloud-based” group working environments. Web 2.0 has been expanded to a mobile computing context with the proliferation of new technologies such as smart phones, laptops, and tablet computers [48].

The most obvious use of Web 2.0, which researchers are beginning to exploit, is its power for marketing and advertising, expressing branding, recruiting and retaining citizen scientists, and sharing and collecting data with them [5, 49]. Delaney et al. [50] advocated the use of Web 2.0 capabilities for the ease of collecting and sharing data via new cloud technologies. They suggest that dynamic linked databases that use online mapping technology such as Google Earth (free and familiar to citizen scientists) would prove ideal for creating a complete graphical “global” database of species. This would likely increase engagement and retention of individuals as they watch their contributions become part of the “bigger picture.” In essence, social media is being adopted as part of the communication strategy for engaging individuals who collect data or participate in virtual citizen science programs; this adoption is seemingly in line with that of organisations at large, which use it to promote products or engage audiences. This paper will not examine these factors in depth, as the topic is far too large to cover appropriately (for further information, see [51–53]). Society at large is beginning to understand the increased power of “the social network effect” behind Web 2.0, which increases value to existing users in a feedback loop (e.g., more and more users begin to embrace a service, increasing its popularity and resulting in rapidly increasing adoption) [54–56].

Figure 1 shows a brief diagram of citizen science. In addition to running programs of research that encourage users to engage in a more traditional data submission process, there is also the underexplored option of mining data from social networks and taking a more opportunistic approach. Indeed, many images, especially those taken on mobile phones, contain GPS information and can readily be searched and mapped via the integrated search facilities on Websites. The mobile interface allows the mobile phone to become a people-centric sensor which is capable of aggregating inputs from local surroundings, enabling data to be collected at a higher resolution [57]. This may be useful in plotting distributions and migration patterns or movements of both individuals and species. Indeed, large charismatic species with stable patterning such as whales, sharks, rays, and big cats are photographed regularly by tourists and shared online, and the ability to collate and analyse these images could prove valuable to the study of their movement, social grouping, and ultimately conservation.

An emerging and particularly promising but underdeveloped area of citizen science is the use of online social networking sites such as Facebook and Twitter and mobile social networks such as Foursquare. Many of these have integrated image and location data upload facilities. Indeed, throughout 2011 there has been a proliferation of these facilities across popular social networking Websites. These features have been incorporated into basic interfaces, enabling users to capture images, GPS tag them, add comments, and post them to followers or friends instantly via mobile internet.

Since its advent in 2004, Facebook (http://www.facebook.com/), the most popular social networking site, has grown to more than 800 million active users globally, with, on average, more than 250 million photographs uploaded every day. More than 350 million people access it through a mobile phone [58]. Research by the commercial online marketing and data collection agency comScore Media Metrix suggested that Facebook reached 73% of Americans in June 2011 [59]. With Flickr, the story is similar; Yahoo! announced in August 2011 that it had reached 51 million users and had, on average, 4.5 million photos uploaded every day. On February 28th 2012 it had 176,605,443 geo-tagged photographs in total. With an integrated approach and the correct marketing and publicity, in addition to the increase in GPS-capable mobile devices, it is likely that Flickr will become increasingly useful for gathering data, particularly for charismatic species.

The potential for scientific research is immense, particularly for image-based data collection where EXIF information can be mined using a custom API and identification can be verified by trained individuals or automatic recognition software [60–63]. Figure 2 shows a small example of what Flickr can do with a simple search term “monarch butterflies butterfly” (signifying a search for either butterflies or butterfly), which pulls up 13,329 geo-tagged photos within the US (February 28th 2012), 250 of which it can plot on a map on the Flickr Website. The map appears to bear a cursory resemblance to Howard and Davis’s [45] map of monarch butterfly migration roosts (created using data from a citizen science program called Journey North, which relies on a more traditional data submission process, albeit via an online form: http://www.learner.org/jnorth/). Using a custom API and transposing all results onto Google Maps or other mapping software, it would be possible to limit the geo-tagged photo search by date and compare it directly with Journey North’s monarch butterfly monitoring program, which has received 4078 sightings within the last year. However, without creating an API, a simple search on Flickr’s advanced search facility with the search term “monarch butterfly” brought up 15,499 photographs within the same time period (using data collected on February 28th 2012). Although it is likely that a large proportion contains no useful information (i.e., not pictures of the target species) and/or is not geo-tagged (although estimates show >40% may be geo-tagged [64]), this suggests that if this method of data collection were further explored, the number of potentially useful monarch butterfly sightings could be greatly increased.
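
The date-limited comparison described above can be illustrated with a minimal sketch. It assumes that photo metadata has already been retrieved by some means and is held as plain records; the `PhotoRecord` fields, the sample values, and the function name `geotagged_in_window` are hypothetical and do not reflect Flickr's actual API or data.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class PhotoRecord:
    """Hypothetical metadata record for one shared photograph."""
    photo_id: str
    taken_on: date
    latitude: Optional[float]   # None when the photo carries no geotag
    longitude: Optional[float]


def geotagged_in_window(photos: List[PhotoRecord], start: date, end: date) -> List[PhotoRecord]:
    """Keep photos taken within [start, end] that carry both coordinates."""
    return [
        p for p in photos
        if start <= p.taken_on <= end
        and p.latitude is not None
        and p.longitude is not None
    ]


# Illustrative records only; these are not real observations.
sample = [
    PhotoRecord("a", date(2011, 9, 14), 38.9, -94.6),
    PhotoRecord("b", date(2011, 10, 2), None, None),    # no geotag
    PhotoRecord("c", date(2010, 9, 20), 30.3, -97.7),   # outside the window
    PhotoRecord("d", date(2011, 10, 21), 29.4, -98.5),
]

usable = geotagged_in_window(sample, date(2011, 3, 1), date(2012, 2, 28))
print([p.photo_id for p in usable])
```

Records that pass this filter would still need species identification to be confirmed before they could be compared with a structured dataset such as Journey North's.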

Currently, a general internet user’s image and location uploads are predominantly limited to “events” that the user wants to share; this might be “checking in” to restaurants, attractions, clubs, cinemas, or concerts, often reviewing products, or sharing visual experiences [65, 66]. Sharing these data with another user can be as simple as tagging them [66]. By exploiting social networks in this way, for ecological or biological research, many of the most common mistakes or inaccuracies that are found within volunteered data could be minimised. For example, by sharing images together with temporal and GPS data, misidentifications and location inaccuracies can be flagged and checked by trained individuals [5, 37, 64, 67, 68]. Despite this, at the time of writing there are very few examples of social networking sites being used actively to collect data for biological or ecological research; this may be because of confusion over copyright laws or limitations of API systems. Those few that do exist are limited to self-contained “groups” within Flickr which search for images of individual animals to export to an external catalogue for identification or which use the site to advertise the program and attract new submissions (Table 4).

Despite this ability to gather data quickly, social networks are currently underutilised for ecological or biodiversity data collection. BeeID is a program of research which used Flickr as a base for data collection [64, 67]. Researchers asked individuals to tag photographs of bees with specific searchable metatags and to add location data if it was not already embedded. Trained individuals then confirmed species identification and marked the images as processed via the addition of a new tag. A simple custom API extracted tagged photographs from Flickr and collected the data, which successfully plotted bee species distributions. Considering that the project had no funding and was run by a small group of individuals with limited promotion other than on social networking sites, its success demonstrates the potential benefits of using social networking for collection of scientific data. Furthermore, the study took place before the recent integration of easily accessible location data in social networks and the continued rise in ownership of smart phones and affordable GPS- and wifi-enabled cameras.
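
The tag-based workflow described for BeeID can be sketched as follows. This is an illustration under stated assumptions rather than the project's actual code: the tag names (`beeid`, `beeid:processed`, the `species:` prefix), the record layout, and the sample coordinates are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical tag names; the tags actually used by BeeID are not given in the text.
PROJECT_TAG = "beeid"
PROCESSED_TAG = "beeid:processed"
SPECIES_PREFIX = "species:"


def species_locations(photos: List[dict]) -> Dict[str, List[Tuple[float, float]]]:
    """Group coordinates of expert-verified, geotagged project photos by species tag."""
    points = defaultdict(list)
    for photo in photos:
        tags = set(photo["tags"])
        if PROJECT_TAG not in tags or PROCESSED_TAG not in tags:
            continue  # not part of the project, or not yet checked by a trained individual
        if photo.get("lat") is None or photo.get("lon") is None:
            continue  # no location data to plot
        for tag in tags:
            if tag.startswith(SPECIES_PREFIX):
                points[tag[len(SPECIES_PREFIX):]].append((photo["lat"], photo["lon"]))
    return dict(points)


# Illustrative records only.
photos = [
    {"tags": ["beeid", "beeid:processed", "species:bombus_terrestris"], "lat": 51.5, "lon": -0.1},
    {"tags": ["beeid", "species:apis_mellifera"], "lat": 52.2, "lon": 0.1},          # awaiting verification
    {"tags": ["beeid", "beeid:processed", "species:apis_mellifera"], "lat": None, "lon": None},
    {"tags": ["beeid", "beeid:processed", "species:bombus_terrestris"], "lat": 53.4, "lon": -2.2},
]

distribution = species_locations(photos)
print(distribution)
```

Using the "processed" tag as a gate means that only records checked by a trained individual reach the distribution map, which mirrors the verification step described above.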

Another facet of Web 2.0 is the very recent addition of phone applications or “apps.” These are easily integrated and simple to use; however, the release of a mobile application is not enough on its own to motivate participants, and it is important to use mobile applications as part of a holistic approach [12]. “What’s Invasive” is a very recent citizen science programme which uses a combination of a Website and a custom mobile application to allow users to collect and submit information about invasive species from their mobile devices whilst observing them (http://whatsinvasive.com/). Project Noah is similar in that respect but is built primarily to engage and educate individuals in addition to collecting species data through a tagging and classification system. Project Noah also incorporates “missions” to increase motivation and promote the collection of specific species sightings (http://www.projectnoah.org/).

A recently developed formatting language, Hypertext Markup Language 5 (HTML5), allows easier development across platforms and allows many of the features of mobile phone applications to be incorporated into Websites. Web pages can then be developed to contain full multimedia content that is easily accessible to popular technologies, something which some smart phones have found problematic due to limited Flash support (especially on Apple devices). In the past, this inability has limited some of the content available and increased the amount of work needed to replicate Web pages on smart-phones.

Undoubtedly, with the advent of Web 2.0 and the quickly developing technological breakthroughs, citizen science programs exploiting this technology are likely to increase exponentially in future years and should be encouraged. It is hoped that as the full potential is revealed the negative bias among the scientific community that such approaches have attracted will begin to lessen. As the population increases and we are more isolated from nature and wildlife, the use of citizen science for biodiversity studies will enable individuals to be further engaged in decision-making processes and the championing and protection of the natural environment. It is a paradigm that is evolving alongside our relationship with technology, our environment and urban ecology and cannot be ignored [69].

4. Trust and Reliability

The reluctance of the scientific community seems to predominantly stem from a mistrust of citizen science datasets due to the lack of validity assessments in academic research and published literature [70, 71]. Although many recognise that citizen science has increased the amount of data that is available, it is a concern that the quality, reliability, and overall value of these data is still preventing its adoption in many research programmes [72]. Assurance of the quality of the data is needed through rigorous scientific methods in order to allow the acceptance of citizen science data into the scientific field [20].

The literature suggests that the reliability of inherently patchy data is the most questioned aspect of citizen science. Thus, by overcoming this mistrust, a huge untapped resource of citizen scientists could be opened up, increasing the scope and insight of the research conducted. Potentially, this could result in large standardised spatial and temporal datasets collected by citizen sensor networks [71]. Traditional solutions to gaining credibility are to provide reliable information or gain credentials such as qualifications; however, this works only when there are “gatekeepers” to filter information, something which is not possible with the internet on a global scale [73].

The dependability of volunteer-derived data is an old problem within biology and ecology, and therefore a number of methods to help increase the reliability of the information gathered have been developed [6, 22]. Firstly, researchers must ask the right questions in the right way, concisely and without jargon, to get the quality of answer that is needed, and instructions and processes must be clear and as simple as possible [3, 9–11]. Projects are usually kept relatively simple; for example, they might include counting a few common avian species frequenting a feeding table rather than searching for rare or difficult-to-spot species [6, 22, 74, 75]. Projects that require higher levels of skill can be successfully developed; however, they may require additional training or longevity of participation in order to increase experience. Indeed, many volunteer programs document “learner” effects whereby data collectors become more accurate over time [6, 10, 22, 76–80]. Some of the online citizen science programmes that Cornell University has run in the past incorporate short tests and quizzes which help in assessing a contributor’s knowledge. They have also implemented an automated meso-filter which evaluates data input against already known parameters; submissions which fall outside these parameters are flagged for expert review, the contributor is contacted, and the entry is either verified or disregarded [6, 10, 81].
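
The general idea of an automated filter that checks submissions against known parameters can be sketched as below. This is not Cornell's implementation: the `EXPECTED` table, its species, seasons, and count limits, and the function name `needs_review` are hypothetical values chosen purely for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical limits: species -> (months in which it is expected, plausible maximum count).
EXPECTED: Dict[str, Tuple[range, int]] = {
    "house finch": (range(1, 13), 50),
    "barn swallow": (range(4, 10), 30),
}


def needs_review(species: str, month: int, count: int) -> List[str]:
    """Return the reasons a submission should go to expert review (empty list if none)."""
    if species not in EXPECTED:
        return ["species not on checklist"]
    months, max_count = EXPECTED[species]
    reasons = []
    if month not in months:
        reasons.append("outside expected season")
    if count > max_count:
        reasons.append("count above plausible maximum")
    return reasons


print(needs_review("house finch", 1, 12))
print(needs_review("barn swallow", 12, 2))
print(needs_review("house finch", 6, 400))
```

A flagged submission is not rejected outright; as described above, it would be passed to an expert who contacts the contributor and then verifies or disregards the entry.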

Although there is not enough space to review all the literature which has been published as a result of data collected through citizen science participation, literature searching has located over 300 instances of peer-reviewed publications. This suggests that citizen science has produced, and will continue to produce, usable forms of data (see Figure 3). As with any data, datasets should be approached with caution and “cleaned” or “scrubbed” before performing analysis to remove any obvious outliers [82]. The literature suggests, however, that if the program protocols have been properly formed and tailored to the appropriate audience, the data do not often differ significantly from those collected by experts. Delaney et al. [50] found that, far from overlooking data collection methods, novices were “more careful” in their measurements and annotations because they were very aware of their novice status, and novices have been shown in many studies to yield similar results to experts [22, 50, 83]. Delaney et al. [50] found no statistically significant differences between experts and nonexperts; indeed, students were found to be between 80 and 95% accurate with identification, with significant predictors of accuracy being their age and level of education. Dickinson et al. [10] reported that during Project FeederWatch between 2008 and 2009 they received 1,342,633 observations; of those, 378 records required “flagging,” resulting in 158 records (54%) being confirmed, 45 identifications (16%) being corrected, and 88 reports (30%) being disregarded due to too little evidence.
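
As a quick arithmetic check on the Project FeederWatch figures quoted above, the sketch below computes the proportion of observations that were flagged and the share of each review outcome. The three outcome counts sum to 291 rather than 378, and the text does not say which denominator the quoted percentages use, so both are computed; the variable and function names are illustrative only.

```python
# Figures quoted in the text for Project FeederWatch, 2008 to 2009.
observations = 1_342_633
flagged = 378
outcomes = {"confirmed": 158, "corrected": 45, "disregarded": 88}

flag_rate_percent = 100 * flagged / observations
resolved = sum(outcomes.values())


def shares(denominator: int) -> dict:
    """Percentage of the denominator falling in each outcome, to one decimal place."""
    return {name: round(100 * n / denominator, 1) for name, n in outcomes.items()}


print(f"flagged: {flag_rate_percent:.3f}% of observations")
print(f"records with a stated outcome: {resolved}")
print("share of records with an outcome:", shares(resolved))
print("share of all flagged records:", shares(flagged))
```

Whichever denominator is used, the flagged records amount to a very small fraction of all submitted observations.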

Indeed, the very nature of gathering large sets of data results in decreased detrimental effects of “noise,” greater statistical power, and increased robustness, as statistical power is a function of sample size [22, 37]. Therefore, the common belief that volunteer-collected data can only provide noisy and unreliable results that lack precision is generally incorrect [22, 37]. LePage and Francis [74] compared two citizen science programs with similar data collection protocols to test whether population patterns and distributions were temporally and spatially consistent. The study successfully showed that the two citizen science-led studies, Christmas Bird Count and Project FeederWatch, had comparable trends and patterns across the same time periods, suggesting that the data were consistent and not significantly influenced by different methods and biases. The benefit of these larger datasets is that they allow researchers to draw broader conclusions across large spatial or temporal scales, enabling them to make inferences and robust cases for causation over larger areas and at a finer resolution, in contrast with small-scale studies which cannot be “generalised” over greater areas [3, 6, 9, 11, 38].

It is, however, important to recognise that these datasets can be compromised by the potential lack of precision, inherent biases, and uncertainties which are often present within these extensive studies [11, 22, 84]. For example, there may be more reports of species in areas that are highly populated by humans than in those that are sparsely populated, or more reports of species that are less cryptic than others. It is therefore a challenge to determine whether the data are correct or the reports are biased; this is the reason why many citizen science programs are so rigidly composed and use standardised protocols which are replicated across many stratified surveyed plots (see Table 5 and [11, 22, 84]). It is therefore important to ensure, in hypothesis-driven studies, that sampling design does not introduce bias and that counts are shaped by the data and not by the ability of the observer to detect or record data [85]. This is partially why using such count data to establish an index of abundance can be scientifically hazardous; however, by using capture-recapture algorithms, count data can be converted to actual population estimates and can therefore be used to draw valid conclusions [86, 87].
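
The text does not specify which capture-recapture algorithm is meant. As one illustration, a minimal sketch of the classic two-sample Lincoln-Petersen estimator and Chapman's bias-corrected variant is given below; the function names and example numbers are hypothetical, and real analyses of citizen science data would typically need models that account for unequal detection probability.

```python
def lincoln_petersen(marked_first: int, caught_second: int, recaptured: int) -> float:
    """Two-sample estimate N = M * C / R (undefined when nothing is recaptured)."""
    if recaptured <= 0:
        raise ValueError("at least one recapture is needed")
    return marked_first * caught_second / recaptured


def chapman(marked_first: int, caught_second: int, recaptured: int) -> float:
    """Chapman's variant, N = (M + 1)(C + 1) / (R + 1) - 1, defined even when R = 0."""
    return (marked_first + 1) * (caught_second + 1) / (recaptured + 1) - 1


# Hypothetical example: 50 individuals identified in a first survey,
# 40 in a second survey, 10 of which had also been seen in the first.
print(lincoln_petersen(50, 40, 10))
print(round(chapman(50, 40, 10), 1))
```

For species with stable patterning, "marking" could in principle be replaced by photographic identification, so that a resighting of a known individual counts as a recapture.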

Among the well-known and successful UK citizen science programmes are those based in the public’s back gardens. The British Trust for Ornithology’s (BTO’s) and Royal Society for Protection of Birds’ (RSPB’s) garden-based citizen science programmes have been very successful in collecting biodiversity data, particularly on avian species. The “Garden BirdWatch” and “Big Garden Weigh-In” run by the BTO and the “Big Garden Birdwatch” and “Make Your Nature Count” survey run by the RSPB are just a few of the citizen science programmes which encourage participants to record the species visiting their gardens. For a full list of citizen science projects run by these organisations visit the BTO (http://www.bto.org/) and RSPB (http://www.rspb.org.uk/) websites.

These programmes have a number of key design similarities which help standardise the survey and mitigate some of the perceived problems involved with nonexpert individuals collecting data. Indeed, they have proved to be reliable enough to result in published scientific papers. The Garden BirdWatch alone has resulted in 15 published scientific papers in addition to providing a strong set of baseline data (visit http://www.bto.org/volunteer-surveys/gbw/publications/papers for a full list of publications).

To prevent confounding seasonal variation and to ensure continuity of recording effort, citizen scientists are asked to record species within a given survey period; the Big Garden Weigh-In, for example, ran between May 31st and June 5th in 2012. To standardise effort, the records are gathered over a particular length of time (an hour is the most popular), and many of the surveys require the species to be physically within the garden (not in a neighbouring garden or flying over). The Garden BirdWatch asks observers to repeat this recording at the same time, from the same place, and over the same area for each recording session during the survey period.

Pseudoreplication is combated by removing any reliance on the observer’s ability to tell individual birds apart; this is achieved by recording the maximum number of individual birds of a species present at any one time within the garden. So if an observer sees one Blue Tit at the beginning of the survey, five in the middle, and two towards the end of the survey, they would report five Blue Tits.
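The counting rule described above can be sketched in a few lines. This is an illustrative sketch only; the function name and the list of sightings are hypothetical, and the surveys themselves are completed by hand or through web forms, not with code.

```python
def reported_count(simultaneous_counts):
    """Return the figure to report for one species in one survey session.

    simultaneous_counts holds the number of birds of that species seen
    together at each moment the observer looked. Taking the maximum,
    not the sum, avoids counting the same individual more than once.
    """
    return max(simultaneous_counts, default=0)


if __name__ == "__main__":
    blue_tit_sightings = [1, 5, 2]  # start, middle, and end of the hour
    print(reported_count(blue_tit_sightings))  # maximum seen at once
    print(sum(blue_tit_sightings))             # summing risks double counting
```

The maximum is a conservative lower bound on the number of individuals using the garden, whereas the sum could include the same bird several times.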

The species which are surveyed are also restricted to a range of easily identifiable species. The Big Garden Weigh-In reduces the number of birds under observation to 60 avian species which compose the core avian community. The Garden BirdWatch reduces the number further to the 42 most commonly recorded birds (nationally), with a further breakdown resulting in a list of the top ten for which further detail can be added. The Big Garden Birdwatch reduces it further still, providing a list of 20 of the more common species and asking observers to note incidental records of any other species they see on a separate sheet. The Garden BirdWatch goes one step further to collect additional data and provides a presence and absence record sheet for all species not mentioned.

The key difference between the RSPB and BTO’s citizen science programmes is the method of collection. The RSPB has no paper-based submission format, but the BTO does, with a scanning machine which automates the data retrieval and decoding from the paper-based forms. The BTO suggests that the “relative proportions of participants submitting returns on paper and online are similar.”

Neither of these programmes uses social networks for more than publicity. In 2012, the BTO began the Cuckoo Tracking project, whereby tagged Cuckoos were tracked during their migrations (http://www.bto.org/science/migration/tracking-studies/cuckoo-tracking). As part of the publicity, sightings were called for and the “hashtag” #heardacuckoo was created on the social network Twitter to publicise the project. Many individuals used the hashtag to report when they had indeed heard a cuckoo. If a tool such as CrowdMap (https://crowdmap.com/) were used to filter the tweets containing #heardacuckoo, and these were then verified by experts, could the conversion rate from publicity to actual record be higher?

5. The Shifting Paradigm: From “Knowledge-Driven” Analysis and Hypothesis Testing to “Data-Driven” Analysis

With the advent of the Web 2.0 world and the increase of the “citizen sensor network,” there is a shifting paradigm from “knowledge-driven” analysis created by hypothesis-driven research to “data-driven” analysis, moving studies into a more data-intensive area of science [44, 83]. This is resulting in a new synthesis of disciplinary areas as new methods of analysis emerge to explore and identify interesting patterns that may not already be apparent; this is particularly prevalent when looking at data gathered over large spatial and temporal scales [44, 88, 89]. This approach offers valuable insights, enabling further hypotheses to be formed about the underlying ecological processes. With such large datasets with such varying attributes, it is no wonder that all disciplines of science are seemingly beginning to merge into computer science, as it enables scientists in varying fields to better understand complex systems [83, 90–93]. In order to better utilise citizen science collected datasets that provide a wide range of data over long periods, many researchers are moving into intelligent analysis. This may involve using novel probabilistic machine-learning statistical analysis in the form of computational modelling, or methods of analysis which include Bayesian or neural networking methods [90, 91, 93, 94]. Indeed, Link et al. [89] utilised a hierarchical model and Bayesian analyses to account for variations in effort on counts and to provide summaries over large geographic areas for a complex dataset provided by the Christmas Bird Count in America. They successfully revealed regional patterns of population change, which were then shown to be similar to those in data from the Midwinter Waterfowl Inventory in the US [89].
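To give a sense of what accounting for effort in a Bayesian analysis of counts can mean, the following is a deliberately simplified sketch. It is not the hierarchical model of Link et al. [89]; it assumes a single site, counts that are Poisson distributed with a mean proportional to observer effort, and a conjugate Gamma prior on the per-unit-effort rate. The function name, prior values, and example data are all hypothetical.

```python
def posterior_rate(counts, efforts, prior_shape=1.0, prior_rate=1.0):
    """Gamma-Poisson update for a count rate per unit of observer effort.

    Assumed model: count_i ~ Poisson(rate * effort_i), rate ~ Gamma(a, b),
    with b as a rate parameter. The posterior is then
    Gamma(a + sum(counts), b + sum(efforts)).
    Returns the posterior shape, posterior rate parameter, and posterior mean.
    """
    if len(counts) != len(efforts):
        raise ValueError("counts and efforts must have the same length")
    shape = prior_shape + sum(counts)
    rate = prior_rate + sum(efforts)
    return shape, rate, shape / rate


if __name__ == "__main__":
    # Hypothetical data: birds counted and observer hours on three visits.
    counts = [12, 30, 9]
    hours = [2.0, 5.0, 1.5]
    shape, rate, mean = posterior_rate(counts, hours)
    print(f"posterior Gamma(shape={shape}, rate={rate}), mean {mean:.2f} per hour")
```

Dividing by total effort means that a visit with more observer hours does not inflate the estimated abundance simply because more time was spent looking; a hierarchical model extends this idea by letting rates vary between sites and years.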

Currently, databases of species information are often disjunct, outdated, and incomplete, and data recording methods are often not standardised across organisational databases, so that reconciling datasets from different sources for use in studies is often unreliable. This makes large-scale data collection a necessity for research and makes the use of more complex methods of data collection an ever-growing and in-demand area of study.

6. Conclusion

In our increasingly changing and evolving technological world, the presence of citizen scientists or citizen sensors who can contribute to science in more meaningful ways is allowing the rapid expansion of citizen science. Monitoring, anticipating, and mitigating large-scale threats to our biodiversity and natural world have also never been more prominent than they are now. In an increasingly urbanised world, successful monitoring of the environment is needed in the face of continuing climate and land-use change and the need to increase understanding of key ecological and environmental processes.

Citizen science and the exploitation of citizen science and sensor networks are probably among the most important factors in being able to achieve this. The data is out there, just waiting to be understood; almost every person in the developed world and beyond has the potential to contribute to our understanding in a meaningful way. With the rapid progression of technology, it is within our capabilities to begin this journey of understanding. It is, however, important to recognise the potential weaknesses that can result from poorly managing datasets and to pre-empt how the data is likely to be used and integrated beyond the original scope of the project.

It is also prudent to note something that many conservation organisations are realising: the need to interest new generations of naturalists and enthusiasts, as current recorders are an ageing group with limited recruitment. Exploiting new technologies to aid recruitment of a younger generation of recorders and naturalists, and to educate an increasingly urbanised population, will benefit all stakeholders.

If citizen science were commonplace, how much more scientific knowledge could we discover? And in this world where people are increasingly divorced from the natural environment, how much would this influence decision making, education, and scientific thinking?

Acknowledgment

Many thanks to the University of Gloucestershire for the studentship which is supporting the author in this work. Figure 2(b) reproduced with kind permission of Springer Science and Business Media [45].