

A pipeline for assessing the quality of images and metadata from crowd-sourced databases.
Jackie Billotte
<p style="text-align: justify;">Crowd-sourced biodiversity databases provide easy access to data and images for ecological education and research. One concern with using publicly sourced databases, however, is the quality of their images, taxonomic descriptions, and geographical metadata. The method presented in this paper attempts to address this concern using a suite of pipelines that evaluate taxonomic consistency, how well geo-tagging fits known distributions, and the image quality of data acquired from iNaturalist, a crowd-sourced biodiversity database. It also gives researchers who use these datasets a way to report a quantifiable assessment of taxonomic consistency. The pipeline allows users to analyze multiple images from iNaturalist and their associated metadata to determine: the level of taxonomic identification (family, genus, species) for each occurrence; whether the taxonomic label for an image matches the accepted nesting of families, genera, and species; and whether geo-tags match the distribution of the taxon described, using occurrence data from the Global Biodiversity Information Facility (GBIF) as a reference. Additionally, image quality is assessed using BRISQUE, an algorithm that evaluates image quality without a reference photo. Entries from the order Araneae (spiders) are used as a case study. Overall, the results suggest that iNaturalist can provide large metadata and image sets for research. Given the inevitability of some low-quality observations, this pipeline provides a valuable resource for researchers and educators to evaluate the quality of iNaturalist and other crowd-sourced data.</p>
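To illustrate the taxonomic-consistency step described in the abstract, here is a minimal Python sketch. It is not the authors' pipeline: the function names and the small `GENUS_TO_FAMILY` lookup are hypothetical, and a real check would draw the accepted nesting from a reference taxonomy rather than a hard-coded table. It reports the level to which an occurrence is identified and whether its family, genus, and species labels nest consistently.

```python
from typing import Dict, Optional

# Hypothetical reference table (genus -> accepted family), for illustration only.
GENUS_TO_FAMILY: Dict[str, str] = {
    "Araneus": "Araneidae",
    "Latrodectus": "Theridiidae",
}


def identification_level(family: Optional[str],
                         genus: Optional[str],
                         species: Optional[str]) -> str:
    """Return the finest taxonomic rank that is filled in for an occurrence."""
    if species:
        return "species"
    if genus:
        return "genus"
    if family:
        return "family"
    return "unidentified"


def nesting_is_consistent(family: Optional[str],
                          genus: Optional[str],
                          species: Optional[str],
                          genus_to_family: Dict[str, str] = GENUS_TO_FAMILY) -> bool:
    """Check that species sits inside genus and genus sits inside family."""
    if species:
        # The first word of a binomial name is the genus.
        genus_from_species = species.split()[0]
        if genus and genus != genus_from_species:
            return False
        genus = genus_from_species
    if genus:
        expected_family = genus_to_family.get(genus)
        if expected_family is None:
            # Genus missing from the reference table: cannot be confirmed.
            return False
        if family and family != expected_family:
            return False
    # A family-only (or empty) label has no nesting to violate.
    return True
```

For example, `nesting_is_consistent("Theridiidae", "Araneus", "Araneus diadematus")` returns `False` under this table, because the genus Araneus belongs to Araneidae. The share of occurrences that pass such a check is one possible way to express a quantifiable consistency score.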
biodiversity, iNaturalist, GBIF, metadata, pipeline, database, community science
None
Arachnids, Biodiversity, Biology, Conservation biology, Ecology, Insecta, Invertebrates
Sean Ryan, Thomas Sappington, Johanne Brunet, Bonnie Blaimer, Matthias Foellmer, Alessandro Cini, Chris Jiggins, Clive Hambler, George Roderick
2022-05-03 00:18:23
Matthias Foellmer