Comparison of image annotation data generated by multiple investigators for benthic ecology

Multiple investigators often share the work of generating data from a single set of seabed images to reduce the time burden, particularly with the large photographic surveys now available to ecological studies. These data (annotations) are known to vary among investigators as a result of differences in opinion on specimen classification and of human factors such as fatigue and cognition. These variations are rarely recorded or quantified, nor are their impacts on derived ecological metrics (density, diversity, composition). We compared the annotations made by three investigators of 73 megafaunal morphotypes in ~28,000 images, including 650 images annotated by all three. Successful annotation was defined as both detecting and correctly classifying a specimen. Estimated specimen detection success was 77% and classification success was 95%, giving an annotation success rate of 73%. Specimen detection success varied substantially by morphotype (12–100%). Variation in the detection of common taxa resulted in significant differences in apparent faunal density and community composition among investigators. Such bias has the potential to produce spurious ecological interpretations if not appropriately controlled or accounted for. We recommend that photographic studies document the use of multiple annotators and quantify potential inter-investigator bias. Randomisation of the sampling unit (photograph or video clip) is critical to the effective removal of human annotation bias in multiple-annotator studies (and indeed in single-annotator studies).
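
The two quantitative points in the abstract can be illustrated with a minimal Python sketch. It assumes that classification success is measured on specimens that were detected, so that annotation success is the product of the two rates (0.77 × 0.95 ≈ 0.73), and it shows one simple way of randomising the sampling unit among annotators. The function names, the round-robin allocation and the seed are illustrative assumptions, not the procedure used in the study.

```python
import random


def annotation_success(detection, classification):
    """Rate at which a specimen is both detected and correctly classified.

    Assumes `classification` is the success rate among detected specimens.
    """
    return detection * classification


def randomise_assignment(image_ids, annotators, seed=0):
    """Shuffle the images, then deal them out to annotators in turn.

    Hypothetical allocation scheme: each annotator receives a random,
    near-equal share of the image set.
    """
    rng = random.Random(seed)  # fixed seed so the allocation is reproducible
    shuffled = list(image_ids)
    rng.shuffle(shuffled)
    return {
        image: annotators[i % len(annotators)]
        for i, image in enumerate(shuffled)
    }


# Rates reported in the abstract: 77% detection, 95% classification.
rate = annotation_success(0.77, 0.95)
print(f"annotation success = {rate:.2f}")

# Illustrative allocation of nine images among three annotators.
plan = randomise_assignment(range(9), ["A", "B", "C"], seed=1)
```

Because the images are shuffled before allocation, no annotator receives a contiguous block of the survey, so investigator differences are not confounded with spatial or temporal trends along the transect.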
