Data Enrichment in Fine-Grained Classification of Aquatic Macroinvertebrates

The types and numbers of benthic macroinvertebrates found in a water body reflect its water quality, so macroinvertebrates are routinely monitored as part of freshwater ecological quality assessment. The collected samples are identified by human experts, which is costly and time-consuming; automated identification methods that could partially replace this human effort are therefore important. In this paper, we improve our group's earlier results on automated macroinvertebrate classification obtained using deep Convolutional Neural Networks (CNNs). We apply simple data enrichment prior to CNN training: by rotating and mirroring the original images, we create new images that increase the total size of the image database sixfold. We evaluate the effect of data enrichment on the Caffe and MatConvNet CNN implementations. The networks are either trained fully on the macroinvertebrate data or first pretrained on ImageNet images and then fine-tuned on the macroinvertebrate data. The results show a 3-6% improvement when the enriched data are used. This is an encouraging result because it significantly narrows the gap between automated techniques and human experts, while leaving room for future improvement: even the enriched data set, about 60000 images, is small compared to the data sizes typically required for efficient training of deep CNNs.
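
The enrichment step can be illustrated with a minimal sketch. The abstract states only that rotations and mirroring yield a sixfold increase; the specific combination used here (the original, three rotations by multiples of 90 degrees, and two mirror images) is an assumption chosen to give six variants per image, and the function names `enrich` and `enrich_dataset` are hypothetical.

```python
import numpy as np


def enrich(image):
    """Return six variants of one image: the original plus five transformed copies.

    ASSUMPTION: the exact transformations are not given in the text; this
    particular set (3 rotations + 2 mirrors) is only one way to reach sixfold.
    """
    return [
        image,                 # original
        np.rot90(image, k=1),  # rotated 90 degrees counterclockwise
        np.rot90(image, k=2),  # rotated 180 degrees
        np.rot90(image, k=3),  # rotated 270 degrees counterclockwise
        np.fliplr(image),      # mirrored left-right
        np.flipud(image),      # mirrored up-down
    ]


def enrich_dataset(images, labels):
    """Apply enrich() to every image; each variant keeps its source label."""
    out_images, out_labels = [], []
    for image, label in zip(images, labels):
        variants = enrich(image)
        out_images.extend(variants)
        out_labels.extend([label] * len(variants))
    return out_images, out_labels


if __name__ == "__main__":
    # Two synthetic RGB "images" stand in for real macroinvertebrate photos.
    rng = np.random.default_rng(0)
    images = [rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8) for _ in range(2)]
    labels = ["taxon_a", "taxon_b"]  # hypothetical class names
    big_images, big_labels = enrich_dataset(images, labels)
    print(len(images), "->", len(big_images))
```

Because the transformations only rearrange pixels, each variant can reuse the class label of its source image, so the enriched set can be passed to CNN training without further annotation.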
