Automated Diatom Classification (Part B): A Deep Learning Approach

Diatoms, a group of microscopic algae comprising many species, are useful indicators for water quality assessment, an active topic in applied biology. At the same time, deep learning, and convolutional neural networks (CNNs) in particular, has become a widely used technique for image classification across a variety of problems. This paper applies CNNs to diatom classification in order to determine whether they are suitable for the task. An extensive dataset was collected specifically for this study (80 types, 100 samples per type). The dataset covers different illumination conditions and was computationally augmented to more than 160,000 samples. CNNs were then applied to versions of the dataset pre-processed with different image processing techniques. An overall accuracy of 99% is obtained for the 80-class problem across different kinds of images (brightfield, normalized). The results were compared with previously presented classification techniques evaluated on different numbers of samples. As far as the authors know, this is the first time that CNNs have been applied to diatom classification.
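The dataset figures above imply an augmentation factor of about 20 (80 types × 100 samples = 8,000 originals, expanded to more than 160,000). The following is a minimal sketch of that arithmetic together with one common kind of geometric augmentation. The abstract does not state which transforms were used, so the rotation and flip set here, and the names `required_factor` and `dihedral_variants`, are illustrative assumptions rather than the authors' pipeline.

```python
import math

import numpy as np

# Figures taken from the abstract.
N_CLASSES = 80
SAMPLES_PER_CLASS = 100
TARGET_SIZE = 160_000


def required_factor(n_classes: int, per_class: int, target: int) -> int:
    """Smallest whole number of variants per original image needed to reach `target`."""
    original = n_classes * per_class
    return math.ceil(target / original)


def dihedral_variants(image: np.ndarray) -> list:
    """Return the 8 rotation/flip variants of a 2-D image.

    Assumption: these transforms are only an example of label-preserving
    augmentation; the abstract does not specify the ones actually applied.
    """
    variants = []
    for k in range(4):
        rotated = np.rot90(image, k)          # rotate by k * 90 degrees
        variants.append(rotated)
        variants.append(np.fliplr(rotated))   # mirrored copy of that rotation
    return variants


if __name__ == "__main__":
    factor = required_factor(N_CLASSES, SAMPLES_PER_CLASS, TARGET_SIZE)
    print("variants needed per image:", factor)

    sample = np.arange(6).reshape(2, 3)       # stand-in for a diatom image
    print("variants from rotations and flips:", len(dihedral_variants(sample)))
```

Rotations and flips alone give 8 variants per image, fewer than the roughly 20 the dataset sizes imply, so a pipeline reaching that size would need further transforms (for example, finer rotation angles).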
