Automated Identification of Herbarium Specimens at Different Taxonomic Levels

An estimated 400,000 flowering plant species exist on Earth. Classifying all known species with automated image-based approaches will require plant image datasets considerably larger than those currently available. To this end, some authors have explored the use of herbarium sheet images. As plant datasets grow toward tens of thousands of classes, class imbalance becomes a hard problem: models become inaccurate for certain species, a difficulty compounded by intra- and inter-specific similarities. Additionally, automatic plant identification is intrinsically hierarchical. To tackle the problem of unbalanced datasets, we need ways to classify, and to calculate the model's loss, that take the taxonomy into account, for example by grouping species at higher taxon levels. In this research we compare several architectures for automatic plant identification that use the plant taxonomy to classify not only at the species level but also at higher levels, such as genus and family.
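
One possible way to make a loss taxonomy-aware, shown here only as a minimal sketch and not as the method compared in this work, is to roll species probabilities up to genus and family and sum a negative log-likelihood term at each level. The toy taxonomy, the function names `roll_up` and `hierarchical_loss`, and the equal level weights are all assumptions made for illustration.

```python
import math

# Hypothetical toy taxonomy (illustration only): species -> (genus, family).
TAXONOMY = {
    "Quercus robur": ("Quercus", "Fagaceae"),
    "Quercus alba": ("Quercus", "Fagaceae"),
    "Fagus sylvatica": ("Fagus", "Fagaceae"),
    "Rosa canina": ("Rosa", "Rosaceae"),
}


def roll_up(species_probs, level):
    """Sum species probabilities into genus (level=0) or family (level=1) totals."""
    totals = {}
    for species, p in species_probs.items():
        taxon = TAXONOMY[species][level]
        totals[taxon] = totals.get(taxon, 0.0) + p
    return totals


def hierarchical_loss(species_probs, true_species, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of negative log-likelihoods at species, genus and family level.

    The weights are an assumed hyperparameter, not a value from the paper.
    """
    genus, family = TAXONOMY[true_species]
    eps = 1e-12  # guards against log(0)
    losses = (
        -math.log(species_probs[true_species] + eps),
        -math.log(roll_up(species_probs, 0)[genus] + eps),
        -math.log(roll_up(species_probs, 1)[family] + eps),
    )
    return sum(w * l for w, l in zip(weights, losses))


# Example prediction over the four toy species (probabilities sum to 1).
probs = {
    "Quercus robur": 0.5,
    "Quercus alba": 0.3,
    "Fagus sylvatica": 0.1,
    "Rosa canina": 0.1,
}
loss_same_genus = hierarchical_loss(probs, "Quercus alba")
loss_other_family = hierarchical_loss(probs, "Rosa canina")
```

With this formulation, a prediction that confuses two species of the same genus is penalized less than one that lands in the wrong family, because the genus and family terms remain small when most of the probability mass stays inside the correct higher taxon.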
