Plant Classification Based on Gated Recurrent Unit

Classifying plants with a multi-organ approach is challenging because plant organs vary widely in shape and appearance. Although convolutional neural networks (CNNs) have produced promising solutions for plant classification, existing approaches do not consider the correspondence between different views captured of the same plant. Botanists, by contrast, usually observe a plant from several vantage points at once, studying it as a whole and analysing its individual organs in order to disambiguate species. Driven by this insight, we introduce a new framework for plant structural learning based on recurrent neural networks (RNNs). The approach supports classification from a varying number of plant views, each composed of one or more organs, by optimizing the dependencies between them. We also present qualitative results for our proposed models by visualizing the learned attention maps. To our knowledge, this is the first study to model such dependencies and to interpret the resulting neural network for plant classification. Finally, we show that our proposed method outperforms the conventional CNN approach on the PlantClef2015 benchmark. The source code and models are available at https://github.com/cs-chan/Deep-Plant.
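
To make the idea of classifying from a varying number of views concrete, the following is a minimal sketch of a gated recurrent unit (GRU) that consumes one feature vector per plant view and emits class probabilities after the last view. It is an illustration under stated assumptions, not the released Deep-Plant implementation: the names `init_params`, `gru_step`, and `classify_views` are hypothetical, the weights are random rather than trained, bias terms and the attention mechanism are omitted, and the short hand-written vectors stand in for CNN features of each view.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    # Plain matrix-vector product on nested lists.
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def init_params(input_dim, hidden_dim, num_classes, seed=0):
    # Hypothetical helper: small random weights, no training, no biases.
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

    params = {name: mat(hidden_dim, input_dim) for name in ("Wz", "Wr", "Wh")}
    params.update({name: mat(hidden_dim, hidden_dim) for name in ("Uz", "Ur", "Uh")})
    params["Wo"] = mat(num_classes, hidden_dim)
    return params


def gru_step(p, x, h):
    # Update gate z and reset gate r, both in (0, 1).
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
    # Candidate state uses the reset-gated previous state.
    gated_h = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b)
            for a, b in zip(matvec(p["Wh"], x), matvec(p["Uh"], gated_h))]
    # Interpolate between the previous state and the candidate.
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def classify_views(p, view_features):
    # Start from a zero state and fold in each view's features in turn.
    h = [0.0] * len(p["Uz"])
    for x in view_features:
        h = gru_step(p, x, h)
    # Softmax over the class scores computed from the final state.
    logits = matvec(p["Wo"], h)
    peak = max(logits)
    exps = [math.exp(score - peak) for score in logits]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    params = init_params(input_dim=4, hidden_dim=3, num_classes=5)
    one_view = [[0.2, -0.1, 0.5, 0.0]]
    three_views = one_view + [[0.1, 0.3, -0.2, 0.4], [0.0, 0.1, 0.1, -0.3]]
    print(classify_views(params, one_view))
    print(classify_views(params, three_views))
```

Because the hidden state is carried from one view to the next, the same parameters handle an observation with one view or with several, which is the property the abstract attributes to the recurrent formulation.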
