Open world plant image identification based on convolutional neural network

In this paper, we propose several enhancements to the well-known 16-layer VGG Convolutional Neural Network (CNN) model for open-world image classification, taking plant identification as an example. First, we replace the last pooling layer of the 16-layer VGG model with a Spatial Pyramid Pooling layer, enabling the model to accept input images of arbitrary size. Second, we replace the Rectified Linear Unit (ReLU) activation function with Parametric ReLU to increase the adaptability of parameter learning. In addition, we introduce the Unseen Category Query Identification algorithm to identify and omit images of unseen categories, thus preventing their false classification into predefined categories. Such an algorithm is essential in real-world use, since there is no guarantee that a given image falls into a predefined category. We evaluate on the dataset from the LifeCLEF 2016 plant identification task, compare our results with those of other participants, and demonstrate that our enhanced model with the proposed algorithm exhibits outstanding performance.
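
The three ideas named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation. The function names, the pyramid levels (1, 2, 4), and the 0.5 threshold are assumptions chosen for illustration. The abstract does not describe how the Unseen Category Query Identification algorithm works, so a simple maximum-softmax-probability threshold stands in for it here.

```python
import math


def prelu(x, a=0.25):
    """Parametric ReLU: identity for x > 0, slope `a` for x <= 0.

    In a real network `a` is a learned parameter; here it is a fixed
    argument (0.25 is a common initial value).
    """
    return x if x > 0 else a * x


def spp_max_pool(fmap, levels=(1, 2, 4)):
    """Spatial pyramid max-pooling of one 2-D feature map (list of rows).

    Each level n splits the map into an n x n grid and keeps the max of
    every cell, so the output length is sum(n * n for n in levels)
    regardless of the input height and width.
    """
    h, w = len(fmap), len(fmap[0])
    out = []
    for n in levels:
        for i in range(n):
            r0 = (i * h) // n                 # floor of bin start
            r1 = ((i + 1) * h + n - 1) // n   # ceil of bin end
            for j in range(n):
                c0 = (j * w) // n
                c1 = ((j + 1) * w + n - 1) // n
                out.append(max(fmap[r][c]
                               for r in range(r0, r1)
                               for c in range(c0, c1)))
    return out


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_or_reject(scores, threshold=0.5):
    """Return the predicted class index, or None for a rejected query.

    Assumption: rejection by maximum softmax probability is only a
    stand-in for the paper's unseen-category algorithm.
    """
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] >= threshold else None
```

With these levels, a 3x5 feature map and a 7x4 feature map both pool to 21 values. A fixed-length output of this kind is what lets fully connected layers follow a convolutional stack that was fed images of different sizes.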
