Detection of tomato organs based on convolutional neural network under the overlap and occlusion backgrounds

Traditional detection methods are not sensitive to small tomato organs (flowers and fruits), and immature green tomatoes are highly similar in color to the background. In addition, overlap among fruits and occlusion of tomato organs by stems and leaves can lead to false and missed detections, which reduces the accuracy and generalization ability of the model. Therefore, this paper proposes a tomato organ recognition method based on an improved Feature Pyramid Network. First, multi-scale feature fusion was used to combine detailed low-level features with high-level semantic features, which improves the recognition rate for small tomato organs. Second, repulsion loss was used to replace the original smooth L1 loss function. Third, Soft-NMS (soft non-maximum suppression) was adopted in place of non-maximum suppression to screen the bounding boxes of tomato organs. Together, these changes form the recognition model for key tomato organs. Finally, the network was trained and validated on a collected image data set. The results showed that, compared with the traditional Faster R-CNN model, performance improved substantially: mean average precision rose from 90.7% to 99.5%. In future work, the trained model can be compressed so that it can be embedded in a microcontroller, supporting the development of precise, targeted pesticide application systems for tomato organs and of automatic picking devices.
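To illustrate why Soft-NMS helps with overlapping fruits, the following is a minimal sketch of the Gaussian variant of Soft-NMS in plain Python. It is not the authors' implementation: the abstract does not state which Soft-NMS variant or parameters were used, so the Gaussian decay, the value `sigma=0.5`, the `score_threshold`, the `(x1, y1, x2, y2)` box format, and the function names are all assumptions made for illustration.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS (illustrative parameters, not taken from the paper).

    Returns a list of (box_index, final_score) pairs in selection order.
    Instead of deleting boxes that overlap the current best box, their
    scores are decayed by exp(-IoU^2 / sigma).
    """
    remaining = [(s, i) for i, s in enumerate(scores)]
    kept = []
    while remaining:
        # Select the remaining box with the highest current score.
        best_pos = max(range(len(remaining)), key=lambda k: remaining[k][0])
        best_score, best_idx = remaining.pop(best_pos)
        kept.append((best_idx, best_score))

        # Decay the scores of the other boxes according to their overlap.
        rescored = []
        for s, i in remaining:
            overlap = iou(boxes[best_idx], boxes[i])
            s = s * math.exp(-(overlap ** 2) / sigma)
            if s >= score_threshold:  # drop boxes whose score has collapsed
                rescored.append((s, i))
        remaining = rescored
    return kept


# Two heavily overlapping boxes (e.g. adjacent fruits) and one isolated box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
result = soft_nms(boxes, scores)
for idx, score in result:
    print(idx, round(score, 3))
```

In this toy input the second box overlaps the first with an IoU of about 0.68. Standard NMS with a typical threshold of 0.5 would discard it outright, whereas Soft-NMS keeps it with a reduced score, which is the behavior that can preserve detections of genuinely overlapping organs.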
