Implementation of deep-learning algorithm for obstacle detection and collision avoidance for robotic harvester

Abstract: Convolutional neural networks (CNNs) are the current state of the art in image semantic segmentation (SS). However, because they incur a large computational cost, they are poorly suited to embedded devices such as those on rice combine harvesters. To detect and identify the surrounding environment of a rice combine harvester in real time, Network Slimming was applied to an image cascade network (ICNet) to reduce the model size. Network Slimming takes a wide neural network as the input model and yields a compact model (hereafter referred to as the “pruned model”) with comparable accuracy. It induces channel-level sparsity in the convolutional layers of the ICNet by imposing L1 regularization on the channel scaling factors of the corresponding batch normalization layers, and then removes the less informative feature channels to obtain a more compact model. Each pruned model was evaluated by mean intersection over union (IoU) on the test set. At a compaction ratio of 80%, the pruned model was 97.4% smaller in model size and ran 1.33 times faster than the original model, with comparable accuracy. For compaction ratios below 80%, pruning yielded a more efficient model (lower computational cost) with slightly reduced accuracy relative to the original model. Field tests were conducted with the pruned model (80% compaction ratio) to verify obstacle detection performance. The average success rate of collision avoidance was 96.6% at an average processing speed of 32.2 FPS (31.1 ms per frame) with an image size of 640 × 480 pixels on a Jetson Xavier. These results indicate that the pruned model can be used for obstacle detection and collision avoidance in robotic harvesters.
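The channel-selection step of Network Slimming and the mean IoU metric described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`l1_sparsity_penalty`, `slimming_masks`, `mean_iou`), the toy scaling factors, and the reading of "compaction ratio" as the fraction of channels removed across all layers are assumptions made for illustration. Plain Python lists stand in for the batch normalization scaling factors and for flattened label maps.

```python
def l1_sparsity_penalty(gammas, lam):
    """L1 term added to the training loss: lam * sum of |gamma| over all BN layers."""
    return lam * sum(abs(g) for layer in gammas for g in layer)


def slimming_masks(gammas, prune_ratio):
    """Return one keep-mask per layer by thresholding |gamma| globally.

    gammas: list of lists, the BN scaling factors of each convolutional layer.
    prune_ratio: fraction of all channels to remove (assumed meaning of
                 "compaction ratio"; 0 <= prune_ratio < 1).
    """
    magnitudes = sorted(abs(g) for layer in gammas for g in layer)
    n_prune = int(len(magnitudes) * prune_ratio)
    if n_prune == 0:
        return [[True] * len(layer) for layer in gammas]
    # Channels whose |gamma| is at or below this value are removed
    # (ties at the threshold are pruned as well).
    threshold = magnitudes[n_prune - 1]
    masks = []
    for layer in gammas:
        mask = [abs(g) > threshold for g in layer]
        if not any(mask):
            # Assumption: keep the strongest channel so no layer is emptied.
            best = max(range(len(layer)), key=lambda i: abs(layer[i]))
            mask[best] = True
        masks.append(mask)
    return masks


def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in pred or target (flat label lists)."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: two layers with 4 and 6 channels, half of all channels removed.
toy_gammas = [[0.9, 0.01, 0.5, 0.02], [0.03, 0.8, 0.04, 0.7, 0.05, 0.6]]
toy_masks = slimming_masks(toy_gammas, prune_ratio=0.5)
toy_miou = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

In a real pipeline the masks would be used to copy the surviving channels into a narrower network, which is then fine-tuned; that step depends on the framework and on the ICNet layer layout and is omitted here.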
