Fast CNN surveillance pipeline for fine-grained vessel classification and detection in maritime scenarios

Deep convolutional neural networks (CNNs) have proven very effective on many vision benchmarks for object detection and classification. However, the computational complexity and object resolution requirements of CNNs limit their applicability in wide-view video surveillance settings where objects are small. This paper presents a CNN surveillance pipeline for vessel localization and classification in maritime video. The proposed pipeline is built upon the GPU implementation of Fast-R-CNN and has three main steps: (1) Vessel filtering and region proposal using low-cost weak object detectors based on hand-engineered features. (2) Deep CNN features of the candidate regions are computed with one feed-forward pass from the high-level layer of a fine-tuned VGG16 network. (3) Fine-grained classification is performed using the CNN features and a support vector machine classifier with a linear kernel for object verification. The performance of the proposed pipeline is compared with other popular CNN architectures with respect to detection accuracy and evaluation speed. On the new Annapolis Maritime Surveillance Dataset, the proposed approach achieved a mAP of 61.10%, comparable to Fast-R-CNN but with a 10× speed-up (on the order of Faster-R-CNN).
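The three steps can be sketched in a few lines of Python to show how they connect. This is only an illustrative outline under stated assumptions, not the authors' implementation: the names `classify_regions`, `toy_propose`, and `toy_extract`, the class labels, the weights, and the `min_score` threshold are all hypothetical. The proposal and feature functions are toy stand-ins for the weak detectors and the fine-tuned VGG16 forward pass, and step 3 is modeled as a one-vs-rest linear SVM decision function.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


def linear_svm_scores(feature: Sequence[float],
                      weights: Sequence[Sequence[float]],
                      biases: Sequence[float]) -> List[float]:
    """One-vs-rest linear SVM decision values: w_k . x + b_k for each class k."""
    return [sum(w_i * x_i for w_i, x_i in zip(w, feature)) + b
            for w, b in zip(weights, biases)]


def classify_regions(frame,
                     propose: Callable,
                     extract: Callable,
                     weights, biases, labels,
                     min_score: float = 0.0):
    """Run proposal -> feature extraction -> linear SVM on one frame."""
    detections = []
    for box in propose(frame):                 # step 1: low-cost candidate regions
        feature = extract(frame, box)          # step 2: feature vector per region
        scores = linear_svm_scores(feature, weights, biases)  # step 3
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] > min_score:           # verification: drop unconfident regions
            detections.append((box, labels[best], scores[best]))
    return detections


# --- Toy stand-ins (assumptions, not the paper's detectors or network) ---
def toy_propose(frame) -> List[Box]:
    # A real system would run weak detectors on hand-engineered features here.
    return [(0, 0, 2, 2), (2, 2, 4, 4)]


def toy_extract(frame, box: Box) -> List[float]:
    # A real system would read high-level CNN activations for the region here.
    x1, y1, x2, y2 = box
    pixels = [frame[y][x] for y in range(y1, y2) for x in range(x1, x2)]
    mean = sum(pixels) / len(pixels)
    return [mean, 1.0 - mean]


frame = [[1.0, 1.0, 0.0, 0.0],
         [1.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 0.5, 0.5],
         [0.0, 0.0, 0.5, 0.5]]
weights = [[1.0, -1.0], [-1.0, 1.0]]   # hypothetical per-class SVM weights
biases = [-0.5, -0.5]                  # hypothetical per-class SVM biases
labels = ["class_a", "class_b"]        # placeholder class names

detections = classify_regions(frame, toy_propose, toy_extract,
                              weights, biases, labels)
print(detections)
```

In this toy run the first region scores above the threshold for `class_a` and is kept, while the second region scores below the threshold for every class and is rejected, which is the role the abstract assigns to the SVM as an object verifier.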
