Paper Information - Using Deep Convolutional Neural Network Architectures for Object Classification and Detection Within X-Ray Baggage Security Imagery

We consider the use of deep convolutional neural networks (CNNs) with transfer learning for the image classification and detection problems posed within the context of X-ray baggage security imagery. The CNN approach requires large amounts of data to facilitate a complex end-to-end feature extraction and classification process. Within the context of X-ray security screening, the limited availability of examples of objects of interest can thus pose a problem. To overcome this issue, we employ a transfer learning paradigm such that a pre-trained CNN, primarily trained for generalized image classification tasks where sufficient training data exists, can be optimized explicitly as a later secondary process towards this application domain. To provide a consistent feature-space comparison between this approach and traditional feature-space representations, we also train a support vector machine (SVM) classifier on CNN features. We empirically show that fine-tuned CNN features yield superior performance to conventional hand-crafted features on object classification tasks within this context. Overall, we achieve 0.994 accuracy based on AlexNet features trained with an SVM classifier. In addition to classification, we also explore the applicability of multiple CNN-driven detection paradigms, such as sliding window-based CNN (SW-CNN), Faster region-based CNNs (F-RCNNs), region-based fully convolutional networks (R-FCN), and YOLOv2. We train numerous networks tackling both single and multiple detections over SW-CNN/F-RCNN/R-FCN/YOLOv2 variants. YOLOv2, Faster-RCNN, and R-FCN provide superior results to the more traditional SW-CNN approaches. With the use of YOLOv2, using input images of size $544\times 544$, we achieve 0.885 mean average precision (mAP) for a six-class object detection problem. The same approach with an input of size $416\times 416$ yields 0.974 mAP for the two-class firearm detection problem and requires approximately 100 ms per image. Overall, we illustrate the comparative performance of these techniques and show that object localization strategies cope well with cluttered X-ray security imagery, where classification techniques fail.
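Since the detection results above are reported as mean average precision (mAP), the following minimal sketch shows how such a figure is commonly computed. It assumes a Pascal VOC style all-point interpolated average precision (AP) per class; the abstract does not state the exact evaluation protocol, so this is an illustration and not the authors' evaluation code. The function names, the class labels, and the toy detections are hypothetical, and each detection is assumed to have already been marked as a true or false positive (for example, by an intersection-over-union threshold against the ground truth).

```python
def average_precision(scored_hits, num_ground_truth):
    """Average precision for one class.

    scored_hits: list of (confidence, is_true_positive) pairs, one per detection.
    num_ground_truth: number of annotated objects of this class.
    Assumption: true/false-positive matching was done beforehand.
    """
    if num_ground_truth == 0:
        return 0.0

    # Rank detections from most to least confident.
    ranked = sorted(scored_hits, key=lambda d: d[0], reverse=True)

    precisions, recalls = [], []
    true_positives = 0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_positives += 1
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_ground_truth)

    # Replace each precision by the maximum precision at any higher recall,
    # which makes the precision-recall curve monotonically non-increasing.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum the area under the resulting step curve.
    ap, previous_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


def mean_average_precision(per_class):
    """Mean of the per-class APs.

    per_class: dict mapping class name -> (scored_hits, num_ground_truth).
    """
    aps = [average_precision(hits, n) for hits, n in per_class.values()]
    return sum(aps) / len(aps) if aps else 0.0


# Hypothetical toy data: two classes with a handful of detections each.
toy = {
    "firearm": ([(0.9, True), (0.8, False), (0.7, True)], 2),
    "knife": ([(0.6, True)], 1),
}
print(round(mean_average_precision(toy), 3))
```

Under this definition, a six-class mAP is the unweighted mean of six per-class AP values, so a single weak class can lower the overall figure noticeably.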
