Infrared object classification with a hybrid optical convolution neural network.

Recent advancements in machine vision have enabled a wide range of applications, from image classification to autonomous driving. However, there is still a tension between the pursuit of higher-resolution training images, which require a detector array with more pixels on the front end, and the acquisition demands of embedded systems constrained by power, transmission bandwidth, and storage. In this paper, a multi-pixel hybrid optical convolutional neural network machine vision system was designed and validated to perform high-speed infrared object detection. The proposed system replicates the first convolution layer of a convolutional neural network, using a high-speed digital micro-mirror device to display the first-layer kernels at a resolution greater than that of the subsequent detector. Further convolutions are then carried out in software to perform the object recognition. An infrared vehicle dataset was used to validate the performance of the hybrid system through simulation. We also tested the system in hardware by performing infrared classification on toy vehicles to showcase the feasibility of such a design.
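
To make the division of labor concrete, the following is a minimal simulation sketch of the optical front end, not the authors' implementation. It assumes (these details are not given in the abstract) that each kernel is tiled across the micro-mirror array so that every detector pixel sees exactly one copy, and that a detector pixel simply sums the light passed by its block of mirrors. The function names and the 0/1 kernel values are hypothetical choices for illustration.

```python
def optical_first_layer(scene, kernel, detector_shape):
    """Simulate one kernel pattern measured by a coarse detector.

    scene:          high-resolution intensity image (list of rows).
    kernel:         k x k mirror pattern; 0/1 entries are an assumption
                    reflecting the on/off states of a micro-mirror.
    detector_shape: (rows, cols) of the low-resolution detector.

    Assumed geometry: the scene is exactly (rows*k) x (cols*k), and each
    detector pixel integrates one k x k block, i.e. a convolution with
    stride equal to the kernel size.
    """
    rows, cols = detector_shape
    k = len(kernel)
    if len(scene) != rows * k or any(len(r) != cols * k for r in scene):
        raise ValueError("scene must be (rows*k) x (cols*k)")

    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for i in range(k):
                for j in range(k):
                    # Light reaches the detector only where the mirror is "on".
                    total += scene[r * k + i][c * k + j] * kernel[i][j]
            out[r][c] = total
    return out


def first_layer_feature_maps(scene, kernels, detector_shape):
    """Display each kernel in turn and collect one feature map per kernel."""
    return [optical_first_layer(scene, kern, detector_shape) for kern in kernels]


if __name__ == "__main__":
    scene = [
        [1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16],
    ]
    kernels = [[[1, 0], [0, 1]], [[0, 1], [1, 0]]]
    for fmap in first_layer_feature_maps(scene, kernels, (2, 2)):
        print(fmap)
```

In this sketch the resulting low-resolution feature maps would be passed to the software layers of the network. Signed kernel weights are not modeled here; handling them would need an additional assumption about how negative values are measured.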
