Compact Deep Neural Networks for Device-Based Image Classification

Convolutional Neural Network (CNN) is efficient in learning hierarchical features from large image datasets, but its model complexity and large memory footprints prevent it from being deployed to devices without a server back-end support. Modern CNNs are always trained on GPUs or even GPU clusters with high-speed computation capability due to the immense size of the network. A device-based deep learning CNN engine for image classification can be very useful for situations where server back end is either not available, or its communication link is weak and unreliable. Methods on regulating the size of the network, on the other hand, are rarely studied. In this chapter we present a novel compact architecture that minimizes the number and complexity of lower level filters in a CNN by separating the color information from the original image. A 9-patch histogram extractor is built to exploit the unused color information. A high-level classifier is then used to learn the features obtained from the compact CNN that was trained only on grayscale image with limited number of filters and the 9-patch histogram extracted from the color information in the image. We apply our compact architecture to Samsung Mobile Image Dataset for image classification. The proposed solution has a recognition accuracy on par with the state-of-the-art CNNs, while achieving significant reduction in model memory footprint. With these advantages, our system is being deployed to the mobile devices.

[1]  Yann LeCun,et al.  Convolutional neural networks applied to house numbers digit classification , 2012, Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012).

[2]  Xiang Zhang,et al.  OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks , 2013, ICLR.

[3]  Patrick Haffner,et al.  Support vector machines for histogram-based image classification , 1999, IEEE Trans. Neural Networks.

[4]  Li Deng,et al.  A deep convolutional neural network using heterogeneous pooling for trading acoustic invariance with phonetic confusion , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[5]  Lawrence D. Jackel,et al.  Handwritten Digit Recognition with a Back-Propagation Network , 1989, NIPS.

[6]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[7]  Yann LeCun,et al.  Regularization of Neural Networks using DropConnect , 2013, ICML.

[8]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[9]  Ramin Zabih,et al.  Histogram refinement for content-based image retrieval , 1996, Proceedings Third IEEE Workshop on Applications of Computer Vision. WACV'96.

[10]  Jürgen Schmidhuber,et al.  Multi-column deep neural networks for image classification , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[11]  Krista A. Ehinger,et al.  SUN database: Large-scale scene recognition from abbey to zoo , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[12]  Qiang Chen,et al.  Network In Network , 2013, ICLR.

[13]  Nitish Srivastava,et al.  Improving neural networks by preventing co-adaptation of feature detectors , 2012, ArXiv.

[14]  Honglak Lee,et al.  Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations , 2009, ICML '09.