On-Device Image Classification with Proxyless Neural Architecture Search and Quantization-Aware Fine-Tuning

Deploying deep learning models efficiently on resource-constrained hardware (e.g., mobile and IoT devices) under strict efficiency constraints (e.g., latency, energy consumption) is challenging. We employ Proxyless Neural Architecture Search (ProxylessNAS) to automatically design compact neural network architectures specialized for the target hardware platform. ProxylessNAS makes latency differentiable, so both accuracy and latency can be optimized by gradient descent. This direct optimization reduces the search cost by about 200x compared to conventional neural architecture search methods. We then apply quantization-aware fine-tuning to further improve efficiency. In the Low Power Image Recognition Competition at CVPR'19, our solution won 3rd place in the Real-Time Image Classification task (online track).
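The abstract does not spell out the search objective, so the following is only a minimal sketch of the core idea that latency can be made differentiable: each block mixes candidate ops weighted by softmaxed architecture parameters, its expected latency is a weighted sum of per-op latency estimates, and that expectation is added to the cross-entropy loss. The candidate ops, latency numbers, and `lambda_lat` weight are illustrative assumptions, and the full weighted sum over all paths is used here only for clarity (ProxylessNAS itself binarizes path weights and samples paths to keep memory low).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """One searchable block: a mixture of candidate ops weighted by architecture params."""
    def __init__(self, candidate_ops, candidate_latencies):
        super().__init__()
        self.ops = nn.ModuleList(candidate_ops)
        # Predicted/measured latency of each candidate on the target device (ms); illustrative values.
        self.register_buffer("latencies", torch.tensor(candidate_latencies))
        # Architecture parameters: one logit per candidate op.
        self.alpha = nn.Parameter(torch.zeros(len(candidate_ops)))

    def forward(self, x):
        probs = F.softmax(self.alpha, dim=0)
        # Weighted sum of candidate outputs (full relaxation, for clarity only).
        return sum(p * op(x) for p, op in zip(probs, self.ops))

    def expected_latency(self):
        # E[latency] = sum_i softmax(alpha)_i * latency_i, differentiable w.r.t. alpha.
        return (F.softmax(self.alpha, dim=0) * self.latencies).sum()

def total_loss(logits, targets, blocks, lambda_lat=0.1):
    """Cross-entropy plus a differentiable expected-latency penalty over all blocks."""
    ce = F.cross_entropy(logits, targets)
    exp_latency = sum(b.expected_latency() for b in blocks)
    return ce + lambda_lat * exp_latency

# Example: one block with three hypothetical candidates (latencies in ms are made up).
block = MixedOp(
    [nn.Conv2d(16, 16, 3, padding=1), nn.Conv2d(16, 16, 5, padding=2), nn.Identity()],
    candidate_latencies=[1.8, 3.1, 0.0],
)
x = torch.randn(2, 16, 32, 32)
out = block(x)
lat = block.expected_latency()  # scalar whose gradient flows into alpha
```

With this kind of objective, the architecture parameters receive gradients from both the task loss and the latency term, so candidates that are slow on the target device are penalized directly during search.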
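For the quantization-aware fine-tuning step no implementation details are given here; the sketch below is a generic eager-mode QAT recipe using PyTorch's built-in quantization utilities, with a toy stand-in network and the `fbgemm` backend as assumptions rather than the authors' actual pipeline.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Toy stand-in for the searched network (the real model comes from the NAS step)."""
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.conv = nn.Conv2d(3, 8, 3, padding=1)
        self.relu = nn.ReLU()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(8, 10)
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        x = self.quant(x)
        x = self.pool(self.relu(self.conv(x)))
        x = self.fc(torch.flatten(x, 1))
        return self.dequant(x)

model = TinyNet().train()
model.qconfig = torch.quantization.get_default_qat_qconfig("fbgemm")
torch.quantization.prepare_qat(model, inplace=True)   # insert fake-quantization modules

# Fine-tune with quantization simulated in the forward pass (one toy step shown).
opt = torch.optim.SGD(model.parameters(), lr=1e-3)
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
opt.step()

model.eval()
int8_model = torch.quantization.convert(model)        # fold fake-quant into real int8 kernels
```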
