Paper information - POSTER: Pairing Up CNNs for High Throughput Deep Learning

To facilitate the efficient execution of convolutional neural networks (CNNs) on cloud servers, this paper proposes Yin Yang (YY), an input-driven synergistic deep learning system that dynamically distributes CNN computation between a complex CNN (Yang) and a simple CNN (Yin). YY runs most inferences on Yin and invokes Yang only when Yin has low confidence. On average, compared to the traditional CNN-as-a-service approach, YY improves datacenter throughput by 1.8× and reduces inference latency by 31% on an NVIDIA TITAN X GPU without any accuracy loss across 21 CNNs.
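The dispatch rule described in the abstract (run Yin first, fall back to Yang when Yin is not confident) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how confidence is measured or which threshold is used, so the top softmax probability, the `threshold` value of 0.9, and the names `softmax` and `cascade_predict` are all assumptions made for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple

# A "model" here is any callable mapping an input to raw class scores (logits).
# This interface is hypothetical; real CNNs would be framework objects.
Model = Callable[[Sequence[float]], List[float]]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cascade_predict(
    x: Sequence[float],
    yin: Model,
    yang: Model,
    threshold: float = 0.9,  # assumed value, not taken from the paper
) -> Tuple[int, str]:
    """Return (predicted class, name of the model that produced it).

    Assumption: confidence is the top softmax probability of the simple model.
    """
    probs = softmax(yin(x))
    label = max(range(len(probs)), key=probs.__getitem__)
    if probs[label] >= threshold:
        # The simple model is confident enough, so the complex model is skipped.
        return label, "yin"

    # Low confidence: pay for the complex model on this input only.
    probs = softmax(yang(x))
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, "yang"
```

Under these assumptions, throughput improves in proportion to the fraction of inputs that Yin resolves on its own, since only the remaining inputs incur the cost of both networks. The threshold trades accuracy against that fraction.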

Scott A. Mahlke | Babak Zamirai | Salar Latifi
