An Optimized Implementation of Speech Recognition Combining GPU with Deep Belief Network for IoT

With the advancement of the Internet of Things (IoT), speech recognition in mobile-terminal applications has become a new trend. Consequently, accelerating training and improving accuracy in speech recognition have attracted attention from both academia and industry. Deep Belief Networks (DBNs) accelerated by Graphics Processing Units (GPUs) are commonly applied in the acoustic model of speech recognition, yet critical research challenges remain: a single GPU often cannot store all of a DBN's parameters at once, the GPU's shared memory is not fully utilized, and parameter transmission becomes a bottleneck in multi-GPU systems. This paper presents a new method in which the weight matrix is divided into sub-weight matrices and a suitable memory model is established. To eliminate inefficient idle time during data transfers, a stream-processing model is proposed in which data transfer and kernel execution are performed concurrently. Further, the optimized single-GPU implementation is extended to multiple GPUs to address the parameter-transmission bottleneck. Experimental results show that the optimized GPU implementation runs without violating the size limitation of the GPU's memory.
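The core idea of dividing the weight matrix into sub-weight matrices can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual implementation: it assumes a row-wise split chosen so each chunk fits a fixed memory budget, mimicking how a GPU kernel would process one sub-matrix at a time instead of loading the full weight matrix.

```python
import numpy as np

def split_weight_matrix(W, budget_bytes):
    """Yield row-wise sub-weight matrices of W, each at most budget_bytes.

    Hypothetical helper: the paper's real partitioning scheme and memory
    model are not specified here, so a simple row split is assumed.
    """
    bytes_per_row = W.shape[1] * W.itemsize
    rows_per_chunk = max(1, budget_bytes // bytes_per_row)
    for start in range(0, W.shape[0], rows_per_chunk):
        yield start, W[start:start + rows_per_chunk]

def forward_in_chunks(W, x, budget_bytes):
    """Compute W @ x by streaming sub-weight matrices one at a time,
    so that only one chunk of W needs to be resident in memory."""
    out = np.empty(W.shape[0], dtype=W.dtype)
    for start, W_sub in split_weight_matrix(W, budget_bytes):
        out[start:start + W_sub.shape[0]] = W_sub @ x
    return out
```

In the paper's setting, each sub-weight matrix would additionally be copied to the device asynchronously (e.g. on its own CUDA stream) while the previous chunk's kernel is still executing, which is what eliminates the idle time during transfers.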
