FPGA Acceleration of Recurrent Neural Network Based Language Model

Recurrent neural network (RNN) based language models (RNNLMs) are biologically inspired models for natural language processing. An RNNLM records historical information through additional recurrent connections and is therefore very effective at capturing the semantics of sentences. However, the use of RNNLMs has been greatly hindered by the high computational cost of training. This work presents an FPGA implementation framework for accelerating RNNLM training. At the architectural level, we improve the parallelism of the RNN training scheme and reduce the computing resource requirement to enhance computation efficiency. The hardware implementation primarily targets reducing the data communication load. A multi-thread computation engine is employed that successfully hides the long memory latency and reuses frequently accessed data. An evaluation based on the Microsoft Research Sentence Completion Challenge shows that the proposed FPGA implementation outperforms traditional class-based modest-size recurrent networks, obtaining 46.2% training accuracy. Moreover, experiments at different network sizes demonstrate the strong scalability of the proposed framework.
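To make the computation being accelerated concrete, the following is a minimal NumPy sketch of one forward step of the standard Elman-style RNNLM (the formulation popularized by Mikolov et al.), in which the hidden state carries the history through the recurrent weight matrix. All names, dimensions, and initializations here are illustrative assumptions, not details taken from the paper or its hardware design.

```python
import numpy as np

# Illustrative sizes only; real RNNLMs use task-dependent dimensions.
V_SIZE, H_SIZE = 10000, 100          # vocabulary size, hidden-layer size

rng = np.random.default_rng(0)
U = rng.standard_normal((H_SIZE, V_SIZE)) * 0.01   # input  -> hidden weights
W = rng.standard_normal((H_SIZE, H_SIZE)) * 0.01   # hidden -> hidden (recurrent) weights
V = rng.standard_normal((V_SIZE, H_SIZE)) * 0.01   # hidden -> output weights

def softmax(z):
    e = np.exp(z - z.max())          # subtract max for numerical stability
    return e / e.sum()

def rnnlm_step(word_idx, s_prev):
    """One time step: s(t) = sigmoid(U w(t) + W s(t-1)), y(t) = softmax(V s(t)).

    Because the input w(t) is one-hot, U @ w(t) reduces to selecting one
    column of U -- the kind of regular, reusable access pattern a hardware
    accelerator can exploit.
    """
    s = 1.0 / (1.0 + np.exp(-(U[:, word_idx] + W @ s_prev)))  # new hidden state
    y = softmax(V @ s)               # probability distribution over next word
    return s, y

s = np.zeros(H_SIZE)                 # initial hidden state
s, y = rnnlm_step(42, s)             # feed the word with (hypothetical) index 42
```

The dense matrix-vector products `W @ s_prev` and `V @ s` at every time step, repeated across long training sequences, are what make RNNLM training costly and what an FPGA datapath with on-chip data reuse aims to speed up.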
