Towards Quantum Language Models

This paper presents a new approach to building language models based on Quantum Probability Theory: the Quantum Language Model (QLM). It shows that, by relying on this probability calculus, it is possible to build stochastic models that benefit from quantum correlations due to interference and entanglement. We tested the approach extensively, both in terms of model perplexity and by inserting it into an automatic speech recognition evaluation setting, and it outperformed state-of-the-art language modelling techniques.
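To make the notion of interference concrete, the following is a minimal sketch of how a quantum probability calculus differs from a classical one. It is not the QLM described in this paper: the two-dimensional toy space, the names `born_probability` and `superposition`, and the choice of measurement outcome are all assumptions made purely for illustration. It compares the probability of an outcome under a classical 50/50 mixture of two states with the probability under a coherent superposition of the same states, where a relative phase produces an interference term.

```python
import math

def born_probability(state, outcome):
    """Born rule: probability of `outcome` given `state`, |<outcome|state>|^2."""
    amplitude = sum(o.conjugate() * s for o, s in zip(outcome, state))
    return abs(amplitude) ** 2

# Two orthonormal basis states of a toy 2-dimensional space (illustrative only).
ket0 = [1 + 0j, 0j]
ket1 = [0j, 1 + 0j]

# Measurement outcome: the equal superposition (|0> + |1>) / sqrt(2).
s = 1 / math.sqrt(2)
plus = [s + 0j, s + 0j]

def superposition(phase):
    """Equal-weight superposition (|0> + e^{i*phase} |1>) / sqrt(2)."""
    return [s + 0j, s * complex(math.cos(phase), math.sin(phase))]

# Classical mixture: average the probabilities, so no interference term appears.
classical = 0.5 * born_probability(ket0, plus) + 0.5 * born_probability(ket1, plus)

# Coherent superposition: amplitudes are added before squaring, so the
# relative phase can raise or lower the probability.
constructive = born_probability(superposition(0.0), plus)
destructive = born_probability(superposition(math.pi), plus)

print(f"classical mixture:         {classical:.3f}")
print(f"superposition, phase 0:    {constructive:.3f}")
print(f"superposition, phase pi:   {destructive:.3f}")
```

The classical mixture always yields the average of the two component probabilities, whereas the superposition ranges between the destructive and constructive extremes depending on the phase. A stochastic model built on this calculus has those phases available as additional degrees of freedom.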
