Quaternion Neural Networks for Spoken Language Understanding

Machine Learning (ML) techniques have enabled substantial performance improvements on several challenging Spoken Language Understanding (SLU) tasks. Among these methods, Neural Networks (NN), or Multilayer Perceptrons (MLP), have recently attracted considerable interest from researchers due to their ability to represent complex internal structures in a low-dimensional subspace. However, MLPs employ document representations based on basic word-level or topic-based features. These basic representations therefore reveal little of a document's statistical structure, since they treat the document as a "bag of words" or topics and ignore the relations between them. We propose to remedy this weakness by extending the Quaternion-algebra-based features presented in [1] to neural networks, yielding the Quaternion Multilayer Perceptron (QMLP). This original QMLP approach relies on hyper-complex algebra to take feature dependencies within documents into account. New document features, derived from the document structure itself and used as input to the QMLP, are also investigated in this paper and compared to those initially proposed in [1]. Experiments conducted on an SLU task from a real corpus of human spoken dialogues show that our QMLP approach, combined with the proposed document features, outperforms the other approaches, with an accuracy gain of 2% over the real-valued MLP and of more than 3% over the first Quaternion-based features proposed in [1]. We finally show that the QMLP architecture needs fewer training iterations to reach promising accuracies.
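The paper itself provides no code; the following is a minimal, illustrative sketch (Python/NumPy) of the core operation behind a quaternion-valued layer as described above: inputs, weights and biases are quaternions, each connection is computed with the Hamilton product, and a split (component-wise) sigmoid is assumed as the activation, a common choice in the quaternion neural network literature. The layer sizes, the split activation and all function names are assumptions for illustration, not the authors' exact architecture.

import numpy as np

def hamilton_product(q, p):
    """Hamilton product of two quaternions stored as (r, x, y, z) arrays."""
    r1, x1, y1, z1 = q
    r2, x2, y2, z2 = p
    return np.array([
        r1 * r2 - x1 * x2 - y1 * y2 - z1 * z2,   # real part
        r1 * x2 + x1 * r2 + y1 * z2 - z1 * y2,   # i component
        r1 * y2 - x1 * z2 + y1 * r2 + z1 * x2,   # j component
        r1 * z2 + x1 * y2 - y1 * x2 + z1 * r2,   # k component
    ])

def qmlp_layer(x, W, b):
    """Forward pass of one quaternion-valued layer (illustrative sketch).

    x : (n_in, 4)        input quaternions (one per group of four document features)
    W : (n_out, n_in, 4) quaternion weights
    b : (n_out, 4)       quaternion biases
    Returns the (n_out, 4) outputs after a split (component-wise) sigmoid.
    """
    out = np.empty_like(b)
    for j in range(W.shape[0]):
        acc = b[j].copy()
        for i in range(W.shape[1]):
            acc += hamilton_product(W[j, i], x[i])
        out[j] = acc
    return 1.0 / (1.0 + np.exp(-out))   # split sigmoid activation

# Tiny usage example: 3 input quaternions -> 2 output quaternions
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
W = rng.normal(size=(2, 3, 4)) * 0.1
b = np.zeros((2, 4))
print(qmlp_layer(x, W, b).shape)   # (2, 4)

Because the Hamilton product mixes the four components of each input quaternion through a single weight quaternion, the four features grouped into one quaternion are processed jointly rather than independently, which illustrates the dependency-modelling property of the hyper-complex algebra referred to in the abstract.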

[1] Gregor Heinrich. Parameter estimation for text analysis, 2009.

[2] J. B. Kuipers. Quaternions and Rotation Sequences: A Primer with Applications to Orbits, Aerospace and Virtual Reality. Princeton University Press, 1999.

[3] Georges Linarès et al. The LIA Speech Recognition System: From 10xRT to 1xRT. TSD, 2007.

[4] Frédéric Béchet et al. DECODA: a call-centre human-human spoken conversation corpus. LREC, 2012.

[5] Mohamed Morchid et al. Theme identification in telephone service conversations using quaternions of speech features. INTERSPEECH, 2013.

[6] Yoshimi Suzuki et al. Keyword Extraction using Term-Domain Interdependence for Dictation of Radio News. COLING-ACL, 1998.

[7] J. R. Bellegarda. Exploiting latent semantic information in statistical language modeling. Proceedings of the IEEE, 2000.

[8] Tom Minka et al. Expectation-Propagation for the Generative Aspect Model. UAI, 2002.

[9] Michael I. Jordan et al. Latent Dirichlet Allocation. J. Mach. Learn. Res., 2003.

[10] Mark Steyvers et al. Finding scientific topics. Proceedings of the National Academy of Sciences of the United States of America, 2004.

[11] Andrew W. Senior et al. Fast and accurate recurrent neural network acoustic models for speech recognition. INTERSPEECH, 2015.

[12] Giovanni Muscato et al. Multilayer Perceptrons to Approximate Quaternion Valued Functions. Neural Networks, 1997.

[13] Geoffrey E. Hinton et al. ImageNet classification with deep convolutional neural networks. Commun. ACM, 2012.

[14] Nikos A. Aspragathos et al. A comparative study of three methods for robot kinematics. IEEE Trans. Syst. Man Cybern. Part B, 1998.

[15] Nobuyuki Matsui et al. Quaternionic Neural Networks: Fundamental Properties and Applications, 2009.

[16] Norbert Jankowski. Survey of Neural Transfer Functions, 1999.

[17] Fuzhen Zhang. Quaternions and matrices of quaternions, 1997.

[18] Geoffrey E. Hinton. A Practical Guide to Training Restricted Boltzmann Machines. Neural Networks: Tricks of the Trade, 2012.