Word and Class Common Space Embedding for Code-switch Language Modelling

Code-switch language modelling is challenging due to limited linguistic resources and less predictable word sequences. Many state-of-the-art systems rely on linguistic information such as Part-of-Speech (POS) tags or word classes to generalize the lexicon. Such systems typically use multi-task learning or a conditional network to improve word prediction over a baseline RNN language model. To overcome data sparsity through continuous-space modelling and a back-off mechanism, we propose to constrain the word and class embeddings in a common space by means of cross-lingual word embeddings, and to use the predicted class embedding as a back-off when the word prediction model is weak. The proposed word and class Common Space embedding Language Model (CSLM) models word prediction better and is more robust when only sparse training data are available. The CSLM outperforms the state-of-the-art language model by 9.7% on the code-switch SEAME corpus.
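
To make the idea concrete, below is a minimal sketch in PyTorch of the general mechanism the abstract describes, not the authors' exact architecture: word and class embeddings share one vector space, the network predicts both the next word and the next class, and the predicted class embedding supplies a back-off score that is interpolated with the direct word prediction. The layer sizes, the sigmoid gate, and all names in the sketch are illustrative assumptions; in practice the word embeddings could be initialized from cross-lingual embeddings, and training would typically add a class cross-entropy term to the word loss (multi-task).

```python
# Minimal sketch (assumptions noted above), not the paper's exact CSLM.
import torch
import torch.nn as nn


class CommonSpaceLM(nn.Module):
    def __init__(self, vocab_size, num_classes, emb_dim=300, hidden_dim=512):
        super().__init__()
        # Word and class embeddings share the same dimensionality so both
        # can be scored and combined in one common space.
        self.word_emb = nn.Embedding(vocab_size, emb_dim)
        self.class_emb = nn.Embedding(num_classes, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        # Separate heads project the hidden state into the common space
        # for next-word and next-class prediction.
        self.word_head = nn.Linear(hidden_dim, emb_dim)
        self.class_head = nn.Linear(hidden_dim, emb_dim)
        # Scalar gate controlling how much the class back-off contributes.
        self.gate = nn.Linear(hidden_dim, 1)

    def forward(self, word_ids):
        h, _ = self.lstm(self.word_emb(word_ids))               # (B, T, H)
        word_query = self.word_head(h)                          # (B, T, E)
        class_query = self.class_head(h)                        # (B, T, E)
        # Score words and classes by dot product in the common space.
        word_logits = word_query @ self.word_emb.weight.T       # (B, T, V)
        class_logits = class_query @ self.class_emb.weight.T    # (B, T, C)
        # Back-off: form the expected class embedding under the predicted
        # class distribution, then score every word against it.
        class_probs = torch.softmax(class_logits, dim=-1)
        pred_class_vec = class_probs @ self.class_emb.weight    # (B, T, E)
        backoff_logits = pred_class_vec @ self.word_emb.weight.T
        # Interpolate the direct word prediction with the class back-off.
        g = torch.sigmoid(self.gate(h))                          # (B, T, 1)
        return g * word_logits + (1.0 - g) * backoff_logits, class_logits
```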
