
Multimodal neural pronunciation modeling for spoken languages with logographic origin

Graphemes in most languages encode pronunciation, though some do so more explicitly than others. Languages like Spanish have a straightforward mapping between graphemes and phonemes, while the mapping is more convoluted for languages like English. Spoken languages such as Cantonese present even more challenges for pronunciation modeling: (1) they do not have a standard written form, and (2) their closest graphemic origins are logographic Han characters, only a subset of which implicitly encode pronunciation. In this work, we propose a multimodal approach to predicting the pronunciation of Cantonese logographic characters, using neural networks with a geometric representation of logographs and the pronunciation of cognates in historically related languages. The proposed framework improves performance by 18.1% and 25.0% relative to unimodal and multimodal baselines, respectively.

Nancy F. Chen | Hoang Gia Ngo | Minh Nguyen | H. Ngo
