Exploiting Word Positional Information in Ngram Model for Chinese Text Input Method

This paper aims to improve the performance of the Pinyin-to-Character Conversion system, which is the core of the Chinese text input method. The ngram model is the current solution to Pinyin-to-Character Conversion. This paper enhances the traditional ngram model by relaxing its stationarity hypothesis and exploiting word positional information, and proposes the Non-stationary ngram (NS ngram) model. Several related issues are discussed in detail, including the formal definition, the model implementation, the training algorithm, and the space complexity of the NS ngram model. Evaluated on the Pinyin-to-Character Conversion task, the NS ngram model significantly outperforms the traditional ngram model, with large reductions in error rate. In addition, the training algorithm presented in this paper estimates the parameters of the NS ngram model effectively and efficiently.
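To make the idea of relaxing the stationarity hypothesis concrete, the following is a minimal illustrative sketch, not the paper's formal definition or training algorithm. It assumes that "word positional information" means the index t of a word within its sentence, and it contrasts a stationary bigram estimate P(w_t | w_{t-1}) with a position-conditioned estimate P(w_t | w_{t-1}, t). The class name `BigramModel`, the `max_pos` clipping parameter, and the unsmoothed maximum-likelihood estimate are all assumptions introduced for illustration.

```python
from collections import defaultdict


class BigramModel:
    """Toy bigram model; optionally conditions on the word's position in the sentence.

    Hypothetical sketch: positions beyond `max_pos` share one bucket, which is an
    assumption made here to bound the table size, not something taken from the paper.
    """

    def __init__(self, positional=False, max_pos=3):
        self.positional = positional
        self.max_pos = max_pos
        self.pair = defaultdict(int)  # counts of (bucket, previous word, word)
        self.ctx = defaultdict(int)   # counts of (bucket, previous word)

    def _bucket(self, t):
        # A stationary model ignores position, so every event falls into bucket 0.
        return min(t, self.max_pos) if self.positional else 0

    def train(self, sentences):
        for sent in sentences:
            words = ["<s>"] + list(sent)
            for t in range(1, len(words)):
                b = self._bucket(t)
                self.pair[(b, words[t - 1], words[t])] += 1
                self.ctx[(b, words[t - 1])] += 1
        return self

    def prob(self, prev, word, t):
        # Unsmoothed relative frequency; returns 0.0 for unseen contexts.
        b = self._bucket(t)
        denom = self.ctx.get((b, prev), 0)
        return self.pair.get((b, prev, word), 0) / denom if denom else 0.0


corpus = [["a", "b", "a", "c"], ["a", "b", "a", "c"]]
stationary = BigramModel().train(corpus)
non_stationary = BigramModel(positional=True, max_pos=4).train(corpus)

print(stationary.prob("a", "b", t=2))      # 0.5
print(non_stationary.prob("a", "b", t=2))  # 1.0
print(non_stationary.prob("a", "c", t=4))  # 1.0
```

In this toy corpus, "a" is followed by "b" early in the sentence and by "c" late in the sentence. The stationary model averages the two cases, while the position-conditioned model separates them. The cost in this sketch is a larger parameter table: up to `max_pos` times as many entries, which is one reason the space complexity of such a model needs separate attention.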
