Natural Language Grammatical Inference

This project is concerned with programming a computer to predict which words are most likely to follow a short segment of English text. At first this may seem a strange problem, but I intend to show that there exists a wide range of applications that would benefit from such a program. Indeed, my motivation for approaching this problem was to provide a way of improving the accuracy of speech recognition systems. Additionally, I am interested in the problem of Grammatical Inference, which entails inferring a grammar for an arbitrary language from a finite set of sample sentences in that language. In fact, the word prediction problem and the Grammatical Inference problem are intertwined, and it seems that approaching either one will lead to the other. It is quite easy to measure the performance of a word prediction system, provided that its prediction is given as a probability distribution. This allows us to compare our predictor with others, such as the trigram predictor developed at the IBM Thomas J. Watson Research Centre by Jelinek et al. A comparison between the IBM predictor and the algorithm presented in this dissertation indicates that the method I have used is able to make broader generalisations about the language under investigation, and that it produces more accurate predictions about texts generated by artificial grammars.
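To make the evaluation idea concrete, the following is a minimal sketch of a trigram word predictor that returns a probability distribution, scored by the average number of bits it needs per word of a test text. It is not the algorithm presented in the dissertation, nor the IBM implementation. The function names, the toy corpus, and the add-one smoothing are assumptions chosen only for illustration, and the sketch assumes every test word is in the vocabulary.

```python
import math
from collections import Counter, defaultdict


def train_trigram(tokens):
    """Count how often each word follows each two-word context."""
    counts = defaultdict(Counter)
    for a, b, c in zip(tokens, tokens[1:], tokens[2:]):
        counts[(a, b)][c] += 1
    return counts


def predict(counts, vocab, context):
    """Return a probability distribution over vocab for the next word.

    Add-one smoothing (an assumption of this sketch) keeps every
    probability non-zero, so the score below is always finite.
    """
    seen = counts.get(context, Counter())
    total = sum(seen.values()) + len(vocab)
    return {w: (seen[w] + 1) / total for w in vocab}


def cross_entropy(counts, vocab, tokens):
    """Average bits per predicted word; lower means better prediction."""
    bits = 0.0
    n = 0
    for a, b, c in zip(tokens, tokens[1:], tokens[2:]):
        bits -= math.log2(predict(counts, vocab, (a, b))[c])
        n += 1
    return bits / n


# Hypothetical toy corpus, for illustration only.
train = "the cat sat on the mat the cat sat on the rug".split()
vocab = sorted(set(train))
counts = train_trigram(train)

dist = predict(counts, vocab, ("the", "cat"))
print(max(dist, key=dist.get))                       # most likely next word
print(cross_entropy(counts, vocab, "the cat sat".split()))
```

Because both predictors output distributions over the same vocabulary, two systems can be compared by computing this score for each on the same held-out text. A predictor that assigns every word equal probability scores `log2(len(vocab))` bits per word, which gives a baseline.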
