Learning Morphology with Pair Hidden Markov Models

In this paper I present a novel machine learning approach to the acquisition of stochastic string transductions based on Pair Hidden Markov Models (PHMMs), a class of models used in computational biology. I show how these models can be used to learn morphological processes in a variety of languages, including English, German and Arabic. Previous techniques for learning morphology have been restricted to languages with essentially concatenative morphology.
