Special Issue: Probabilistic models of cognition
Probabilistic models of language processing and acquisition

Probabilistic methods are providing new explanatory approaches to fundamental cognitive science questions of how humans structure, process and acquire language. This review examines probabilistic models defined over traditional symbolic structures. Language comprehension and production involve probabilistic inference in such models, and acquisition involves choosing the best model, given innate constraints and linguistic and other input. Probabilistic models can account for the learning and processing of language while maintaining the sophistication of symbolic models. A recent burgeoning of theoretical developments and online corpus creation has enabled large models to be tested, revealing probabilistic constraints in processing, undermining acquisition arguments based on a perceived poverty of the stimulus, and suggesting fruitful links with probabilistic theories of categorization and ambiguity resolution in perception.
