Paper Information - A Statistical Model for Word Discovery in Transcribed Speech

A statistical model for segmentation and word discovery in continuous speech is presented. An incremental unsupervised learning algorithm to infer word boundaries based on this model is described. Results are also presented of empirical tests showing that the algorithm is competitive with other models that have been used for similar tasks.

Anand Venkataraman

[1] P. Jusczyk, et al. Infants' memory for spoken words, 1997, Science.

[2] J. Pind. The Discovery of Spoken Language, Peter W. Jusczyk (Ed.). MIT Press (1997), ISBN 0 262 10058 4, 1997.

[3] Jeffrey L. Elman, et al. Finding Structure in Time, 1990, Cogn. Sci.

[4] P. Jusczyk, et al. Phonotactic and Prosodic Effects on Word Segmentation in Infants, 1999, Cognitive Psychology.

[5] Morten H. Christiansen, et al. Learning to Segment Speech Using Multiple Cues: A Connectionist Model, 1998.

[6] T. Poggio, et al. MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES, 2001.

[7] David Haussler, et al. Quantifying Inductive Bias: AI Learning Algorithms and Valiant's Learning Framework, 1988, Artif. Intell.

[8] Ian H. Witten, et al. The zero-frequency problem: Estimating the probabilities of novel events in adaptive text compression, 1991, IEEE Trans. Inf. Theory.

[9] Gwyneth Tseng, et al. Chinese text segmentation for text retrieval: achievements and problems, 1993.

[10] T. A. Cartwright, et al. Distributional regularity and phonotactic constraints are useful for segmentation, 1996, Cognition.

[11] E. Newport, et al. Word Segmentation: The Role of Distributional Cues, 1996.