How should we evaluate models of segmentation in artificial language learning?

One of the challenges that infants have to solve when learning their native language is to identify the words in a continuous speech stream. Some of the experiments in Artificial Grammar Learning (AGL; e.g., Saffran, Newport, and Aslin 1996; Saffran, Aslin, and Newport 1996; Aslin, Saffran, and Newport 1998, and many more) investigate this ability. In these experiments, subjects are exposed to an artificial speech stream that contains certain regularities. Infants are typically tested in a preferential looking paradigm; adults, in contrast, in a two-alternative forced-choice test (2AFC) in which they have to choose between a word and another sequence (typically a part-word, a sequence that results from misplacing word boundaries). One of the key findings of AGL is that both infants and adults are sensitive to transitional probabilities and other statistical cues, and can use them to segment the input stream. Several computational models have been proposed to explain such findings. We review how these models are evaluated and argue that model evaluation requires a different type of experimental data than is typically collected and reported. We present some preliminary results and a model consistent with the data.
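To make the transitional-probability cue concrete, the sketch below computes forward transitional probabilities, TP(XY) = freq(XY) / freq(X), over a Saffran-style syllable stream and places word boundaries at local TP dips. This is only an illustrative baseline under our own assumptions (the syllable inventory, the function names, and the local-minimum boundary rule are ours), not the model proposed here nor the procedure used in the original experiments.

```python
import random
from collections import Counter

def transitional_probabilities(syllables):
    """Forward transitional probabilities P(Y | X) = freq(XY) / freq(X)
    for every adjacent syllable pair in the stream."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {pair: count / first_counts[pair[0]]
            for pair, count in pair_counts.items()}

def segment_at_tp_minima(syllables):
    """Place a boundary wherever the TP between two syllables is a local
    minimum relative to its neighbours (one simple segmentation heuristic)."""
    tps = transitional_probabilities(syllables)
    seq = [tps[(x, y)] for x, y in zip(syllables, syllables[1:])]
    words, current = [], [syllables[0]]
    for i in range(1, len(syllables)):
        left = seq[i - 1]                                  # TP into syllable i
        prev_tp = seq[i - 2] if i >= 2 else float("inf")
        next_tp = seq[i] if i < len(seq) else float("inf")
        if left < prev_tp and left < next_tp:              # local dip -> boundary
            words.append("".join(current))
            current = []
        current.append(syllables[i])
    words.append("".join(current))
    return words

# A Saffran-style stream: four trisyllabic "words" concatenated in random order
# (hypothetical lexicon chosen for illustration only).
random.seed(0)
lexicon = [["tu", "pi", "ro"], ["go", "la", "bu"],
           ["bi", "da", "ku"], ["pa", "do", "ti"]]
stream = [syl for _ in range(100) for syl in random.choice(lexicon)]
print(segment_at_tp_minima(stream)[:10])
```

On such a stream, within-word TPs are 1.0 and between-word TPs are roughly 0.25 to 0.33, so the local-minimum rule recovers the designed words; the point of the sketch is simply to show what "using transitional probabilities to segment" can mean mechanically.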