An Automatic Child-Directed Speech Detector for the Study of Child Language Development

In this paper, we present an automatic child-directed speech detection system for use in the study of child language development. Child-directed speech (CDS) is speech that caregivers direct towards infants. Corpora used in child language development studies often contain a mixture of CDS and non-CDS. As the size of these corpora grows, manual annotation of CDS becomes impractical. Our automatic CDS detector addresses this issue. The focus of this paper is to propose and evaluate different sets of features for the detection of CDS, using several off-the-shelf classifiers. First, we look at the performance of a set of acoustic features. We then combine these acoustic features with several linguistic features and, finally, with contextual features. Using the full set of features, our CDS detector identified CDS with an accuracy of 0.88 and an F1 score of 0.87 using a Naive Bayes classifier.
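To make the evaluation setup concrete, the following is a minimal sketch of a Gaussian Naive Bayes classifier scored with accuracy and F1. It is not the system described above: the two features (mean pitch and pitch range, in Hz), the toy values, and all function names are assumptions chosen for illustration, and the abstract does not say which Naive Bayes variant or which specific features were used.

```python
import math

def fit_gaussian_nb(X, y):
    """Estimate a prior, per-feature means, and per-feature variances for each class."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A small constant keeps every variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[label] = (n / len(y), means, variances)
    return model

def predict(model, x):
    """Return the class with the highest log posterior under feature independence."""
    def log_posterior(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)

def accuracy_and_f1(y_true, y_pred, positive=1):
    """Compute accuracy and the F1 score for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1

# Hypothetical utterances: [mean pitch (Hz), pitch range (Hz)]; 1 = CDS, 0 = non-CDS.
X_train = [[310, 180], [290, 160], [330, 200], [200, 60], [210, 80], [190, 70]]
y_train = [1, 1, 1, 0, 0, 0]
model = fit_gaussian_nb(X_train, y_train)

X_test = [[300, 170], [205, 65]]
y_test = [1, 0]
y_pred = [predict(model, x) for x in X_test]
print(accuracy_and_f1(y_test, y_pred))
```

F1 is reported alongside accuracy because it balances precision and recall for the CDS class, which matters when CDS and non-CDS utterances are unevenly represented in a corpus.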
