Music and speech in early development: automatic analysis and classification of prosodic features from two Portuguese variants

In the present study we aim to capture rhythmic and melodic patterning in speech and singing directed to infants. We address this issue by identifying the acoustic features that best predict the outcome of several classification problems. We built a database of 977 tokens composed of infant-directed speech from two Portuguese variants (European vs. Brazilian Portuguese) and infant-directed singing from the two cultures. Machine learning experiments were conducted to automatically discriminate between language variants in speech, between language variants in vocal songs, and between interaction contexts. Descriptors related to rhythm exhibited strong predictive ability in the language-variant discrimination tasks for both speech and singing, and showed different rhythmic patterning for each variant. Common features could be used by a classifier to discriminate speech and singing, indicating that the processing of speech and singing may share the analysis of the same stimulus properties. For discriminating interaction contexts, pitch-related descriptors performed better. We conclude that the prosodic cues present in an infant's surrounding sonic environment are rich sources of information: melodic cues help to distinguish between communicative contexts, while rhythmic cues provide specific information about the rhythmic identity of the mother tongue. These prosodic differences may motivate further research on their influence on the development of the infant's musical representations.
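To make the notion of a rhythm-related descriptor concrete, the following is a minimal sketch of one widely used measure, the normalized Pairwise Variability Index (nPVI), computed over a sequence of interval durations. The abstract does not state which rhythm descriptors were used, so the choice of nPVI, the function name `npvi`, and the example durations are assumptions made purely for illustration, not the authors' feature set.

```python
def npvi(durations):
    """Normalized Pairwise Variability Index over successive durations.

    Illustrative assumption: the study's actual descriptors are not
    specified in the abstract. `durations` is a sequence of positive
    interval lengths (e.g. vocalic intervals in seconds).
    """
    if len(durations) < 2:
        raise ValueError("nPVI needs at least two durations")
    total = 0.0
    # Compare each interval with its successor, normalizing the
    # absolute difference by the mean of the pair.
    for a, b in zip(durations, durations[1:]):
        total += abs(a - b) / ((a + b) / 2.0)
    # Average over the number of pairs and scale by 100.
    return 100.0 * total / (len(durations) - 1)


# Hypothetical duration sequences (seconds), not data from the study.
even_timing = [0.10, 0.10, 0.10, 0.10]
alternating = [0.05, 0.15, 0.05, 0.15]
print(npvi(even_timing), npvi(alternating))
```

A perfectly regular sequence yields an nPVI of zero, and larger contrasts between neighboring intervals yield larger values. A value of this kind could then serve as one input feature to a classifier that discriminates language variants.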
