Machine Assisted Analysis of Vowel Length Contrasts in Wolof

Growing digital archives and improving algorithms for the automatic analysis of text and speech create new opportunities for fundamental research in phonetics. Such empirical approaches allow statistical evaluation of a much larger set of hypotheses about phonetic variation and its conditioning factors (among them geographical or dialectal variants). This paper illustrates this vision by putting automatic methods to the test on a phenomenon that is not easily observable: vowel length contrast. We focus on Wolof, an under-resourced language from Sub-Saharan Africa. In particular, we propose multiple features for a fine-grained evaluation of the degree of length contrast under different factors, such as read vs. semi-spontaneous speech and standard vs. dialectal Wolof. Our measurements, made fully automatically on more than 20k vowel tokens, show that the proposed features can highlight different degrees of contrast for each vowel considered. Notably, we show that the contrast is weaker in semi-spontaneous speech and in a non-standard semi-spontaneous dialect.
