Automatic recognition of schwa variants in spontaneous Hungarian speech

This paper examines the process of optional vowel reduction in Hungarian and the acoustic structure of schwa variants in spontaneous speech. The study covers the acoustic patterns of both the basic realizations of Hungarian vowels and their realizations as neutral vowels (schwas), together with the design, implementation, and evaluation of a set of algorithms that recognize both types of realization from the speech waveform. The authors ask whether schwas form a unified group of vowels or whether they depend on the originally intended articulation of the vowel they stand for. The acoustic study uses a database of over 4,000 utterances extracted from continuous speech and recorded from 19 speakers. The authors propose methods for recognizing neutral vowels according to the vowels they replace in spontaneous speech. Mel-Frequency Cepstral Coefficients are computed and used to train Hidden Markov Models. The recognition system was trained on 2,500 utterances and tested on 1,500 utterances. The results show that a neutral vowel is detected in 72% of all occurrences, and that stressed and unstressed syllables are distinguished in 92% of all cases. Neutralized vowels do not form a unified group of phoneme realizations: the pronunciation of schwa depends heavily on the original articulatory configuration of the intended vowel.
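As a rough illustration of how Hidden Markov Models can separate two classes of vowel realization, the following minimal sketch scores a feature sequence under two competing models with the forward algorithm and picks the more likely one. It is not the authors' system: the abstract does not give the model topology, feature dimensionality, or parameters, so the one-dimensional features (standing in for full MFCC vectors), the two-state left-to-right topology, the model names `schwa` and `full`, and all numeric values are assumptions made purely for illustration.

```python
import math

NEG_INF = float("-inf")


def safe_log(p):
    """Natural log that maps a zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of log-values."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian emission."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def forward_loglik(obs, start, trans, means, variances):
    """Log-likelihood of an observation sequence under a Gaussian HMM
    (forward algorithm carried out in the log domain)."""
    n = len(start)
    alpha = [safe_log(start[s]) + log_gauss(obs[0], means[s], variances[s])
             for s in range(n)]
    for x in obs[1:]:
        alpha = [
            logsumexp([alpha[p] + safe_log(trans[p][s]) for p in range(n)])
            + log_gauss(x, means[s], variances[s])
            for s in range(n)
        ]
    return logsumexp(alpha)


# Hypothetical toy models: same topology, different emission means.
MODELS = {
    "schwa": dict(start=[1.0, 0.0], trans=[[0.6, 0.4], [0.0, 1.0]],
                  means=[0.0, 0.2], variances=[0.25, 0.25]),
    "full": dict(start=[1.0, 0.0], trans=[[0.6, 0.4], [0.0, 1.0]],
                 means=[1.0, 1.5], variances=[0.25, 0.25]),
}


def classify(obs, models=MODELS):
    """Return the name of the model that assigns obs the highest likelihood."""
    return max(models, key=lambda name: forward_loglik(obs, **models[name]))
```

Calling `classify` on a list of per-frame feature values returns the label of the better-fitting model. A real recognizer would use multi-dimensional MFCC frames and parameters estimated from training data rather than the hand-set values above.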
