The Vowel Worm: Real-time Mapping and Visualisation of Sung Vowels in Music

This paper presents an approach to predicting vowel quality in vocal music performances, based on common acoustic features (mainly MFCCs). Rather than performing classification, we use linear regression to project spoken or sung vowels into a continuous articulatory space: the IPA Vowel Chart. We introduce a real-time on-line visualisation tool, the Vowel Worm, which builds upon the resulting models and displays the evolution of sung vowels over time in an intuitive manner. The concepts presented in this work can be used for artistic purposes and music teaching.
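The regression step described above can be illustrated with a minimal sketch. It assumes that each audio frame is summarised by 13 MFCC-like coefficients and that each training frame carries a two-dimensional target position in the vowel chart (backness, height). The feature values, the dimensions, and the helper names `fit_linear_map` and `project` are hypothetical and chosen for illustration only; the data is synthetic, and the paper's actual features and training procedure may differ.

```python
import numpy as np


def fit_linear_map(features, targets):
    """Least-squares fit of targets ~ [features, 1] @ weights (bias term included)."""
    design = np.hstack([features, np.ones((features.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return weights


def project(features, weights):
    """Map feature frames to 2-D chart coordinates using fitted weights."""
    design = np.hstack([features, np.ones((features.shape[0], 1))])
    return design @ weights


# Synthetic stand-in data (assumption): 200 frames of 13 MFCC-like coefficients.
rng = np.random.default_rng(0)
n_frames, n_coeffs = 200, 13
features = rng.normal(size=(n_frames, n_coeffs))

# Hypothetical "true" mapping used only to generate noisy 2-D targets.
true_weights = rng.normal(size=(n_coeffs + 1, 2))
targets = project(features, true_weights) + 0.01 * rng.normal(size=(n_frames, 2))

# Fit the regression, then project a few frames into the 2-D space.
weights = fit_linear_map(features, targets)
trajectory = project(features[:5], weights)
print(trajectory.shape)
```

In a real-time setting, each incoming frame would be projected in the same way, and the sequence of resulting points would form the trajectory that a visualisation such as the Vowel Worm draws over time.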
