Illustrating the Production of the International Phonetic Alphabet Sounds Using Fast Real-Time Magnetic Resonance Imaging

Recent advances in real-time magnetic resonance imaging (rtMRI) of the upper airway for acquiring speech production data provide unparalleled views of the dynamics of a speaker's vocal tract at high frame rates (83 frames per second and higher). This paper introduces an effort to collect and make available online rtMRI data corresponding to a large subset of the sounds of the world's languages, as encoded in the International Phonetic Alphabet, along with supplementary English words and phonetically balanced texts, produced by four prominent phoneticians and recorded using the latest rtMRI technology. The technique captures the movements of both oral and laryngeal articulators during the production of each sound category. This resource is envisioned as a teaching tool in pronunciation training, second language acquisition, and speech therapy.
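To make the quoted frame rate concrete, the short Python sketch below steps through one rtMRI recording and reports its temporal resolution (roughly 12 ms of vocal-tract motion per frame at 83 fps). It assumes the frames of a single recording have been exported to a NumPy array file named frames.npy with shape (n_frames, height, width); that file name and layout are illustrative, not part of the published resource.

```python
# Minimal sketch: inspecting an rtMRI frame sequence acquired at 83 fps.
# Assumes frames.npy holds one recording as a (n_frames, height, width)
# grayscale array; this is a hypothetical export, not the corpus format.
import numpy as np
import matplotlib.pyplot as plt

FRAME_RATE_HZ = 83.0                         # acquisition rate quoted above
FRAME_INTERVAL_MS = 1000.0 / FRAME_RATE_HZ   # ~12 ms between consecutive frames

frames = np.load("frames.npy")               # (n_frames, height, width)
n_frames = frames.shape[0]
duration_s = n_frames * FRAME_INTERVAL_MS / 1000.0

print(f"{n_frames} frames, {FRAME_INTERVAL_MS:.1f} ms apart "
      f"({duration_s:.2f} s of speech)")

# Quick visual check: show every 10th midsagittal frame with its timestamp.
for i in range(0, n_frames, 10):
    plt.imshow(frames[i], cmap="gray")
    plt.title(f"frame {i}  (t = {i * FRAME_INTERVAL_MS / 1000.0:.3f} s)")
    plt.pause(0.05)
plt.show()
```

At 83 fps, even a brief consonantal closure of 50 to 60 ms spans four or more frames, which is what makes frame-by-frame inspection of articulator movements feasible with this kind of data.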
