Paralinguistic mechanisms of production in human "beatboxing": a real-time magnetic resonance imaging study.

Real-time magnetic resonance imaging (rtMRI) was used to examine mechanisms of sound production by an American male beatbox artist. rtMRI proved to be a useful modality for studying this form of sound production, providing a global dynamic view of the midsagittal vocal tract at frame rates sufficient to observe the movement and coordination of critical articulators. The subject's repertoire included percussion elements generated using a wide range of articulatory and airstream mechanisms. Many of the same mechanisms observed in human speech production were exploited for musical effect, including patterns of articulation, such as ejectives and clicks, that do not occur in the phonologies of the artist's native languages. The data offer insights into the paralinguistic use of phonetic primitives and the ways in which they are coordinated in this style of musical performance. A unified formalism for describing both the musical and phonetic dimensions of human vocal percussion performance is proposed. Audio and video data illustrating the production and orchestration of beatboxing sound effects are provided in a companion annotated corpus.
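The abstract's closing sentences describe a unified formalism that pairs musical and phonetic descriptions of each percussion effect. As a purely illustrative aid, the sketch below shows one way such a joint representation could be structured; the field names, example effects, transcriptions, and tempo are assumptions made here for illustration and do not reproduce the formalism actually proposed in the paper.

```python
# Hypothetical sketch (not the paper's formalism): one way to pair a phonetic
# description of each beatboxing sound effect with its musical timing, so that
# a single record captures both dimensions of a vocal-percussion pattern.
from dataclasses import dataclass
from typing import List


@dataclass
class PercussionEvent:
    label: str      # informal effect name, e.g. "kick drum", "closed hi-hat"
    ipa: str        # approximate phonetic transcription (illustrative only)
    airstream: str  # e.g. "pulmonic egressive", "glottalic egressive", "velaric ingressive"
    beat: float     # onset position in beats from the start of the measure


def onset_seconds(event: PercussionEvent, tempo_bpm: float) -> float:
    """Convert a musical beat position to an onset time in seconds."""
    return event.beat * 60.0 / tempo_bpm


# Illustrative 4/4 measure: an ejective-like kick and a click-like rimshot,
# interleaved with pulmonic hi-hat effects. Labels are assumed, not measured.
pattern: List[PercussionEvent] = [
    PercussionEvent("kick drum", "p'", "glottalic egressive", beat=0.0),
    PercussionEvent("closed hi-hat", "t", "pulmonic egressive", beat=1.0),
    PercussionEvent("rimshot", "!", "velaric ingressive", beat=2.0),
    PercussionEvent("closed hi-hat", "t", "pulmonic egressive", beat=3.0),
]

for ev in pattern:
    print(f"beat {ev.beat:3.1f}  {onset_seconds(ev, tempo_bpm=96):5.2f} s  "
          f"[{ev.ipa}] {ev.label} ({ev.airstream})")
```

Printing the pattern at an assumed tempo of 96 beats per minute yields one line per event, showing how a score-like timeline and an articulatory description can coexist in a single annotation record.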