Multimodal Coordination of Sound and Movement in Music and Speech

ABSTRACT Speech and music emerge from a spectrum of nested motor and perceptual coordination patterns across timescales ranging from brief movements to longer actions. Intuitively, this nested clustering in movement should be reflected in sound. We examined similarities and differences in the multimodal, multiscale coordination of speech and music using two complementary measures. First, we computed spectra of the amplitude envelopes of sound and movement and correlated spectral power across modalities as a function of frequency. Second, we correlated the smoothed envelopes themselves and examined the peaks of their cross-correlation functions. YouTube videos of five different modes of speaking and five different types of music were analyzed. Speech performances yielded stronger, more reliable relationships between sound and movement than music performances did. Interestingly, a cappella singing patterned more with music, and improvisational jazz piano patterned more with speech. Results suggest that nested temporal structures in sound and movement are coordinated as a function of the communicative aspects of performance.
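The following is a minimal sketch (not the authors' pipeline) of the two measures described above. It assumes a mono audio signal and a movement time series (e.g., frame-differenced motion energy) that have already been extracted and resampled to a common rate; the sampling rate, filter cutoff, window length, and lag range are illustrative assumptions rather than the parameters used in the study.

```python
# Sketch of (1) cross-modal correlation of envelope spectral power as a function
# of frequency, and (2) peak picking on the cross-correlation of smoothed envelopes.
import numpy as np
from scipy.signal import hilbert, welch, butter, filtfilt

FS = 100.0  # assumed common sampling rate (Hz) after resampling both signals

def amplitude_envelope(x, fs=FS, cutoff=10.0):
    """Hilbert amplitude envelope, low-pass smoothed at `cutoff` Hz."""
    env = np.abs(hilbert(x))
    b, a = butter(2, cutoff / (fs / 2.0), btype="low")
    return filtfilt(b, a, env)

def envelope_spectrum(env, fs=FS):
    """Welch power spectrum of an amplitude envelope."""
    freqs, power = welch(env, fs=fs, nperseg=1024)
    return freqs, power

def cross_modal_spectral_correlation(sound_env, motion_env, fs=FS):
    """Correlate log spectral power of sound and motion envelopes across frequencies."""
    freqs, p_sound = envelope_spectrum(sound_env, fs)
    _, p_motion = envelope_spectrum(motion_env, fs)
    r = np.corrcoef(np.log(p_sound), np.log(p_motion))[0, 1]
    return r, freqs

def peak_cross_correlation(sound_env, motion_env, fs=FS, max_lag_s=2.0):
    """Peak of the cross-correlation of z-scored envelopes within +/- max_lag_s seconds
    (biased normalization by series length)."""
    s = (sound_env - sound_env.mean()) / sound_env.std()
    m = (motion_env - motion_env.mean()) / motion_env.std()
    cc = np.correlate(s, m, mode="full") / len(s)
    lags = np.arange(-len(s) + 1, len(s)) / fs
    keep = np.abs(lags) <= max_lag_s
    peak_idx = np.argmax(cc[keep])
    return cc[keep][peak_idx], lags[keep][peak_idx]

# Example with synthetic, loosely coupled signals standing in for real recordings.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(0, 60, 1 / FS)
    shared = np.cumsum(rng.standard_normal(t.size)) * 0.01  # slow shared fluctuation
    sound = np.sin(2 * np.pi * 3 * t) * (1 + shared) + 0.1 * rng.standard_normal(t.size)
    motion = np.roll(1 + shared, 20) + 0.1 * rng.standard_normal(t.size)
    s_env, m_env = amplitude_envelope(sound), amplitude_envelope(motion)
    r, _ = cross_modal_spectral_correlation(s_env, m_env)
    peak, lag = peak_cross_correlation(s_env, m_env)
    print(f"spectral-power correlation r = {r:.2f}; xcorr peak = {peak:.2f} at lag {lag:.2f} s")
```

In this sketch the Hilbert envelope and Welch spectrum stand in for whatever envelope-extraction and spectral-estimation choices were actually made; the point it illustrates is that one measure correlates the two modalities across frequencies (spectral power) and the other across time lags (smoothed envelopes).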
