Complex Communication Dynamics: Exploring the Structure of an Academic Talk

Communication is a multimodal phenomenon, yet the cognitive mechanisms supporting it remain understudied. We explored a natural dataset of academic lectures to determine how communication modalities are used and coordinated during the presentation of complex information. Using automated and semi-automated techniques, we analyzed videos of 30 speakers and extracted measures capturing the dynamics of their body movement, their slide change rate, and several aspects of their speech (speech rate, articulation rate, fundamental frequency, and intensity). There were consistent but statistically subtle patterns in the use of speech rate, articulation rate, intensity, and body motion across the presentation. Principal component analysis also revealed patterns of system-like covariation among modalities. Although tentative, these findings suggest that the cognitive system integrates body, slides, and speech in a coordinated manner during natural language use. Further research is needed to clarify the specific coordination patterns that occur between the different modalities.
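As a rough illustration of how principal component analysis can expose covariation among modalities, the sketch below standardizes a table of per-window measures and extracts its components. It is not the analysis pipeline used in the study: the `pca` helper, the column names, the window count, and the synthetic data are all assumptions made for the example, and the numbers it produces carry no empirical meaning.

```python
import numpy as np

def pca(measures):
    """Return (explained_variance_ratio, loadings) for a samples x variables array."""
    X = np.asarray(measures, dtype=float)
    # z-score each column so measures in different units are comparable
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # SVD of the standardized data: rows of Vt are the principal axes
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    var = s**2 / (Z.shape[0] - 1)
    return var / var.sum(), Vt

# Synthetic placeholder data (NOT from the study): three measures share a
# common latent signal, while the fourth varies independently.
rng = np.random.default_rng(0)
n_windows = 200  # hypothetical number of analysis windows
shared = rng.normal(size=n_windows)
columns = ["speech_rate", "intensity", "body_motion", "slide_change_rate"]
data = np.column_stack([
    shared + 0.5 * rng.normal(size=n_windows),
    shared + 0.5 * rng.normal(size=n_windows),
    shared + 0.5 * rng.normal(size=n_windows),
    rng.normal(size=n_windows),
])

ratio, loadings = pca(data)
for name, weight in zip(columns, loadings[0]):
    print(f"{name}: loading on first component = {weight:+.2f}")
print("explained variance ratio:", np.round(ratio, 2))
```

In this toy setup, a first component that loads on several measures at once is the kind of pattern that would be read as system-like covariation. With real data, the choice of windowing and standardization would need to follow the study's own procedures.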
