Assessing kinetic meaning of music and dance via deep cross-modal retrieval

Music semantics is embodied: meaning is biologically mediated by, and grounded in, the human body and brain. This embodied cognition perspective also explains why musical structures modulate kinetic and somatosensory perception. We explore this aspect of cognition by treating dance as an overt expression of the semantic aspects of music related to motor intention, using an artificial deep recurrent neural network that learns correlations between music audio and dance video. We claim that, just as human semantic cognition is based on multimodal statistical structures, joint statistical modeling of music and dance artifacts can be expected to capture the semantics of these modalities. We evaluate how well the model captures the underlying semantics, including dance styles learned in an unsupervised fashion, in a cross-modal retrieval task. Quantitative results, validated with statistical significance testing, strengthen the body of evidence for embodied cognition in music and demonstrate that the model can recommend music audio for dance video queries and vice versa.
