Utilising natural cross-modal mappings for visual control of feature-based sound synthesis

This paper presents the results of an investigation into audio-visual (AV) correspondences conducted as part of the development of Morpheme, a painting interface for controlling a corpus-based concatenative sound synthesis algorithm. Previous research has identified strong AV correspondences between dimensions such as pitch and vertical position, or loudness and size. However, these correspondences are usually established empirically by varying only a single audio or visual parameter. Although it is recognised that the perception of AV correspondences is affected by interactions between the parameters of auditory or visual stimuli when these are complex multidimensional objects, there has been little research into perceived AV correspondences when complex dynamic sounds are involved. We conducted an experiment in which two AV mapping strategies and three audio corpora were empirically evaluated. A total of 110 participants were asked to rate the perceived similarity of six AV associations. The results confirmed that size and loudness, vertical position and pitch, and colour brightness and spectral brightness are strongly associated. Weaker but significant associations were found between texture granularity and sound dissonance, and between colour complexity and sound dissonance. Harmonicity was found to moderate the perceived strength of these associations: the higher the harmonicity of the sounds, the stronger the perceived AV associations.
