From Bach to the Beatles: The Simulation of Human Tonal Expectation Using Ecologically-Trained Predictive Models

Tonal structure is conveyed in part by statistical regularities between musical events, and research has shown that computational models can reflect tonal structure in music by capturing these regularities in schematic constructs such as pitch histograms. Of the few studies that model perceptual learning from musical data, most have employed self-organizing models that learn a topology of static descriptions of musical contexts. Moreover, the stimuli used to train these models are often symbolic rather than acoustically faithful representations of musical material. In this work we investigate whether sequential predictive models of musical memory (specifically, recurrent neural networks), trained on audio from commercial CD recordings, induce tonal knowledge in a manner similar to that of human listeners, as documented in behavioral studies of music perception. Our experiments indicate that various types of recurrent neural networks produce musical expectations that clearly convey tonal structure. Furthermore, the results imply that although implicit knowledge of tonal structure is a necessary condition for accurate musical expectation, the most accurate predictive models also use cues beyond the tonal structure of the musical context.
