Brains on Beats

We developed task-optimized deep neural networks (DNNs) that achieved state-of-the-art performance in different evaluation scenarios for automatic music tagging. These DNNs were subsequently used to probe the neural representations of music. Representational similarity analysis revealed the existence of a representational gradient across the superior temporal gyrus (STG). Anterior STG was shown to be more sensitive to low-level stimulus features encoded in shallow DNN layers whereas posterior STG was shown to be more sensitive to high-level stimulus features encoded in deep DNN layers.
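
The representational similarity analysis mentioned above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names (`rdm`, `upper_triangle`, `spearman`, `rsa_layer_scores`), the correlation-distance dissimilarity measure, the Spearman comparison, and the synthetic data are all assumptions chosen for illustration. The idea is to build one representational dissimilarity matrix (RDM) per DNN layer and one per brain region, then ask which layer's RDM best matches the region's.

```python
import numpy as np


def rdm(patterns):
    """Return a (n_stimuli, n_stimuli) dissimilarity matrix, 1 - Pearson r.

    patterns has shape (n_stimuli, n_features): one response pattern per stimulus.
    """
    return 1.0 - np.corrcoef(patterns)


def upper_triangle(matrix):
    """Vectorize the strictly upper triangle of a square matrix."""
    i, j = np.triu_indices(matrix.shape[0], k=1)
    return matrix[i, j]


def spearman(a, b):
    """Spearman correlation via rank transform (no tie correction, an assumption)."""
    ranks_a = np.argsort(np.argsort(a)).astype(float)
    ranks_b = np.argsort(np.argsort(b)).astype(float)
    return float(np.corrcoef(ranks_a, ranks_b)[0, 1])


def rsa_layer_scores(layer_activations, brain_patterns):
    """Compare each layer's RDM with the RDM of one brain region."""
    brain_vec = upper_triangle(rdm(brain_patterns))
    return [spearman(upper_triangle(rdm(act)), brain_vec)
            for act in layer_activations]


# Synthetic stand-ins: 3 hypothetical layers, 20 stimuli, 50 units each.
rng = np.random.default_rng(0)
n_stimuli = 20
layers = [rng.standard_normal((n_stimuli, 50)) for _ in range(3)]

# A hypothetical region whose responses are a noisy copy of the deepest layer.
region = layers[2] + 0.3 * rng.standard_normal((n_stimuli, 50))

scores = rsa_layer_scores(layers, region)
best_layer = int(np.argmax(scores))
print("similarity per layer:", np.round(scores, 3))
print("best-matching layer index:", best_layer)
```

Repeating this comparison for regions along the STG, and recording the best-matching layer for each, is one way a gradient from shallow to deep layers could be mapped.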
