Deep Learning for MIR Tutorial

Deep learning has become the state of the art in visual computing and is increasingly being adopted in the Music Information Retrieval (MIR) and audio retrieval domains. To draw attention to this topic, we propose an introductory tutorial on deep learning for MIR. Besides a general introduction to neural networks, the proposed tutorial covers a wide range of MIR-relevant deep learning approaches. \textbf{Convolutional Neural Networks} are currently the de facto standard for deep-learning-based audio retrieval. \textbf{Recurrent Neural Networks} have proven effective in onset detection tasks such as beat or audio-event detection. \textbf{Siamese Networks} have been shown to be effective in learning audio representations and distance functions specific to music similarity retrieval. We will incorporate both academic and industrial points of view into the tutorial. To accompany the tutorial, we will create a GitHub repository for the content presented, as well as references to state-of-the-art work and literature for further reading. This repository will remain public after the conference.
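
To make the idea of a learned distance function for similarity retrieval more concrete, the following is a minimal sketch of a contrastive loss of the kind commonly used to train Siamese networks. It is an illustration under stated assumptions, not the method presented in the tutorial: the function names, the margin value, and the three-dimensional embeddings are hypothetical, and in practice the embeddings would be produced by two weight-sharing network branches applied to audio features.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, same, margin=1.0):
    """Contrastive loss for one pair of embeddings.

    Similar pairs (same=True) are penalised by their squared distance,
    so training pulls them together. Dissimilar pairs are penalised only
    while they are closer than `margin`, so training pushes them apart.
    The margin of 1.0 is an assumed, illustrative default.
    """
    d = euclidean(emb_a, emb_b)
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Hypothetical embeddings; a real system would obtain these by passing
# two audio excerpts through the same (weight-sharing) network.
track_a = [0.0, 0.0, 0.0]
track_b = [0.3, 0.4, 0.0]
track_c = [3.0, 4.0, 0.0]

loss_similar = contrastive_loss(track_a, track_b, same=True)
loss_dissimilar_near = contrastive_loss(track_a, track_b, same=False)
loss_dissimilar_far = contrastive_loss(track_a, track_c, same=False)
```

In this sketch a dissimilar pair that already lies beyond the margin, such as `track_a` and `track_c`, contributes no loss, which is what lets the embedding space organise itself around relative similarity rather than absolute position.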
