Residual DNN-CRF Model for Audio Chord Recognition

In this paper, we propose a Residual DNN-CRF model for audio chord recognition. Network architectures used so far for chord recognition with deep learning have been shallow, with about three layers. Even when a Convolutional Neural Network (CNN) is used, a shallow stack of hidden layers does not exploit the full power of deep learning, and the same holds for a Deep Neural Network (DNN). We therefore propose a network architecture with 15 hidden layers. The features extracted by the network are passed to a Conditional Random Field (CRF), which decodes the final chord sequence. By building a deeper network architecture than in previous work, we obtain superior results. In addition, we compare the results of the proposed model on the Robbie Williams dataset with those of other chord recognition network architectures.
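The pipeline described above can be illustrated with a minimal sketch: a stack of residual hidden layers, followed by Viterbi decoding over a linear-chain CRF. This is not the authors' implementation. The ReLU activation, the placement of the skip connection, the equal layer widths, and the names `residual_block`, `deep_residual_forward`, and `viterbi` are all assumptions made for illustration; the abstract specifies only the depth (15 hidden layers) and the use of a CRF for decoding.

```python
def relu(v):
    return [max(0.0, x) for x in v]


def dense(v, weights, bias):
    # weights holds one row per output unit.
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weights, bias)]


def residual_block(v, weights, bias):
    # Assumed form of the skip connection: y = x + relu(W x + b).
    # Input and output widths must match for the addition to be valid.
    h = relu(dense(v, weights, bias))
    return [x + y for x, y in zip(v, h)]


def deep_residual_forward(v, layers):
    # layers is a list of (weights, bias) pairs, e.g. 15 of them.
    for weights, bias in layers:
        v = residual_block(v, weights, bias)
    return v


def viterbi(emissions, transitions):
    # emissions[t][k]: score of chord label k at frame t.
    # transitions[j][k]: score of moving from label j to label k.
    n_labels = len(emissions[0])
    score = list(emissions[0])
    backptr = []
    for frame in emissions[1:]:
        new_score, ptr = [], []
        for k in range(n_labels):
            best_j = max(range(n_labels),
                         key=lambda j: score[j] + transitions[j][k])
            new_score.append(score[best_j] + transitions[best_j][k] + frame[k])
            ptr.append(best_j)
        score = new_score
        backptr.append(ptr)
    # Trace the best path back from the highest-scoring final label.
    path = [max(range(n_labels), key=lambda k: score[k])]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return path
```

With transition scores that favour staying on the same label, the decoder smooths an isolated frame-level outlier: a frame-wise argmax over `[[2.0, 0.0], [0.0, 1.0], [2.0, 0.0]]` would give labels 0, 1, 0, whereas `viterbi` with transitions `[[1.0, -1.0], [-1.0, 1.0]]` keeps label 0 throughout. In a trained model, the emission scores would come from the network output and the transition scores would be learned; here both are hand-picked toy values.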
