Beat Tracking using Recurrent Neural Network: A Transfer Learning Approach

Deep learning networks have been successfully applied to a large number of tasks. Their effectiveness, however, is limited by the amount and variety of data available for training, so they are most readily applied in scenarios where large amounts of data exist. In music information retrieval, this is the case for popular genres, thanks to the wider availability of annotated music pieces, whereas finding sufficient and useful data is difficult for less widespread genres such as traditional and folk music. Transfer learning has been proposed to address this issue: a network is trained on a large available dataset and the learned knowledge (the hierarchical representation) is then transferred to another task. In this work, we propose a transfer learning approach to beat tracking. We train a deep BLSTM-based RNN on popular music as the starting network and transfer it to track beats in Greek folk music. To evaluate the effectiveness of our approach, we collect a dataset of Greek folk music and manually annotate the pieces.
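
The following is a minimal sketch, in PyTorch, of the kind of transfer learning described above: a stacked bidirectional LSTM that outputs a frame-wise beat-activation function, pretrained on a large (popular-music) corpus and then fine-tuned on a small annotated folk-music set. The layer sizes, input features, file names, and the choice to freeze the recurrent layers are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class BLSTMBeatTracker(nn.Module):
    """Bidirectional LSTM producing a per-frame beat probability (sketch)."""

    def __init__(self, n_features=120, hidden_size=25, num_layers=3):
        super().__init__()
        # Stacked BLSTM over spectrogram frames (sizes are assumptions).
        self.blstm = nn.LSTM(
            input_size=n_features,
            hidden_size=hidden_size,
            num_layers=num_layers,
            batch_first=True,
            bidirectional=True,
        )
        # Frame-wise beat-activation output.
        self.out = nn.Linear(2 * hidden_size, 1)

    def forward(self, x):  # x: (batch, frames, n_features)
        h, _ = self.blstm(x)
        return torch.sigmoid(self.out(h)).squeeze(-1)  # (batch, frames)

model = BLSTMBeatTracker()
# Load weights pretrained on the large popular-music corpus
# ("pretrained_popular.pt" is a hypothetical file name):
# model.load_state_dict(torch.load("pretrained_popular.pt"))

# One possible transfer strategy: freeze the recurrent layers and fine-tune
# only the output layer on the small annotated folk-music dataset; an
# alternative is to fine-tune all layers with a low learning rate.
for p in model.blstm.parameters():
    p.requires_grad = False

optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)
criterion = nn.BCELoss()  # frame-wise beat / no-beat targets
```

In practice the resulting beat-activation curve is post-processed (e.g. by peak picking or a dynamic model) to obtain the final beat positions.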
