A Supervised Classification Algorithm for Note Onset Detection

This paper presents a novel approach to detecting note onsets in music audio files. We use a supervised learning algorithm to classify spectrogram frames extracted from digital audio as onsets or non-onsets. The classifier output is then passed to a simple peak-picking algorithm based on a moving average, which selects the final onset positions. We present two versions of this approach. The first uses a single neural network classifier. The second combines the predictions of several networks trained with different hyperparameters. We describe the algorithm in detail and summarize the performance of both variants on several datasets. We also examine our choice of hyperparameters by reporting the results of cross-validation experiments on a custom dataset. We conclude that a supervised learning approach to note onset detection performs well and warrants further investigation.
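The two post-classification steps mentioned above, combining the outputs of several networks and picking peaks against a moving average, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the centered window, the `half_width` and `offset` parameters, and the use of a plain mean to combine networks are all assumptions made for illustration, and the per-frame scores are taken as given rather than produced by a trained classifier.

```python
def merge_predictions(predictions):
    """Average per-frame onset scores from several classifiers.

    Assumption: a plain mean; the paper may combine networks differently.
    """
    n_frames = len(predictions[0])
    return [sum(p[i] for p in predictions) / len(predictions)
            for i in range(n_frames)]


def moving_average(scores, half_width):
    """Mean of the scores in a window centered on each frame (clipped at the edges)."""
    averaged = []
    for i in range(len(scores)):
        lo = max(0, i - half_width)
        hi = min(len(scores), i + half_width + 1)
        averaged.append(sum(scores[lo:hi]) / (hi - lo))
    return averaged


def pick_onsets(scores, half_width=3, offset=0.1):
    """Return indices of frames that are local maxima and exceed the
    moving average by more than `offset` (both parameters are hypothetical)."""
    baseline = moving_average(scores, half_width)
    onsets = []
    for i in range(1, len(scores) - 1):
        is_local_max = scores[i] > scores[i - 1] and scores[i] >= scores[i + 1]
        if is_local_max and scores[i] > baseline[i] + offset:
            onsets.append(i)
    return onsets


if __name__ == "__main__":
    # Made-up scores standing in for classifier output on ten frames.
    frame_scores = [0.0, 0.1, 0.9, 0.2, 0.0, 0.0, 0.1, 0.8, 0.1, 0.0]
    print(pick_onsets(frame_scores))
```

Comparing each frame against a local average rather than a fixed global threshold lets the decision adapt to passages where the classifier output is uniformly high or low; the window size and offset would need tuning on held-out data.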
