A Comparison of Sound Onset Detection Algorithms with Emphasis on Psychoacoustically Motivated Detection Functions

Whilst many onset detection algorithms for musical events in audio signals have been proposed, comparative studies of their efficacy for segmentation tasks are much rarer. This paper follows the lead of Bello et al. 04, using the same hand-marked test database as a benchmark for comparison. That earlier study did not include a psychoacoustically motivated algorithm originally proposed by Klapuri in 1999, an omission corrected here by testing a number of variants of that model. The primary test domains are non-pitched percussive (NPP) and pitched non-percussive (PNP) sound events. Sixteen detection functions are investigated, including a number of novel and recently published models. Different detection functions perform best in each domain, and onset detection is substantially worse overall for the PNP case. It is contended that the NPP case is effectively solved by fast intensity change discrimination processes, but that stable pitch cues may provide a better tactic for the PNP case.
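As an illustration of what an intensity-change detection function looks like in the NPP setting, the following is a minimal sketch. It is not one of the sixteen functions evaluated in the paper. The frame size, hop, threshold, the function names, and the synthetic test signal are all assumptions chosen for the example. It computes frame energies, takes the half-wave rectified difference of log energy, and reports local maxima above a fixed threshold.

```python
import math

def frame_energies(signal, frame_size=64, hop=32):
    """Sum of squared samples in each overlapping frame (assumed sizes)."""
    energies = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = signal[start:start + frame_size]
        energies.append(sum(x * x for x in frame))
    return energies

def detection_function(energies, floor=1e-10):
    """Half-wave rectified first difference of log frame energy."""
    log_e = [math.log(e + floor) for e in energies]
    return [0.0] + [max(0.0, b - a) for a, b in zip(log_e, log_e[1:])]

def pick_onsets(df, threshold=2.0):
    """Frame indices that are local maxima above a fixed (assumed) threshold."""
    onsets = []
    for i in range(1, len(df) - 1):
        if df[i] > threshold and df[i] >= df[i - 1] and df[i] > df[i + 1]:
            onsets.append(i)
    return onsets

# Hypothetical test signal: 512 samples of silence, then a decaying burst.
signal = [0.0] * 512 + [
    math.exp(-n / 200.0) * math.sin(0.3 * n) for n in range(512)
]

energies = frame_energies(signal)
df = detection_function(energies)
onset_frames = pick_onsets(df)
print(onset_frames)
```

On this synthetic signal the only frame reported is the first one that overlaps the burst. A sketch of this kind responds to abrupt energy rises, which is why it suits percussive material. It has no mechanism for the slow attacks typical of PNP sounds, which is where the abstract suggests stable pitch cues may help.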

[1]  Eric D. Scheirer, et al. Towards music understanding without separation: segmenting music with correlogram comodulation, 1999, Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452).

[2]  M. Davies, et al. A Comparison Between Fixed and Multiresolution Analysis for Onset Detection in Musical Signals, 2004.

[3]  J. W. Gordon. The perceptual attack time of musical tones, 1987, The Journal of the Acoustical Society of America.

[4]  D. A. Eddins, et al. Chapter 6 – Temporal Integration and Temporal Resolution, 1995.

[5]  Joseph Timoney, et al. Implementing Loudness Models in MATLAB, 2004.

[6]  Hugo Fastl, et al. Psychoacoustics: Facts and Models, 2nd updated edition, 1999.

[7]  Paul Masri, et al. Improved Modelling of Attack Transients in Music Analysis-Resynthesis, 1996, ICMC.

[8]  Jaakko Astola, et al. Analysis of the meter of acoustic musical signals, 2006, IEEE Transactions on Audio, Speech, and Language Processing.

[9]  M. Davies, et al. Complex domain onset detection for musical signals, 2003.

[10]  Kristoffer Jensen, et al. A Causal Rhythm Grouping, 2004, CMMR.

[11]  Japanese Standards Association (日本規格協会). Acoustics: normal equal-loudness-level contours, 2004.

[12]  Anssi Klapuri, et al. Sound onset detection by applying psychoacoustic knowledge, 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).

[13]  Malcolm D. Macleod, et al. Onset Detection in Musical Audio Signals, 2003, ICMC.

[14]  A. Spanias, et al. Perceptual coding of digital audio, 2000, Proceedings of the IEEE.

[15]  Kristoffer Jensen, et al. Real-Time Beat Estimation Using Feature Extraction, 2003, CMMR.

[16]  Jacqueline Walker, et al. Time Domain Note Average Energy Based Music Onset Detection, 2003.

[17]  Tristan Jehan. Event-Synchronous Music Analysis/Synthesis, 2004.

[18]  Mark B. Sandler, et al. A tutorial on onset detection in music signals, 2005, IEEE Transactions on Speech and Audio Processing.

[19]  Thomas Baer, et al. A model for the prediction of thresholds, loudness, and partial loudness, 1997.

[20]  Gaël Richard, et al. Methodology and Tools for the evaluation of automatic onset detection algorithms in music, 2004, ISMIR.

[21]  Simon Dixon, et al. Automatic Extraction of Tempo and Beat From Expressive Performances, 2001.