An auditory model for the detection of perceptual onsets and beat tracking in singing

We describe a biophysically motivated model of auditory salience and present results which show that the derived measure of salience can be used to successfully identify the position of perceptual onsets in a musical stimulus. We evaluate the method using a corpus of unaccompanied, freely sung stimuli. We briefly show that perceptual onsets detected by the model are in good agreement with those identified by a combination of state-of-the-art algorithms and manual correction. We show that this continuous measure of salience can be used to track and predict rhythmic structure on the basis of its periodicity, thus avoiding the necessity for ad hoc decisions as to whether, or when, an event has occurred.
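To illustrate what using the periodicity of a continuous salience signal can look like in practice, here is a minimal sketch in Python. It is not the model described above: plain autocorrelation stands in as a generic periodicity estimator, the salience trace is synthetic, and the function name `estimate_period` and its parameters (`sample_rate`, `min_period`, `max_period`) are hypothetical names chosen for this example. Note that no discrete onset decisions are made at any point; the period is read directly from the continuous signal.

```python
import numpy as np


def estimate_period(salience, sample_rate, min_period=0.2, max_period=2.0):
    """Estimate the dominant period (in seconds) of a continuous salience signal.

    Illustrative assumption: autocorrelation is used as the periodicity
    measure; the source does not specify this technique.
    """
    x = np.asarray(salience, dtype=float)
    x = x - x.mean()  # remove the DC offset so it does not dominate the result
    # Full autocorrelation; index len(x) - 1 corresponds to zero lag.
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]
    # Restrict the search to a plausible range of beat periods.
    lo = int(round(min_period * sample_rate))
    hi = min(int(round(max_period * sample_rate)), len(acf) - 1)
    lag = lo + int(np.argmax(acf[lo:hi + 1]))
    return lag / sample_rate


# Synthetic salience trace (an assumption, not data from the paper):
# a sharp rise followed by a decay, repeating every 50 samples (0.5 s).
sample_rate = 100  # samples per second
n = np.arange(10 * sample_rate)
phase = (n % 50) / sample_rate  # seconds since the most recent peak
salience = np.exp(-(phase / 0.05) ** 2)

period = estimate_period(salience, sample_rate)
print(f"estimated period: {period:.2f} s")
```

The search range of 0.2 to 2.0 seconds is an arbitrary choice for this sketch; a real system would need to pick it to suit the material and would also have to cope with tempo drift, which is common in unaccompanied singing.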
