Fast MIR in a Sparse Transform Domain

In this paper, we consider sparse audio coding as an alternative to transform audio coding for efficient music information retrieval (MIR) in the transform domain. We use an existing audio coder based on a sparse representation in a union of MDCT bases, and propose a fast algorithm to compute two mid-level representations: an onset detection function for beat tracking and a chromagram for chord recognition. The resulting transform-domain system is significantly faster than a comparable state-of-the-art system, while achieving similar performance at bitrates above 8 kbps.
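To make the idea of transform-domain mid-level features concrete, the following is a minimal sketch, not the algorithm proposed in the paper. It assumes the sparse decomposition has already been decoded into a list of atoms, each with a time position, a centre frequency, and an amplitude. The `Atom` record, the function names, the A4 = 440 Hz tuning, and the energy-flux onset function are all illustrative assumptions.

```python
import math
from collections import namedtuple

# Hypothetical atom record: time in seconds, centre frequency in Hz, amplitude.
# This is an illustrative stand-in, not the coder's actual bitstream format.
Atom = namedtuple("Atom", ["time", "freq", "amp"])


def pitch_class(freq, ref=440.0):
    """Map a frequency in Hz to a pitch class 0..11 (0 = C), assuming A4 = ref."""
    midi = 69 + 12 * math.log2(freq / ref)
    return int(round(midi)) % 12


def chromagram(atoms, hop, n_frames):
    """Accumulate atom energy into 12 pitch-class bins per analysis frame."""
    chroma = [[0.0] * 12 for _ in range(n_frames)]
    for a in atoms:
        frame = int(a.time / hop)
        if 0 <= frame < n_frames and a.freq > 0:
            chroma[frame][pitch_class(a.freq)] += a.amp ** 2
    return chroma


def onset_function(atoms, hop, n_frames):
    """Half-wave rectified frame-to-frame energy increase (a simple energy flux)."""
    energy = [0.0] * n_frames
    for a in atoms:
        frame = int(a.time / hop)
        if 0 <= frame < n_frames:
            energy[frame] += a.amp ** 2
    # The first frame has no predecessor, so its onset value is set to zero.
    return [0.0] + [max(0.0, energy[i] - energy[i - 1]) for i in range(1, n_frames)]


# Two toy atoms: an A4 in the first frame, a louder C4 in the second.
atoms = [Atom(0.05, 440.0, 1.0), Atom(0.15, 261.63, 2.0)]
chroma = chromagram(atoms, hop=0.1, n_frames=2)
odf = onset_function(atoms, hop=0.1, n_frames=2)
```

Both features are built in a single pass over the atoms, so the cost grows with the number of coded atoms rather than with the number of audio samples. This is one plausible reason a transform-domain approach can be fast at low bitrates.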
