One Day in Half an Hour: Music Thumbnailing Incorporating Harmony- and Rhythm Structure

A variety of approaches exist for automatically retrieving the key part of a musical piece, its thumbnail. Most of these, however, do not model harmony or rhythm adequately. In this work we therefore introduce a thumbnailing approach that aims at adequate musical feature modeling. The rhythmic structure is extracted by an IIR comb-filter bank to obtain a segmentation based on beats and bars. Further, we extract chroma energy distribution normalized statistics (CENS) features of the segmented song, improving performance with dB(A) weighting and pitch correction. Harmonic similarities are determined by constructing and analyzing a similarity matrix based on the normalized scalar product of the feature vectors. Finally, thumbnails are found by borrowing techniques from image processing. Extensive test runs on roughly 24 h of music reveal the high effectiveness of our approach.
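The similarity-matrix step can be illustrated with a minimal sketch. It assumes one feature vector per bar (toy 3-dimensional vectors stand in for 12-dimensional chroma features) and computes the normalized scalar product between every pair of bars. The function names and the zero-vector handling are assumptions made for illustration. The abstract does not specify which image-processing techniques are used, so `best_diagonal_stripe` is only a brute-force stand-in that searches for the off-diagonal stripe with the highest mean similarity, that is, a passage that repeats later in the song.

```python
import math


def cosine_similarity(u, v):
    """Normalized scalar product of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: an all-zero (silent) bar matches nothing
    return dot / (norm_u * norm_v)


def similarity_matrix(features):
    """Pairwise similarity of all bars; entry [i][j] compares bar i and bar j."""
    n = len(features)
    return [[cosine_similarity(features[i], features[j]) for j in range(n)]
            for i in range(n)]


def best_diagonal_stripe(sim, length):
    """Hypothetical stand-in for the thumbnail search (not the paper's method).

    Returns (mean_similarity, start_bar, lag) of the stripe of `length` bars,
    parallel to the main diagonal, with the highest mean similarity.
    """
    n = len(sim)
    best = (float("-inf"), None, None)
    for lag in range(1, n - length + 1):
        for start in range(0, n - lag - length + 1):
            mean = sum(sim[start + k][start + lag + k]
                       for k in range(length)) / length
            if mean > best[0]:
                best = (mean, start, lag)
    return best


# Toy song with bar-level "harmonies" A B C A B: bars 0-1 repeat at bars 3-4.
A, B, C = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]
sim = similarity_matrix([A, B, C, A, B])
print(best_diagonal_stripe(sim, 2))  # (1.0, 0, 3)
```

In this toy case the two-bar passage starting at bar 0 recurs three bars later, so it would be the thumbnail candidate. A real system would operate on beat- and bar-synchronous chroma features and would need a more robust stripe detector than this exhaustive search.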
