Multiple Scale Music Segmentation Using Rhythm, Timbre, and Harmony

Segmenting music into sections such as intro, verse, chorus, and outro is a difficult problem. This paper presents a method for automatic segmentation based on features related to rhythm, timbre, and harmony. The features are compared with one another and with manual segmentations of a database of 48 songs. The comparison uses standard information retrieval performance measures and shows that the timbre-related feature performs best.
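As an illustration of how such a comparison against manual segmentation might be scored, the sketch below computes precision, recall, and F-measure for estimated segment boundaries. The abstract does not say which measures or matching rule were used, so everything here is an assumption: the function names, the 3-second tolerance, the greedy one-to-one matching, and the example boundary times are hypothetical and not taken from the paper.

```python
def count_hits(estimated, reference, tolerance=3.0):
    """Count estimated boundaries that fall within `tolerance` seconds of a
    reference boundary, using each reference boundary at most once.
    The greedy nearest-neighbour matching is an illustrative assumption."""
    unmatched = sorted(reference)
    hits = 0
    for t in sorted(estimated):
        if not unmatched:
            break
        # Closest reference boundary that has not been matched yet.
        nearest = min(unmatched, key=lambda r: abs(r - t))
        if abs(nearest - t) <= tolerance:
            hits += 1
            unmatched.remove(nearest)
    return hits


def precision_recall_f(estimated, reference, tolerance=3.0):
    """Return (precision, recall, F-measure) for boundary detection."""
    hits = count_hits(estimated, reference, tolerance)
    precision = hits / len(estimated) if estimated else 0.0
    recall = hits / len(reference) if reference else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical boundary times in seconds, for illustration only.
estimated = [0.5, 31.0, 62.0, 100.0]
reference = [0.0, 30.0, 60.0, 90.0, 120.0]
p, r, f = precision_recall_f(estimated, reference)
print(f"precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

Greedy matching keeps the sketch short; an optimal bipartite assignment can give a slightly higher hit count when boundaries are closely spaced.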
