Semantic segmentation and summarization of music: methods based on tonality and recurrent structure

This paper describes a study on automatic music segmentation and summarization from audio signals. It examines the nature of human perception of music and offers practical solutions to difficult machine-intelligence problems in automated multimedia content analysis and information retrieval. Specifically, three problems are addressed: segmentation based on tonality analysis, segmentation based on recurrent structural analysis, and summarization. Experimental results are evaluated quantitatively and demonstrate the promise of the proposed methods.
