Deriving Musical Structures from Signal Analysis for Music Audio Summary Generation: "Sequence" and "State" Approach

In this paper, we investigate the derivation of musical structures directly from signal analysis, with the aim of generating visual and audio summaries. From the audio signal, we first derive features: either static features (MFCC, chromagram) or proposed dynamic features. Two approaches are then studied to automatically derive the structure of a piece of music. The sequence approach considers the audio signal as a repetition of sequences of events. Sequences are derived from the similarity matrix of the features by a proposed algorithm based on a 2D structuring filter and pattern matching. The state approach considers the audio signal as a succession of states. Since human segmentation and grouping perform better upon subsequent hearings, we follow this natural process with a proposed multi-pass approach that combines time segmentation and unsupervised learning methods. Both the sequence and the state representations are then used to create an audio summary using various techniques.
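As a rough illustration of the similarity matrix on which the sequence approach relies, the sketch below computes a cosine self-similarity matrix from a toy feature sequence and scores each time lag by the mean similarity along the corresponding off-diagonal. This is not the 2D structuring filter and pattern-matching algorithm proposed in the paper: the lag averaging is a simplified stand-in, and the function names (`self_similarity`, `lag_scores`) and the toy one-hot features are assumptions made for the example.

```python
import numpy as np


def self_similarity(features):
    """Cosine similarity between every pair of frames.

    features: array of shape (n_frames, n_dims), e.g. one MFCC or
    chroma vector per frame (here replaced by toy data).
    """
    features = np.asarray(features, dtype=float)
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # guard against zero vectors
    return unit @ unit.T


def lag_scores(sim):
    """Mean similarity along each off-diagonal (lag 1 .. n_frames - 1).

    A repeated sequence shows up as a bright off-diagonal stripe in the
    similarity matrix, hence a high mean at the corresponding lag.
    This is a simplified stand-in, not the paper's algorithm.
    """
    n = sim.shape[0]
    return np.array(
        [np.mean(np.diagonal(sim, offset=lag)) for lag in range(1, n)]
    )


# Toy "piece": three distinct events A, B, C played twice (A B C A B C).
pattern = np.eye(3)
features = np.vstack([pattern, pattern])

sim = self_similarity(features)
scores = lag_scores(sim)
best_lag = int(np.argmax(scores)) + 1  # lags start at 1

print("similarity matrix shape:", sim.shape)
print("lag with strongest repetition:", best_lag)
```

On real audio the features would be frame-level descriptors rather than one-hot vectors, and the off-diagonal stripes would be noisy and fragmented, which is presumably why the paper filters the matrix before matching patterns.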
