Gaussian Mixture Models For Extraction Of Melodic Lines From Audio Recordings

This study addresses the extraction of melodic lines from polyphonic audio recordings. Our approach is based on the expectation-maximization (EM) algorithm, applied in a two-step procedure that finds melodic lines in audio signals. In the first step, EM is used to find regions of the signal with strong and stable pitch (melodic fragments). In the second step, these fragments are grouped into clusters according to their properties (pitch, loudness...). The resulting clusters represent distinct melodic lines. Gaussian mixture models trained with EM are used for the clustering. The paper presents the entire process in more detail and gives some initial results.
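
As an illustration of the second step only, the following minimal sketch fits a Gaussian mixture model with EM to a handful of melodic fragments and assigns each fragment to its most probable component. It is not the paper's implementation: the two features (mean pitch as a MIDI note number, loudness in dB), the sample values, the diagonal covariances, the fixed number of components, and the names `fit_gmm`, `responsibilities` and `log_gauss` are all assumptions made for the example.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def responsibilities(points, weights, means, variances):
    """E-step: posterior probability of each component for each point."""
    out = []
    for p in points:
        logs = [math.log(max(w, 1e-300)) + log_gauss(p, m, v)
                for w, m, v in zip(weights, means, variances)]
        top = max(logs)  # subtract the maximum for numerical stability
        exps = [math.exp(l - top) for l in logs]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def fit_gmm(points, k, iters=50, var_floor=1e-3):
    """Fit a k-component diagonal GMM with EM; returns (weights, means, variances)."""
    n, d = len(points), len(points[0])
    # Deterministic initialisation (an assumption of this sketch):
    # evenly spaced points after sorting by the first feature.
    ordered = sorted(points)
    means = [list(ordered[(2 * j + 1) * n // (2 * k)]) for j in range(k)]
    centre = [sum(p[i] for p in points) / n for i in range(d)]
    spread = [max(sum((p[i] - centre[i]) ** 2 for p in points) / n, var_floor)
              for i in range(d)]
    variances = [spread[:] for _ in range(k)]
    weights = [1.0 / k] * k
    for _ in range(iters):
        resp = responsibilities(points, weights, means, variances)
        # M-step: re-estimate each component from its weighted points.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            for i in range(d):
                means[j][i] = sum(r[j] * p[i] for r, p in zip(resp, points)) / nj
                variances[j][i] = max(
                    sum(r[j] * (p[i] - means[j][i]) ** 2
                        for r, p in zip(resp, points)) / nj,
                    var_floor)
    return weights, means, variances

# Hypothetical fragments: (mean pitch as MIDI note number, loudness in dB).
fragments = [(71.8, -10.5), (72.4, -9.8), (73.1, -11.0), (70.9, -10.1),
             (40.2, -20.3), (39.5, -19.6), (41.0, -21.1), (40.6, -20.0)]

weights, means, variances = fit_gmm(fragments, k=2)
resp = responsibilities(fragments, weights, means, variances)
# Hard assignment: each fragment joins its most probable component.
labels = [max(range(len(r)), key=r.__getitem__) for r in resp]
print(labels)
```

Each component of the fitted mixture plays the role of one candidate melodic line, and the hard assignment at the end groups the fragments accordingly. The variance floor keeps a component from collapsing onto a single fragment. In practice the number of components would have to be chosen or estimated rather than fixed at two.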
