Anchor space for classification and similarity measurement of music

This paper describes a method of mapping music into a semantic space that can be used for similarity measurement, classification, and music information retrieval. The value along each dimension of this anchor space is computed as the output of a pattern classifier that is trained to measure a particular semantic feature. In anchor space, distributions that represent objects such as artists or songs are modeled with Gaussian mixture models, and several similarity measures are defined by computing approximations to the Kullback-Leibler divergence between distributions. These similarity measures are evaluated against human similarity judgements. The models are also used for artist classification, achieving 62% accuracy on a 25-artist set and 38% on a 404-artist set (where random guessing achieves about 0.25%, i.e. 1 in 404). Finally, we describe a music similarity browsing application that makes use of the fact that anchor space dimensions are meaningful to users.
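The Kullback-Leibler divergence between two Gaussian mixture models has no closed form, which is why approximations are needed. The abstract does not say which approximations are used, so the sketch below shows one common option as an assumption: a Monte Carlo estimate that draws samples from one mixture and averages the log-density ratio. The function names, the diagonal-covariance restriction, and the `(weights, means, variances)` tuple layout are all hypothetical choices made for illustration, not the paper's implementation.

```python
import numpy as np

def gmm_logpdf(x, weights, means, variances):
    """Log density of a diagonal-covariance GMM at points x of shape (n, d)."""
    x = np.atleast_2d(x)
    # Differences between every point and every component mean: (n, k, d)
    diff = x[:, None, :] - means[None, :, :]
    # Per-component Gaussian log density: (n, k)
    log_comp = -0.5 * (
        np.sum(diff ** 2 / variances[None, :, :], axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
    )
    log_weighted = log_comp + np.log(weights)[None, :]
    # Numerically stable log-sum-exp over components
    m = log_weighted.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(log_weighted - m).sum(axis=1, keepdims=True))).ravel()

def gmm_sample(n, weights, means, variances, rng):
    """Draw n points: pick a component per point, then add scaled noise."""
    comp = rng.choice(len(weights), size=n, p=weights)
    noise = rng.standard_normal((n, means.shape[1]))
    return means[comp] + noise * np.sqrt(variances[comp])

def kl_monte_carlo(p, q, n=20000, seed=0):
    """Estimate KL(p || q) as the sample mean of log p(x) - log q(x), x ~ p."""
    rng = np.random.default_rng(seed)
    x = gmm_sample(n, *p, rng)
    return float(np.mean(gmm_logpdf(x, *p) - gmm_logpdf(x, *q)))

def symmetric_kl(p, q, n=20000, seed=0):
    """Symmetrized divergence, usable as a dissimilarity score (assumption)."""
    return kl_monte_carlo(p, q, n, seed) + kl_monte_carlo(q, p, n, seed)

# Two hypothetical artist models in a 2-dimensional anchor space.
artist_a = (np.array([0.5, 0.5]),
            np.array([[0.0, 0.0], [1.0, 1.0]]),
            np.ones((2, 2)))
artist_b = (np.array([0.5, 0.5]),
            np.array([[2.0, 2.0], [3.0, 3.0]]),
            np.ones((2, 2)))
score = symmetric_kl(artist_a, artist_b)
```

A larger `score` means the two models are less similar. The estimate is noisy for small `n`, and the symmetrization is only one of several ways to turn a divergence into a similarity measure.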
