An Explorative, Hierarchical User Interface to Structured Music Repositories

Owing to efficient compression formats such as MP3, the number and size of digital music repositories have increased dramatically over the past few years, and the demand for obtaining digital music from the Internet has risen with them. Effective methods for finding pieces of music in such repositories are therefore becoming increasingly important. Traditional user interfaces that provide only text-based search require the user to already know certain textual properties of the songs he/she is looking for (e.g. the name of the artist or album). In contrast, the prototype user interface developed by the author is based on graphical visualizations of the musical similarities between the pieces contained in the repository. This enables the user to browse the collection exploratively, an approach that is especially useful for discovering previously unknown pieces of music. In order to provide different views of the music collection, two algorithms were chosen to process the audio signals. These algorithms measure musical similarity according to rhythmic and timbral properties. The developed user interface “ViSMuC” (Visualization of Structured Music Collections) implements an artificial intelligence method called Aligned Self-Organizing Maps, in which high-dimensional data are represented on a two-dimensional map. The pieces of music are visualized according to an adjustable weighting of their rhythmic and timbral properties. Similar pieces form clusters, and the resulting groups are colored according to the number of songs they represent. Different colormaps are available for this purpose. Since showing all pieces of a medium-sized or large collection on a single map would yield an overly complex and thus unusable visualization, the user interface contains two hierarchical components. Firstly, for each region of the map that represents a large number of songs, a new map is provided.
Secondly, the directory structure of the repository is taken into account, since it usually forms a meaningful hierarchy. Another important part of the user interface is the visualization of arbitrary meta-information, which can be taken, for example, from ID3 attributes or external databases. The employed technique illustrates how the values assigned to the meta-information attributes are distributed over the complete map. Together with visualizations based on the features obtained from the similarity measures and their projection onto the map, the images showing these distributions facilitate the interpretation of the map. To test the user interface, the author created a test repository of more than 800 MP3 files and entered various meta-data into a database. Finally, a short usability study was conducted, and suggestions for applications as well as for improvements to the prototype were elaborated.

Motivation and Introduction

Over the past few years, the demand for digitally stored music has risen drastically. The availability of algorithms for music compression, especially MPEG-1 Audio Layer III (MP3), together with high-speed Internet access, has led to an enormous increase in digital music distribution (DMD). The growing number of large music databases, which are very important for commercial music services such as AMG All Music Guide¹, Amazon² or iTunes³, to name just a few, raises the demand for methods to browse and search such repositories efficiently. Most existing interfaces perform quite well when the task is to find music by a given artist or on a specified album, i.e. when the user knows exactly what he/she is looking for, but they are unsuitable for helping the user discover unknown music. For this reason, a user interface based on Self-Organizing Maps (SOMs), neural networks used to cluster high-dimensional data, has been developed.
The data consist

¹ http://www.allmusic.com
² http://www.amazon.at
³ http://www.apple.com/itunes
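
The map-based organization outlined in the abstract can be illustrated with a minimal sketch. It is not the ViSMuC implementation and it does not implement Aligned Self-Organizing Maps; it trains a single plain SOM in pure Python. All names (`combine`, `train_som`, `hit_counts`), the toy feature vectors, the weighting scheme, the training parameters and the "more than one song" threshold are assumptions made purely for illustration. The sketch shows three ideas from the text: an adjustable weighting between rhythmic and timbral features, the projection of songs onto a two-dimensional grid, and counting songs per map unit, which is the quantity a colormap or a hierarchical sub-map could be driven by.

```python
import math
import random

def combine(rhythm, timbre, w):
    """Scale both feature vectors so that the squared Euclidean distance on the
    result equals w * d_rhythm^2 + (1 - w) * d_timbre^2 (illustrative weighting)."""
    a, b = math.sqrt(w), math.sqrt(1.0 - w)
    return [a * x for x in rhythm] + [b * x for x in timbre]

def sq_dist(u, v):
    """Squared Euclidean distance between two equally long vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))

def best_unit(units, vec):
    """Return the (row, col) key of the map unit whose model vector is closest."""
    return min(units, key=lambda rc: sq_dist(units[rc], vec))

def train_som(data, rows, cols, iterations=2000, seed=0):
    """Train a plain SOM with a Gaussian neighbourhood on a rectangular grid."""
    rng = random.Random(seed)
    dim = len(data[0])
    units = {(r, c): [rng.random() for _ in range(dim)]
             for r in range(rows) for c in range(cols)}
    radius0 = max(rows, cols) / 2.0
    for t in range(iterations):
        frac = 1.0 - t / iterations            # decays linearly from 1 towards 0
        lr = 0.5 * frac                        # learning rate (assumed schedule)
        radius = max(radius0 * frac, 0.5)      # neighbourhood radius on the grid
        vec = rng.choice(data)
        br, bc = best_unit(units, vec)
        for (r, c), model in units.items():
            grid_d2 = (r - br) ** 2 + (c - bc) ** 2
            h = math.exp(-grid_d2 / (2.0 * radius ** 2))
            for i in range(dim):
                model[i] += lr * h * (vec[i] - model[i])
    return units

def hit_counts(units, data):
    """Count how many data vectors are mapped to each unit."""
    counts = {rc: 0 for rc in units}
    for vec in data:
        counts[best_unit(units, vec)] += 1
    return counts

# Toy collection: (rhythm features, timbre features) per song, values made up.
songs = [
    ([0.9, 0.8], [0.1, 0.2]),
    ([0.9, 0.8], [0.1, 0.2]),   # same features as the first song
    ([0.1, 0.2], [0.8, 0.9]),
    ([0.2, 0.1], [0.9, 0.7]),
]
weight = 0.5                     # 1.0 = rhythm only, 0.0 = timbre only
data = [combine(r, t, weight) for r, t in songs]
som = train_som(data, rows=3, cols=3)
counts = hit_counts(som, data)

# Units holding more than one song would be candidates for a finer sub-map.
crowded = [rc for rc, n in counts.items() if n > 1]
print(counts)
print(crowded)
```

Changing `weight` and retraining yields a different view of the same collection. Aligned SOMs, by contrast, train a stack of mutually constrained maps so that the views change gradually as the weighting is adjusted; that coupling is deliberately left out here.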
