From Sound to ‘Sense’ via Feature Extraction and Machine Learning: Deriving High-Level Descriptors for Characterising Music

Research in intelligent music processing is experiencing an enormous boost these days due to the emergence of the new application and research field of Music Information Retrieval (MIR). The rapid growth of digital music collections and the concomitant shift of the music market towards digital distribution urgently call for intelligent computational support in the automated handling of large amounts of digital music. Ideas for a large variety of content-based music services are currently being developed in the music industry and in the research community. They range from content-based music search engines to automatic music recommendation services, from intuitive interfaces on portable music players to methods for the automatic structuring and visualisation of large digital music collections, and from personalised radio stations to tools that permit the listener to actively modify and ‘play with’ the music as it is being played.

What all of these content-based services have in common is that they require the computer to ‘make sense of’ and ‘understand’ the actual content of the music: to recognise and extract musically, perceptually and contextually meaningful (‘semantic’) patterns from recordings, and to associate descriptors with the music that make sense to human listeners. A large variety of musical descriptors is potentially of interest, ranging from low-level features of the sound, such as its bass content or its harmonic richness, to high-level concepts such as “hip hop” or “sad music”. Semantic descriptors may also come in the form of atomic, discrete labels like “rhythmic” or “waltz”, or they may be complex, structured entities such as harmony and rhythmic structure.

As it is impossible to cover all of these in one coherent chapter, we limit ourselves to a particular class of semantic descriptors. This chapter focuses on methods for automatically extracting high-level atomic descriptors for the characterisation of music. It will be shown how such high-level terms can be inferred via a combination of bottom-up audio descriptor extraction and the application of machine learning algorithms. It will also be shown that meaningful descriptors can be extracted not only from an analysis of the music (audio) itself, but also from extra-musical sources, such as the internet (via ‘web mining’).
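To make the bottom-up pipeline concrete, the following is a minimal, illustrative sketch, not the system described in this chapter: low-level timbral features (here MFCC statistics computed with the librosa library) are extracted from audio and a standard classifier from scikit-learn learns the mapping to a high-level atomic descriptor such as a genre or mood label. The file names, the choice of MFCCs, and the SVM classifier are assumptions made for illustration only.

```python
# Sketch of bottom-up descriptor extraction + supervised learning
# (hypothetical example; library choices and file names are placeholders).
import numpy as np
import librosa
from sklearn.svm import SVC

def describe(path):
    """Summarise a recording by the mean and std of its MFCC frames."""
    y, sr = librosa.load(path, sr=22050, mono=True, duration=30.0)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# Hypothetical labelled collection: (audio file, high-level descriptor).
tracks = [("song1.wav", "waltz"), ("song2.wav", "hip hop")]  # ... more tracks
X = np.array([describe(p) for p, _ in tracks])
y = np.array([label for _, label in tracks])

# A support vector machine learns the association between the low-level
# feature vectors and the high-level labels.
clf = SVC(kernel="rbf")
clf.fit(X, y)

# Predict a descriptor for a previously unseen recording.
print(clf.predict([describe("unseen_song.wav")]))
```

In a realistic setting the feature set would be much richer (rhythmic, harmonic and web-derived features) and the model would be evaluated with cross-validation on a sizeable labelled collection; the sketch only shows the overall shape of the approach.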
