Audio-Based Melody Categorization: Exploring Signal Representations and Evaluation Strategies

Melody categorization refers to the task of grouping a set of melodies into categories of similar items that originate from the same melodic contour. From a computational perspective, automatic melody categorization is important for the automatic organization of databases and for large-scale musicological studies, particularly in the context of folk music and non-Western music traditions. We investigate methods that start from the raw audio file. For each recording in a collection, we extract a pitch sequence representing the main melodic line. We then estimate pairwise similarities between these sequences and evaluate the discriminative power of the resulting similarity matrix with respect to ground-truth annotations. We propose novel evaluation methodologies, compare melody representations, and explore the potential of our approach in two applications: interstyle and intrastyle categorization of flamenco music, and tune-family recognition of folk-song recordings.
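The pipeline outlined above (pitch sequences, pairwise similarities, evaluation against ground-truth labels) can be illustrated with a minimal sketch. The abstract does not name the similarity measure or the evaluation metric, so both are assumptions here: dynamic time warping (DTW) with an absolute-difference local cost stands in for the similarity estimate, and leave-one-out nearest-neighbor accuracy stands in for the evaluation. The function names and the toy pitch sequences are hypothetical, and pitch extraction from audio is assumed to have been done already.

```python
import math


def dtw_distance(a, b):
    """Length-normalized DTW cost between two pitch sequences.

    Assumption: pitches are numeric (e.g. semitones relative to a tonic)
    and the local cost is the absolute pitch difference.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also aligned with earlier b
                cost[i][j - 1],      # b[j-1] also aligned with earlier a
                cost[i - 1][j - 1],  # one-to-one step
            )
    return cost[n][m] / (n + m)


def pairwise_distances(sequences):
    """Symmetric matrix of DTW distances between all pairs of sequences."""
    k = len(sequences)
    dist = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            dist[i][j] = dist[j][i] = dtw_distance(sequences[i], sequences[j])
    return dist


def nearest_neighbor_accuracy(dist, labels):
    """Fraction of items whose closest other item carries the same label."""
    hits = 0
    for i, row in enumerate(dist):
        nearest = min((j for j in range(len(row)) if j != i), key=lambda j: row[j])
        hits += labels[nearest] == labels[i]
    return hits / len(labels)


# Toy data (hypothetical): two ascending and two descending contours.
melodies = [
    [0, 2, 4, 5, 7],
    [0, 2, 2, 4, 5, 7],
    [7, 5, 4, 2, 0],
    [7, 7, 5, 4, 2, 0],
]
labels = ["A", "A", "B", "B"]

dist = pairwise_distances(melodies)
accuracy = nearest_neighbor_accuracy(dist, labels)
print(accuracy)
```

In this toy case, DTW absorbs the repeated notes, so each melody's nearest neighbor is the variant with the same contour. A real system would also need to handle transposition, unvoiced frames, and tempo differences, which this sketch ignores.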
