Paper information - Identification of versions of the same musical composition by processing audio descriptions

Automatically making sense of digital information, and especially of digital music documents, is an important problem facing modern society. In fact, there are still many tasks that, although easily performed by humans, cannot be effectively performed by a computer. In this work we focus on one such task: the identification of musical piece versions (alternate renditions of the same musical composition, such as cover songs, live recordings, remixes, etc.). In particular, we adopt a computational approach based solely on the information provided by the audio signal. We propose a system for version identification that is robust to the main musical changes between versions, including timbre, tempo, key and structure changes. This system exploits nonlinear time series analysis tools and standard methods for quantitative music description, and it does not make use of a specific modeling strategy for data extracted from audio, i.e. it is a model-free system. We report remarkable accuracies for this system, both with our data and through an international evaluation framework. Indeed, according to this framework, our model-free approach achieves the highest accuracy among current version identification systems (at the time of writing this thesis). Model-based approaches are also investigated. For these we consider a number of linear and nonlinear time series models. We show that, although model-based approaches do not reach the highest accuracies, they present a number of advantages, especially with regard to computational complexity and parameter setting. In addition, we explore post-processing strategies for version identification systems, and show how unsupervised grouping algorithms allow the characterization and enhancement of the output of query-by-example systems such as version identification systems. To this end, we build and study a complex network of versions and apply clustering and community detection algorithms.
Overall, our work brings automatic version identification to an unprecedented stage where high accuracies are achieved and, at the same time, explores promising directions for future research. Although our steps are guided by the nature of the considered signals (music recordings) and the characteristics of the task at hand (version identification), we believe our methodology can be easily transferred to other contexts and domains.

Joan Serrà
