Paper information - Melody extraction from polyphonic music signals

Melody extraction from polyphonic music signals

Music was the first mass-market industry to be completely restructured by digital technology, and today we have access to thousands of tracks stored locally on our smartphones and millions of tracks through cloud-based music services. Given the vast quantity of music at our fingertips, we now require novel ways of describing, indexing, searching and interacting with musical content. In this thesis we focus on a technology that opens the door to a wide range of such applications: automatically estimating the pitch sequence of the melody directly from the audio signal of a polyphonic music recording, also referred to as melody extraction. Whilst identifying the pitch of the melody is something human listeners can do quite well, doing this automatically is highly challenging. We present a novel method for melody extraction based on the tracking and characterisation of the pitch contours that form the melodic line of a piece. We show how different contour characteristics can be exploited in combination with auditory streaming cues to identify the melody out of all the pitch content in a music recording, using both heuristic and model-based approaches. The performance of our method is assessed in an international evaluation campaign, where it is shown to obtain state-of-the-art results: it achieves the highest mean overall accuracy obtained by any algorithm that has participated in the campaign to date. We demonstrate the applicability of our method for both research and end-user applications by developing systems that exploit the extracted melody pitch sequence for similarity-based music retrieval (version identification and query-by-humming), genre classification, automatic transcription and computational music analysis. The thesis also provides a comprehensive comparative analysis and review of the current state of the art in melody extraction and a first-of-its-kind analysis of melody extraction evaluation methodology.

Justin Salamon
