Paper Information - An Introduction to Audio Content Analysis: Applications in Signal Processing and Music Informatics

An Introduction to Audio Content Analysis: Applications in Signal Processing and Music Informatics

With the proliferation of digital audio distribution over digital media, audio content analysis is fast becoming a requirement for designers of intelligent signal-adaptive audio processing systems. Written by a well-known expert in the field, this book provides quick access to different analysis algorithms and allows comparison between different approaches to the same task, making it useful for newcomers to audio signal processing and industry experts alike. A review of relevant fundamentals in audio signal processing, psychoacoustics, and music theory is also included, as are downloadable MATLAB files. Please visit the companion website: www.AudioContentAnalysis.org

Alexander Lerch

[1] C. Krumhansl,et al. Isolating the dynamic attributes of musical timbre. , 1993, The Journal of the Acoustical Society of America.

[2] Eric Allamanche,et al. MPEG-7 Scalable Robust Audio Fingerprinting , 2002 .

[3] Robert O. Gjerdingen,et al. The psychology of music , 2002 .

[4] S. Lakatos. A common perceptual space for harmonic and percussive timbres , 2000, Perception & psychophysics.

[5] Isabelle Guyon,et al. An Introduction to Variable and Feature Selection , 2003, J. Mach. Learn. Res..

[6] Malcolm Slaney,et al. Construction and evaluation of a robust multifeature speech/music discriminator , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[7] S. Dixon,et al. Analysis of tempo classes in performances of Mozart sonatas , 2001 .

[8] Emery Schubert. Modeling Perceived Emotion With Continuous Musical Features , 2004 .

[9] J. Miles,et al. Applying regression & correlation : a guide for students and researchers , 2001 .

[10] Anssi Klapuri,et al. Musical Instrument Recognition in Polyphonic Audio Using Source-Filter Model for Sound Separation , 2009, ISMIR.

[11] Gerhard Widmer,et al. Classification of dance music by periodicity patterns , 2003, ISMIR.

[12] Petri Toiviainen,et al. Autocorrelation in meter induction: the role of accent structure. , 2006, The Journal of the Acoustical Society of America.

[13] H. P. Weld,et al. An Experimental Study of Musical Enjoyment , 1912 .

[14] Christophe d'Alessandro,et al. An iterative algorithm for decomposition of speech signals into periodic and aperiodic components , 1998, IEEE Trans. Speech Audio Process..

[15] Peter Desain,et al. A structurally guided method for the decomposition of expression in music performance. , 2006, The Journal of the Acoustical Society of America.

[16] Jonathan Foote,et al. Content-based retrieval of music and audio , 1997, Other Conferences.

[17] John Z. Zhang,et al. Camel: a lightweight framework for content-based audio and music analysis , 2010, Audio Mostly Conference.

[18] Gaël Richard,et al. Musical instrument recognition by pairwise classification strategies , 2006, IEEE Transactions on Audio, Speech, and Language Processing.

[19] Nick Collins,et al. SCMIR: A SuperCollider Music Information Retrieval Library , 2011, ICMC.

[20] A. Klapuri,et al. MUSICAL METER ESTIMATION AND MUSIC TRANSCRIPTION , 2000 .

[21] J. Wolfe,et al. Spectral centroid and timbre in complex, multiple instrumental textures , 2004 .

[22] Gianpaolo Evangelista,et al. Time and Frequency Warping Musical Signals , 2011 .

[23] David García,et al. CLAM: an OO framework for developing audio and music applications , 2002, OOPSLA '02.

[24] Roger B. Dannenberg,et al. An On-Line Algorithm for Real-Time Accompaniment , 1984, ICMC.

[25] Seungjae Lee,et al. Audio fingerprinting based on normalized spectral subband centroids , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..

[26] Malcolm D. Macleod,et al. Onset Detection in Musical Audio Signals , 2003, ICMC.

[27] Alexander Lerch,et al. Evaluation of Features for Audio-to-Audio Alignment , 2011 .

[28] Gerhard Widmer,et al. MATCH: A Music Alignment Tool Chest , 2005, ISMIR.

[29] V. A. Kotel’nikov. On the transmission capacity of 'ether' and wire in electric communications , 2006 .

[30] Malcolm Slaney,et al. Automatic chord recognition from audio using a supervised HMM trained with audio-from-symbolic data , 2006, AMCMM '06.

[31] K. Scherer. Which Emotions Can be Induced by Music? What Are the Underlying Mechanisms? And How Can We Measure Them? , 2004 .

[32] J. Sloboda. The Musical Mind: The Cognitive Psychology of Music , 1987 .

[33] Alicja Wieczorkowska,et al. Music Information Retrieval , 2009, Encyclopedia of Data Warehousing and Mining.

[34] H. Nyquist,et al. Certain Topics in Telegraph Transmission Theory , 1928, Transactions of the American Institute of Electrical Engineers.

[35] Gabriel Gatzsche,et al. Interaction with Tonal Pitch Spaces , 2008, NIME.

[36] Thomas Fillon,et al. YAAFE, an Easy to Use and Efficient Audio Feature Extraction Software , 2010, ISMIR.

[37] Anssi Klapuri,et al. Musical instrument recognition using cepstral coefficients and temporal features , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).

[38] Bridget Baird,et al. The artificially intelligent computer performer: The second generation , 1990 .

[39] G. Soete,et al. Perceptual scaling of synthesized musical timbres: Common dimensions, specificities, and latent subject classes , 1995, Psychological research.

[40] Fabien Gouyon,et al. A beat induction method for musical audio signals , 2003 .

[41] Guodong Guo,et al. Content-based audio classification and retrieval by support vector machines , 2003, IEEE Trans. Neural Networks.

[42] George Tzanetakis,et al. Pitch Histograms in Audio and Symbolic Music Information Retrieval , 2003, ISMIR.

[43] Xavier Serra,et al. Evaluation in Music Information Retrieval , 2013, Journal of Intelligent Information Systems.

[44] Fabian Mörchen,et al. Modeling timbre distance with temporal statistics from polyphonic music , 2006, IEEE Transactions on Audio, Speech, and Language Processing.

[45] B H Repp,et al. The dynamics of expressive piano performance: Schumann's "Träumerei" revisited. , 1996, Journal of the Acoustical Society of America.

[46] B. Repp. A microcosm of musical expression. I. Quantitative analysis of pianists' timing in the initial measures of Chopin's Etude in E major. , 1998, The Journal of the Acoustical Society of America.

[47] Elias Pampalk,et al. Content-based organization and visualization of music archives , 2002, MULTIMEDIA '02.

[48] Pedro Cano,et al. A Review of Audio Fingerprinting , 2005, J. VLSI Signal Process..

[49] Anssi Klapuri,et al. Multipitch estimation and sound separation by the spectral smoothness principle , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).

[50] Patrik N. Juslin. Studies of music performance: A theoretical analysis of empirical findings , 2003 .

[51] Roger B. Dannenberg,et al. Enhanced Vocal Performance Tracking Using Multiple Information Sources , 1998, ICMC.

[52] Bret Aarden,et al. How the Timing Between Notes Can Impact Musical Meaning , 2006 .

[53] Ian Kaminskyj,et al. Automatic source identification of monophonic musical instrument sounds , 1995, Proceedings of ICNN'95 - International Conference on Neural Networks.

[54] Eliezer Rapoport. The Marvels of the Human Voice : Poem-Melody-Vocal Performance , 2007 .

[55] Juan Pablo Bello,et al. A Robust Mid-Level Representation for Harmonic Content in Music Signals , 2005, ISMIR.

[56] Xavier Serra,et al. Towards Instrument Segmentation for Music Content Description: a Critical Review of Instrument Classification Techniques , 2000, ISMIR.

[57] William G. Gardner,et al. Efficient Convolution without Input/Output Delay , 1995 .

[58] Juan Pablo Bello,et al. Automated Music Emotion Recognition: A Systematic Evaluation , 2010 .

[59] Cheng Yang. MACS: music audio characteristic sequence indexing for similarity retrieval , 2001, Proceedings of the 2001 IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics (Cat. No.01TH8575).

[60] Renee Timmers,et al. Predicting the similarity between expressive performances of music from measurements of tempo and dynamics. , 2005, The Journal of the Acoustical Society of America.

[61] Christian Uhle,et al. ESTIMATION OF TEMPO, MICRO TIME AND TIME SIGNATURE FROM PERCUSSIVE MUSIC , 2003 .

[62] Martin F. McKinney,et al. Perceptual evaluation of music similarity , 2006, ISMIR.

[63] Guy J. Brown,et al. Application of missing feature theory to the recognition of musical instruments in polyphonic audio , 2003, ISMIR.

[64] S. McAdams,et al. Acoustic correlates of timbre space dimensions: a confirmatory study using synthetic tones. , 2005, The Journal of the Acoustical Society of America.

[65] H. Honing,et al. On music performance, theories, measurement and diversity , 2002 .

[66] A. Röbel. Onset Detection in Polyphonic Signals by means of Transient Peak Classification , 2005 .

[67] S. S. Stevens,et al. Hearing, Its Psychology And Physiology , 1983 .

[68] Xavier Rodet,et al. MUSICAL INSTRUMENT IDENTIFICATION IN CONTINUOUS RECORDINGS , 2004 .

[69] Stan Z. Li,et al. Content-based audio classification and retrieval using the nearest feature line method , 2000, IEEE Trans. Speech Audio Process..

[70] Arthur Flexer,et al. A Closer Look on Artist Filters for Musical Genre Classification , 2007, ISMIR.

[71] E. B. Newman,et al. A Scale for the Measurement of the Psychological Magnitude Pitch , 1937 .

[72] Judith C. Brown,et al. An efficient algorithm for the calculation of a constant Q transform , 1992 .

[73] N. Wiener. The Wiener RMS (Root Mean Square) Error Criterion in Filter Design and Prediction , 1949 .

[74] Helmut Neuschmied,et al. Robust Sound Modeling for Song Detection in Broadcast Audio , 2002 .

[75] T Nakamura,et al. The communication of dynamics between musicians and listeners through musical performance. , 1989, Perception & psychophysics.

[76] Roger B. Dannenberg,et al. Real-Time Computer Accompaniment of Keyboard Performances , 1985, ICMC.

[77] John M. Geringer. Continuous Loudness Judgments of Dynamics in Recorded Music Excerpts , 1995 .

[78] J. Reiss,et al. ONSET DETECTION COMBINING ENERGY-BASED AND PITCH-BASED APPROACHES , 2007 .

[79] B H Repp,et al. Detecting deviations from metronomic timing in music: Effects of perceptual structure on the mental timekeeper , 1999, Perception & psychophysics.

[80] Edward W. Large. Dynamic programming for the analysis of serial behaviors , 1993 .

[81] L. Varga,et al. Short-term sound stream characterization for reliable, real-time occurrence monitoring of given sound-prints , 2000, 2000 10th Mediterranean Electrotechnical Conference. Information Technology and Electrotechnology for the Mediterranean Countries. Proceedings. MeleCon 2000 (Cat. No.00CH37099).

[82] Barry Vercoe,et al. The Synthetic Performer in The Context of Live Performance , 1984, ICMC.

[83] Avery Wang,et al. An Industrial Strength Audio Search Algorithm , 2003, ISMIR.

[84] Ian H. Witten,et al. The WEKA data mining software: an update , 2009, SIGKDD Explorations.

[85] Emmanuel Bigand,et al. Seven problems that keep MIR from attracting the interest of cognition and neuroscience , 2013, Journal of Intelligent Information Systems.

[86] Martin A. Riedmiller,et al. A direct adaptive method for faster backpropagation learning: the RPROP algorithm , 1993, IEEE International Conference on Neural Networks.

[87] Ichiro Fujinaga,et al. jMIR and ACE XML: Tools for Performing and Sharing Research in Automatic Music Classification , 2009 .

[88] Richard F. Lyon,et al. History and future of auditory filter models , 2010, Proceedings of 2010 IEEE International Symposium on Circuits and Systems.

[89] A. Lehmann,et al. Tracking Performance Correlates of Changes in Perceived Intensity of Emotion During Different Interpretations of a Chopin Piano Prelude , 2001 .

[90] Emmanuel Bigand. Abstraction of Two Forms of Underlying Structure in a Tonal Melody , 1990 .

[91] Sofia Dahl. The Playing of an Accent – Preliminary Observations from Temporal and Kinematic Analysis of Percussionists , 2000 .

[92] Anssi Klapuri,et al. Robust Multipitch Estimation for the Analysis and Manipulation of Polyphonic Musical Signals , 2000 .

[93] Eric Clarke,et al. Expression and communication in musical performance , 1991 .

[94] Ichiro Fujinaga,et al. jMIR: Tools for Automatic Music Classification , 2009, ICMC.

[95] J. G. Lourens. Detection and Logging Advertisements using its Sound , 1990, IEEE South African Symposium on Communications and Signal Processing.

[96] H. Traunmüller. Analytical expressions for the tonotopic sensory scale , 1990 .

[97] Rafael Ramírez,et al. Performer Identification in Celtic Violin Recordings , 2008, ISMIR.

[98] D. Moelants,et al. Deviations from the resonance theory of tempo induction , 2004 .

[99] George Tzanetakis,et al. MARSYAS-0.2: A Case Study in Implementing Music Information Retrieval Systems , 2008 .

[100] Roberto Dillon. Extracting audio cues in real time to understand musical expressiveness , 2001 .

[101] N. Fletcher. Acoustical correlates of flute performance technique , 1975 .

[102] Mark B. Sandler,et al. Phase-based note onset detection for music signals , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03)..

[103] Bridget Baird,et al. Artificial Intelligence and Music: Implementing an Interactive Computer Performer , 1993 .

[104] Anssi Klapuri,et al. Sound onset detection by applying psychoacoustic knowledge , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).

[105] Renee Timmers. Context-sensitive evaluation of expression , 2001 .

[106] B. Moore. An Introduction to the Psychology of Hearing , 1977 .

[107] Caroline Palmer,et al. Tactile feedback and timing accuracy in piano performance , 2008, Experimental Brain Research.

[108] J C Brown,et al. Feature dependence in the automatic identification of musical woodwind instruments. , 2001, The Journal of the Acoustical Society of America.

[109] B. Repp. Diversity and commonality in music performance: an analysis of timing microstructure in Schumann's "Träumerei". , 1992, The Journal of the Acoustical Society of America.

[110] C. Palmer,et al. Auditory feedback and memory for music performance: Sound evidence for an encoding effect , 2003, Memory & cognition.

[111] Jonathan Foote,et al. Automatic audio segmentation using a measure of audio novelty , 2000, 2000 IEEE International Conference on Multimedia and Expo. ICME2000. Proceedings. Latest Advances in the Fast Changing World of Multimedia (Cat. No.00TH8532).

[112] Jean-Pierre Martens,et al. A novel chroma representation of polyphonic music based on multiple pitch tracking techniques , 2008, ACM Multimedia.

[113] W. Jay Dowling,et al. Scale structure and similarity of melodies. , 1988 .

[114] Ching-Hua Chuan,et al. Polyphonic Audio Key Finding Using the Spiral Array CEG Algorithm , 2005, 2005 IEEE International Conference on Multimedia and Expo.

[115] Cesar Pedraza,et al. Fast parallel audio fingerprinting implementation in reconfigurable hardware and GPUs , 2011, 2011 VII Southern Conference on Programmable Logic (SPL).

[116] Gerhard Widmer,et al. Automatic Recognition of Famous Artists by Machine , 2004, ECAI.

[117] Tom Fawcett,et al. An introduction to ROC analysis , 2006, Pattern Recognit. Lett..

[118] Luke Windsor,et al. The Timing of Grace Notes in Skilled Musical Performance at Different Tempi: A Preliminary Case Study , 2001 .

[119] Douglas Keislar,et al. Content-Based Classification, Search, and Retrieval of Audio , 1996, IEEE Multim..

[120] Kuldip K. Paliwal,et al. A Comparative Study of Filter Bank Spacing for Speech Recognition , 2003 .

[121] J. J. Burred,et al. A HIERARCHICAL APPROACH TO AUTOMATIC MUSICAL GENRE CLASSIFICATION , 2003 .

[122] Beth Logan,et al. A music similarity function based on signal analysis , 2001, IEEE International Conference on Multimedia and Expo, 2001. ICME 2001..

[123] T. Eerola,et al. Statistical Features and Perceived Similarity of Folk Melodies , 2001 .

[124] Nicola Orio,et al. Alignment of Monophonic and Polyphonic Music to a Score , 2001, ICMC.

[125] S. Haykin,et al. Neural Networks: A Comprehensive Foundation , 1994 .

[126] B H Repp. The effect of tempo on pedal timing in piano performance , 1997, Psychological research.

[127] Hirokazu Kameoka,et al. A Multipitch Analyzer Based on Harmonic Temporal Structured Clustering , 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[128] Jean Laroche,et al. Efficient Tempo and Beat Tracking in Audio Recordings , 2003 .

[129] S. Dixon. ONSET DETECTION REVISITED , 2006 .

[130] Daniel P. W. Ellis,et al. Ground-truth transcriptions of real music from force-aligned MIDI syntheses , 2003, ISMIR.

[131] Anssi Klapuri,et al. Auditory-Model Based Methods for Multiple Fundamental Frequency Estimation , 2006 .

[132] Hideki Kawahara,et al. Multiple period estimation and pitch perception model , 1999, Speech Commun..

[133] Yi-Hsuan Yang,et al. A Regression Approach to Music Emotion Recognition , 2008, IEEE Transactions on Audio, Speech, and Language Processing.

[134] Chang Dong Yoo,et al. Boosted Binary Audio Fingerprint Based on Spectral Subband Moments , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[135] Yoichi Muraoka,et al. Beat Tracking based on Multiple-agent Architecture: A Real-time Beat Tracking System for Audio Signals , 1996 .

[136] Yoichi Muraoka,et al. Real-time beat tracking for drumless audio signals: Chord change detection for musical decisions , 1999, Speech Commun..

[137] J. Sundberg,et al. Perception of just-noticeable time displacement of a tone presented in a metrical sequence at different tempos , 1993 .

[138] Jason D. Vantomme,et al. Score Following by Temporal Pattern , 1995 .

[139] R. Geary. Testing for normality. , 1947, Biometrika.

[140] Efstathios Stamatatos. A Computational Model for Discriminating Music Performers , 2001 .

[141] George Tzanetakis,et al. Polyphonic audio matching and alignment for music retrieval , 2003, 2003 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (IEEE Cat. No.03TH8684).

[142] Roger B. Dannenberg,et al. New Techniques for Enhanced Quality of Computer Accompaniment , 1988, ICMC.

[143] Chris Chafe,et al. The shape of an instant: measuring and modeling perceptual attack time with probability density functions (if a tree falls in the forest, when did 57 people hear it make a sound?) , 2008 .

[144] David S. Watson,et al. A Machine Learning Approach to Musical Style Recognition , 1997, ICMC.

[145] Miller Puckette,et al. Score Following in Practice , 1992, ICMC.

[146] C. Krumhansl. Memory for musical surface , 1991, Memory & cognition.

[147] Rosalee K. Meyer,et al. Conceptual and Motor Learning in Music Performance , 2000, Psychological science.

[148] Xavier Rodet,et al. Improving polyphonic and poly-instrumental music to score alignment , 2003, ISMIR.

[149] Timothy M. Walker. Instrumental differences in characteristics of expressive musical performance , 2004 .

[150] Jean-Pierre Martens,et al. A comparison of human and automatic musical genre classification , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[151] Giovanni De Poli,et al. Sense in expressive music performance: Data acquisition, computational studies, and models , 2008 .

[152] Yoichi Muraoka,et al. Rhythm Tracking Using Multiple Hypotheses , 1994, ICMC.

[153] Jeroen Breebaart,et al. Features for audio and music classification , 2003, ISMIR.

[154] S. Chiba,et al. Dynamic programming algorithm optimization for spoken word recognition , 1978 .

[155] Algorithms to measure audio programme loudness and true-peak audio level , 2011 .

[156] Homer H. Chen,et al. Music emotion recognition: the role of individuality , 2007, HCM '07.

[157] Mark B. Sandler,et al. Classification of audio signals using statistical features on time and wavelet transform domains , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).

[158] Mark B. Sandler,et al. Sonic visualiser: an open source application for viewing, analysing, and annotating music audio files , 2010, ACM Multimedia.

[159] Jerry D. Gibson,et al. Digital coding of waveforms: Principles and applications to speech and video , 1985, Proceedings of the IEEE.

[160] Geoffroy Peeters,et al. Large-Scale Study of Chord Estimation Algorithms Based on Chroma Representation and HMM , 2007, 2007 International Workshop on Content-Based Multimedia Indexing.

[161] Bruno H. Repp,et al. Pedal Timing and Tempo in Expressive Piano Performance: A Preliminary Investigation , 1996 .

[162] A.P. Klapuri,et al. A perceptually motivated multiple-F0 estimation method , 2005, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2005..

[163] Arshia Cont. Realtime Audio to Score Alignment for Polyphonic Music Instruments, using Sparse Non-Negative Constraints and Hierarchical HMMS , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.

[164] Bruno H. Repp. Quantitative Effects of Global Tempo on Expressive Timing in Music Performance: Some Perceptual Evidence , 1995 .

[165] Christian Schörkhuber. CONSTANT-Q TRANSFORM TOOLBOX FOR MUSIC PROCESSING , 2010 .

[166] Stephen Cranefield,et al. A Study on Feature Analysis for Musical Instrument Classification , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[167] Nicola Orio,et al. Score Following Using Spectral Analysis and Hidden Markov Models , 2001, ICMC.

[168] Alexander Lerch,et al. Software based extraction of objective parameters from music performances , 2009 .

[169] R. Parncutt. Accents and expression in piano performance , 2003 .

[170] Anssi Klapuri,et al. Automatic Transcription of Melody, Bass Line, and Chords in Polyphonic Music , 2008, Computer Music Journal.

[171] Gaël Richard,et al. Musical instrument recognition on solo performances , 2004, 2004 12th European Signal Processing Conference.

[172] Andrzej Czyzewski,et al. Representing Musical Instrument Sounds for Their Automatic Classification , 2001 .

[173] Peter Desain,et al. Does expressive timing in music performance scale proportionally with tempo? , 1994 .

[174] Eleni Lapidaki,et al. Stability of Tempo Perception in Music Listening , 2000 .

[175] Michael J. Carey,et al. A comparison of features for speech, music discrimination , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).

[176] B H Repp,et al. Patterns of expressive timing in performances of a Beethoven minuet by nineteen famous pianists. , 1988, The Journal of the Acoustical Society of America.

[177] Simon Dixon,et al. A Beat Tracking System for Audio Signals , 1999 .

[178] G. H. Wakefield,et al. To catch a chorus: using chroma-based representations for audio thumbnailing , 2001, Proceedings of the 2001 IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics (Cat. No.01TH8575).

[179] Guerino Mazzola,et al. Analyzing Musical Structure and Performance : A Statistical Approach , 1999 .

[180] Klaus V. Toyka,et al. Music, motor control and the brain , 2006 .

[181] Kelly Fitz,et al. Algorithms for computing the time-corrected instantaneous frequency (reassigned) spectrogram, with applications. , 2004, The Journal of the Acoustical Society of America.

[182] John A. Sloboda,et al. The performance of music , 1986 .

[183] M. Schroeder. Period histogram and product spectrum: new methods for fundamental-frequency measurement. , 1968, The Journal of the Acoustical Society of America.

[184] George Tzanetakis,et al. Musical genre classification of audio signals , 2002, IEEE Trans. Speech Audio Process..

[185] T. W. Parsons. Separation of speech from interfering speech by means of harmonic selection , 1976 .

[186] Yoichi Muraoka,et al. Musical understanding at the beat level: real-time beat tracking for audio signals , 1998 .

[187] Gaël Richard,et al. Methodology and Tools for the evaluation of automatic onset detection algorithms in music , 2004, ISMIR.

[188] Scott G. Norcross,et al. Objective Measures of Loudness , 2003 .

[189] Jonathan Foote,et al. Visualizing Musical Structure and Rhythm via Self-Similarity , 2001, ICMC.

[190] Jeff A. Bilmes,et al. A novel representation for rhythmic structure , 1997 .

[191] Seungmin Rho,et al. Music emotion classification and context-based music recommendation , 2010, Multimedia Tools and Applications.

[192] J C Brown. Computer identification of musical instruments using pattern recognition with cepstral coefficients as features. , 1999, The Journal of the Acoustical Society of America.

[193] Roger B. Dannenberg. The Interpretation of MIDI Velocity , 2006, ICMC.

[194] Mark B. Sandler,et al. The Sonic Visualiser: A Visualisation Platform for Semantic Descriptors from Musical Signals , 2006, ISMIR.

[195] Pedro J. Moreno,et al. A Study of Musical Instrument Classification Using Gaussian Mixture Models and Support Vector Machines , 1999 .

[196] Miller Puckette,et al. Score Following Using the Sung Voice , 1995, ICMC.

[197] C.E. Shannon,et al. Communication in the Presence of Noise , 1949, Proceedings of the IRE.

[198] Judith C. Brown. Calculation of a constant Q spectral transform , 1991 .

[199] Pedro Cano,et al. Score-Performance Matching Using HMMs , 1999, ICMC.

[200] Frederick Dorian. The History of Music in Performance: The Art of Musical Interpretation from the Renaissance to Our Day , 1942 .

[201] J. Kantor-Martynuska,et al. Emotion-relevant characteristics of temperament and the perceived magnitude of tempo and loudness of music , 2006 .

[202] Christopher Raphael,et al. A Probabilistic Expert System for Automatic Musical Accompaniment , 2001 .

[203] Geoffroy Peeters. Time variable Tempo Detection and beat Marking , 2005, ICMC.

[204] B. Moore,et al. Suggested formulae for calculating auditory-filter bandwidths and excitation patterns. , 1983, The Journal of the Acoustical Society of America.

[205] Manfred Clynes,et al. Music as Time's Measure , 1986 .

[206] Robert O. Gjerdingen,et al. Scanning the Dial: The Rapid Recognition of Music Genres , 2008 .

[207] S. Shapiro,et al. An Analysis of Variance Test for Normality (Complete Samples) , 1965 .

[208] Patrick Susini,et al. The Timbre Toolbox: extracting audio descriptors from musical signals. , 2011, The Journal of the Acoustical Society of America.

[209] Hermann Ney,et al. Computing Mel-frequency cepstral coefficients on the power spectrum , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).

[210] Ton Kalker,et al. Speed-change resistant audio fingerprinting using auto-correlation , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03)..

[211] Mohan S. Kankanhalli,et al. Music Key Detection for Musical Audio , 2005, 11th International Multimedia Modelling Conference.

[212] D. Temperley. The Tonal Properties of Pitch-Class Sets: Tonal Implication, Tonal Ambiguity, and Tonalness , 2008 .

[213] Matti Karjalainen,et al. A computationally efficient multipitch analysis model , 2000, IEEE Trans. Speech Audio Process..

[214] Keikichi Hirose,et al. Automatic alignment of a musical score to performed music , 2001 .

[215] Renee Timmers,et al. Perception of music performance on historical and modern commercial recordings. , 2007, The Journal of the Acoustical Society of America.

[216] D. D. Greenwood. Critical Bandwidth and the Frequency Coordinates of the Basilar Membrane , 1961 .

[217] Ichiro Fujinaga,et al. Feature Selection Pitfalls and Music Classification , 2006, ISMIR.

[218] Peter Q. Pfordresher,et al. Effects of delayed auditory feedback on timing of music performance , 2002, Psychological research.

[219] Sacha Jennifer van Albada,et al. Transformation of arbitrary distributions to the normal distribution with application to EEG test–retest reliability , 2007, Journal of Neuroscience Methods.

[220] Björn Schuller,et al. Opensmile: the munich versatile and fast open-source audio feature extractor , 2010, ACM Multimedia.

[221] Julius O. Smith,et al. Techniques for Note Identification in Polyphonic Music , 1985, ICMC.

[222] Roberto Bresin,et al. Measurement and reproduction accuracy of computer-controlled grand pianos. , 2003, The Journal of the Acoustical Society of America.

[223] Hank Heijink,et al. The Influence of Musical Context on Tempo Rubato , 2000 .

[224] C. Palmer. Music performance. , 1997, Annual review of psychology.

[225] Ernst Terhardt,et al. Calculating virtual pitch , 1979, Hearing Research.

[226] W Goebl,et al. Melody lead in piano performance: expressive device or artifact? , 2001, The Journal of the Acoustical Society of America.

[227] Karin Dressler,et al. Tuning Frequency Estimation Using Circular Statistics , 2007, ISMIR.

[228] C. Drake,et al. Skill acquisition in music performance: relations between planning and temporal control , 2000, Cognition.

[229] Xavier Serra,et al. A system for sound analysis/transformation/synthesis based on a deterministic plus stochastic decomposition , 1989 .

[230] A. Oxenham. The Perception of Musical Tones , 2013 .

[231] Hank Heijink,et al. Robust Score-Performance Matching: Taking Advantage of Structural Information , 1997, ICMC.

[232] Grigorios Tsoumakas,et al. Multi-Label Classification of Music into Emotions , 2008, ISMIR.

[233] Hagen Soltau,et al. Recognition of music types , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).

[234] W. A. Munson,et al. Loudness, Its Definition, Measurement and Calculation , 2004 .

[235] Caroline Palmer,et al. Temporal and Motor Transfer in Music Performance , 2003 .

[236] E. Schellenberg,et al. Effects of Musical Tempo and Mode on Arousal, Mood, and Spatial Abilities , 2002 .

[237] Bob L. Sturm. On the evaluation of music genre recognition systems , 2013 .

[238] Daniel P. W. Ellis. Extracting information from music audio , 2006, CACM.

[239] Yi-Hsuan Yang,et al. Music emotion classification: a fuzzy approach , 2006, MM '06.

[240] L. H. Shaffer,et al. Timing in Solo and Duet Piano Performances , 1984 .

[241] Thomas Sikora,et al. On the robustness of audio features for musical instrument classification , 2008, 2008 16th European Signal Processing Conference.

[242] H. Sebastian Seung,et al. Algorithms for Non-negative Matrix Factorization , 2000, NIPS.

[243] J. M. Whittaker. The “Fourier” Theory of the Cardinal Function , 1928 .

[244] Les E. Atlas,et al. Modulation frequency features for audio fingerprinting , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[245] Ian Kaminskyj,et al. Enhanced automatic source identification of monophonic musical instrument sounds , 1996, 1996 Australian New Zealand Conference on Intelligent Information Systems. Proceedings. ANZIIS 96.

[246] E. Zwicker,et al. Analytical expressions for critical‐band rate and critical bandwidth as a function of frequency , 1980 .

[247] Bruno H. Repp. On Determining the Basic Tempo of an Expressive Music Performance , 1994 .

[248] L. Hofmann-Engl. RHYTHMIC SIMILARITY: A THEORETICAL AND EMPIRICAL APPROACH , 2002 .

[249] Clément Farabet,et al. Torch7: A Matlab-like Environment for Machine Learning , 2011, NIPS 2011.

[250] C. Krumhansl. A Perceptual Analysis of Mozart's Piano Sonata K. 282: Segmentation, Tension, and Musical Ideas , 1996 .

[251] John Rink,et al. Musical Performance: A Guide to Understanding , 2002 .

[252] Ernst Terhardt. THE SPINC FUNCTION FOR SCALING OF FREQUENCY IN AUDITORY MODELS , 1992 .

[253] E. Terhardt,et al. Algorithm for extraction of pitch and pitch salience from complex tonal signals , 1982 .

[254] C. Harte,et al. Detecting harmonic change in musical audio , 2006, AMCMM '06.

[255] Alexander Lerch,et al. FEAPI: A low level feature extraction plugin API , 2005 .

[256] Robin C. Laney,et al. Modelling the similarity of pitch collections with expectation tensors , 2011 .

[257] B. Atal,et al. Optimizing digital speech coders by exploiting masking properties of the human ear , 1978 .

[258] Antonio Camurri,et al. Listeners' emotional engagement with performances of a Scriabin étude: an explorative case study , 2006 .

[259] Lie Lu,et al. Automatic mood detection and tracking of music audio signals , 2006, IEEE Transactions on Audio, Speech, and Language Processing.

[260] Özgür Izmirli,et al. Template Based Key Finding from audio , 2005, ICMC.

[261] Meinard Müller,et al. An Efficient Multiscale Approach to Audio Synchronization , 2006, ISMIR.

[262] George Tzanetakis,et al. Stereo Panning Features for Classifying Recording Production Style , 2007, ISMIR.

[263] S. S. Stevens,et al. The Relation of Pitch to Frequency: A Revised Scale , 1940 .

[264] E. Batlle,et al. Automatic Song Identification in Noisy Broadcast Audio , 2002 .

[265] Pedro Cano,et al. Audio Fingerprinting: Concepts And Applications , 2005, Computational Intelligence for Modelling and Prediction.

[266] Chandrika Kamath,et al. Feature selection in scientific applications , 2004, KDD.

[267] S. Dixon,et al. PERCEPTUAL SMOOTHNESS OF TEMPO IN EXPRESSIVELY PERFORMED MUSIC , 2006 .

[268] Meinard Müller,et al. Score-PCM Music Synchronization Based on Extracted Score Parameters , 2004, CMMR.

[269] Klaus Obermayer,et al. A new method for tracking modulations in tonal music in audio data format , 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium.

[270] Xavier Amatriain. CLAM: A Framework for Audio and Music Application Development , 2007, IEEE Software.

[271] Antti Eronen,et al. Comparison of features for musical instrument recognition , 2001, Proceedings of the 2001 IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics (Cat. No.01TH8575).

[272] G. Von Bismarck,et al. Sharpness as an attribute of the timbre of steady sounds , 1974 .

[273] S. Mallat. A wavelet tour of signal processing , 1998 .

[274] Gaël Richard,et al. Temporal Integration for Audio Classification With Application to Musical Instrument Classification , 2009, IEEE Transactions on Audio, Speech, and Language Processing.

[275] Jan Larsen,et al. Decision time horizon for music genre classification using short time features , 2004, 2004 12th European Signal Processing Conference.

[276] Ning Hu,et al. Polyphonic Audio Matching for Score Following and Intelligent Audio Editors , 2003, ICMC.

[277] Reinhard Kopiez,et al. REALTIME ANALYSIS OF DYNAMIC SHAPING , 2000 .

[278] J. Tukey,et al. An algorithm for the machine calculation of complex Fourier series , 1965 .

[279] B. Repp. Patterns of note onset asynchronies in expressive piano performance. , 1996, The Journal of the Acoustical Society of America.

[280] Matti Karjalainen,et al. Multi-pitch and periodicity analysis model for sound separation and auditory scene analysis , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).

[281] Roger A. Kendall,et al. The Communication of Musical Expression , 1990 .

[282] R. J. Siegel,et al. A replication of the mel scale of pitch. , 1965, The American journal of psychology.

[283] Miller Puckette,et al. Synthetic Rehearsal: Training the Synthetic Performer , 1985, ICMC.

[284] F. Itakura,et al. Minimum prediction residual principle applied to speech recognition , 1975 .

[285] Peter Desain,et al. Rhythm Quantization for Transcription , 2000, Computer Music Journal.

[286] Axel Röbel,et al. Signal decomposition by means of classification of spectral peaks , 2004, ICMC.

[287] L. V. Immerseel,et al. Digital implementation of linear gammatone filters: Comparison of design methods , 2003 .

[288] S. Dixon,et al. PINPOINTING THE BEAT: TAPPING TO EXPRESSIVE PERFORMANCES , 2002 .

[289] G. Widmer,et al. ON THE EVALUATION OF PERCEPTUAL SIMILARITY MEASURES FOR MUSIC , 2003 .

[290] Mathieu Lagrange,et al. Sinusoidal Parameter Extraction and Component Selection in a Non Stationary Model , 2002 .

[291] K. Scherer,et al. Emotions evoked by the sound of music: characterization, classification, and measurement. , 2008, Emotion.

[292] Thippur V. Sreenivas,et al. Music instrument recognition: from isolated notes to solo phrases , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[293] Hajime Kobayashi,et al. Weighted autocorrelation for pitch extraction of noisy speech , 2001, IEEE Trans. Speech Audio Process..

[294] Joseph Berkson. Tests of significance considered as evidence , 2003 .

[295] Jean-Pierre Martens,et al. AUDIO CHORD EXTRACTION USING A PROBABILISTIC MODEL , 2009 .

[296] I. Stravinsky. A causal algorithm for beat-tracking , 2002 .

[297] J. Stephen Downie,et al. Exploring Mood Metadata: Relationships with Genre, Artist and Usage Metadata , 2007, ISMIR.

[298] Emery Schubert. Update of the Hevner Adjective Checklist , 2003, Perceptual and motor skills.

[299] Xavier Rodet,et al. Signal Characterization in terms of Sinusoidal and Non-Sinusoidal Components , 1998 .

[300] T. W. Anderson,et al. Asymptotic Theory of Certain "Goodness of Fit" Criteria Based on Stochastic Processes , 1952 .

[301] Pau Arumí,et al. CLAM, yet another library for audio and music processing? , 2002, OOPSLA '02.

[302] J. Weil. AN HMM-BASED AUDIO CHORD DETECTION SYSTEM ATTENUATING THE MAIN MELODY , 2008 .

[303] P. P. Vaidyanathan,et al. The Theory of Linear Prediction , 2008, Synthesis Lectures on Signal Processing.

[304] zplane. de. ON THE EVALUATION OF AUTOMATIC ONSET TRACKING SYSTEMS , 2005 .

[305] Simon Dixon. A Dynamic Modelling Approach to Music Recognition , 1996, ICMC.

[306] K. MacDorman,et al. Automatic Emotion Prediction of Song Excerpts: Index Construction, Algorithm Design, and Empirical Comparison , 2007 .

[307] B H Repp. A microcosm of musical expression. III. Contributions of timing and dynamics to the aesthetic impression of pianists' performances of the initial measures of Chopin's Etude in E major. , 1999, The Journal of the Acoustical Society of America.

[308] T. Sikora,et al. On the Use of Auditory Representations for Sparsity-Based Sound Source Separation , 2005, 2005 5th International Conference on Information Communications & Signal Processing.

[309] Fabio Vignoli,et al. Digital Music Interaction Concepts: A User Study , 2004, ISMIR.

[310] E. de Boer,et al. On cochlear encoding: potentialities and limitations of the reverse-correlation technique. , 1978, The Journal of the Acoustical Society of America.

[311] P. Kleinginna,et al. A categorized list of emotion definitions, with suggestions for a consensual definition , 1981 .

[312] Eric D. Scheirer,et al. Tempo and beat analysis of acoustic musical signals. , 1998, The Journal of the Acoustical Society of America.

[313] R. Shepard. Circularity in Judgments of Relative Pitch , 1964 .

[314] W. Dowling. Emotion and Meaning in Music , 2008 .

[315] H. D. Lüke. Signalübertragung: Grundlagen der digitalen und analogen Nachrichtenübertragungssysteme , 1995 .

[316] C.-C. Jay Kuo,et al. Hierarchical system for content-based audio classification and retrieval , 1998, Other Conferences.

[317] A. Oppenheim,et al. Computation of spectra with unequal resolution using the fast Fourier transform , 1971 .

[318] Jamie Bullock,et al. Libxtract: a Lightweight Library for audio Feature Extraction , 2007, ICMC.

[319] Alexander Lerch. On the Requirement of Automatic Tuning Frequency Estimation , 2006, ISMIR.

[320] John William Gordon. Perception of attack transients in musical tones , 1984 .

[321] R. Meddis,et al. A unitary model of pitch perception. , 1997, The Journal of the Acoustical Society of America.

[322] Shingo Uchihashi,et al. The beat spectrum: a new approach to rhythm analysis , 2001, IEEE International Conference on Multimedia and Expo, 2001. ICME 2001..

[323] Peter Desain,et al. On tempo tracking: Tempogram Representation and Kalman filtering , 2000, ICMC.

[324] Mark R. Every,et al. Separation of musical sources and structure from single-channel polyphonic recordings , 2006 .

[325] Alexander Lerch,et al. Hierarchical Automatic Audio Signal Classification , 2004 .

[326] W. Andrew Schloss,et al. On the automatic transcription of percussive music , 1985 .

[327] Simon Dixon,et al. A Lightweight Multi-agent Musical Beat Tracking System , 2000, PRICAI.

[328] Tao Li,et al. Content-based music similarity search and emotion detection , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[329] Mark B. Sandler,et al. A tutorial on onset detection in music signals , 2005, IEEE Transactions on Speech and Audio Processing.

[330] R Meddis,et al. Modeling the identification of concurrent vowels with different fundamental frequencies. , 1992, The Journal of the Acoustical Society of America.

[331] Ira J. Hirsh,et al. Auditory Perception of Temporal Order , 1959 .

[332] Meinard Müller,et al. Information retrieval for music and motion , 2007 .

[333] R. Timmers. Communication of (e)motion through performance: Two case studies , 2007 .

[334] Andrew J. Viterbi,et al. Error bounds for convolutional codes and an asymptotically optimum decoding algorithm , 1967, IEEE Trans. Inf. Theory.

[335] Judith C. Brown. Determination of the meter of musical scores by autocorrelation , 1993 .

[336] D. Cox,et al. An Analysis of Transformations , 1964 .

[337] F. Harris. On the use of windows for harmonic analysis with the discrete Fourier transform , 1978, Proceedings of the IEEE.

[338] M. Davies,et al. Complex domain onset detection for musical signals , 2003 .

[339] N. Scaringella,et al. Automatic genre classification of music content: a survey , 2006, IEEE Signal Process. Mag..

[340] Giovanni De Poli,et al. Score-Independent Audio Features for Description of Music Expression , 2008, IEEE Transactions on Audio, Speech, and Language Processing.

[341] E. Chew. Towards a mathematical model of tonality , 2000 .

[342] Leon Cohen,et al. Fitting the Mel scale , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).

[343] Ruth Rasch,et al. Synchronization in performed ensemble music , 1979 .

[344] Steffen Pauws,et al. Musical key extraction from audio , 2004, ISMIR.

[345] Joan Serrà,et al. From Low-Level to High-Level: Comparative Study of Music Similarity Measures , 2009, 2009 11th IEEE International Symposium on Multimedia.

[346] Patrick Flandrin,et al. Improving the readability of time-frequency and time-scale representations by the reassignment method , 1995, IEEE Trans. Signal Process..

[347] M A Pitt,et al. The perceived similarity of auditory polyrhythms , 1987, Perception & psychophysics.

[348] Zhouyu Fu,et al. A Survey of Audio-Based Music Classification and Annotation , 2011, IEEE Transactions on Multimedia.

[349] Nicola Orio,et al. Music Retrieval: A Tutorial and Review , 2006, Found. Trends Inf. Retr..

[350] Gerhard Widmer,et al. Modeling the rational basis of musical expression , 1995 .

[351] Yueting Zhuang,et al. Music information retrieval by detecting mood via computational media aesthetics , 2003, Proceedings IEEE/WIC International Conference on Web Intelligence (WI 2003).

[352] Alan R. Jones,et al. Fast Fourier Transform , 1970, SIGP.

[353] B H Repp,et al. Acoustics, perception, and production of legato articulation on a computer-controlled grand piano. , 1997, The Journal of the Acoustical Society of America.

[354] Julius O. Smith,et al. A flexible sampling-rate conversion method , 1984, ICASSP.

[355] Nicola Dibben,et al. Motivic Structure and the Perception of Similarity , 2001 .

[356] Daniel P. W. Ellis,et al. Speech/music discrimination based on posterior probability features , 1999, EUROSPEECH.

[357] A. Michelson,et al. Fourier's Series , 1898, Nature.

[358] H. Olson. The Measurement of Loudness , 1972 .

[359] Yoram Singer,et al. Learning to Align Polyphonic Music , 2004, ISMIR.

[360] O. Lartillot,et al. A MATLAB TOOLBOX FOR MUSICAL FEATURE EXTRACTION FROM AUDIO , 2007 .

[361] Peter Q Pfordresher,et al. Auditory feedback in music performance: the role of melodic structure and musical skill. , 2005, Journal of experimental psychology. Human perception and performance.

[362] Markus Cremer. A System for Harmonic Analysis of Polyphonic Music , 2004 .

[363] Jacqueline Walker,et al. TIME DOMAIN NOTE AVERAGE ENERGY BASED MUSIC ONSET DETECTION , 2003 .

[364] Beth Logan,et al. Mel Frequency Cepstral Coefficients for Music Modeling , 2000, ISMIR.

[365] George Tzanetakis,et al. MARSYAS: a framework for audio analysis , 1999, Organised Sound.

[366] J. Russell. A circumplex model of affect. , 1980 .

[367] Aaron Williamon,et al. Time-Dependent Characteristics of Performance Evaluation , 2007 .

[368] 岸憲史,et al. The Physics of Musical Instruments (2nd ed.), by Neville H. Fletcher and Thomas D. Rossing, Springer-Verlag, New York, 1998, 756 pp. , 2000 .

[369] François Pachet,et al. A taxonomy of musical genres , 2000, RIAO.

[370] Gerhard Widmer,et al. Playing Mozart by Analogy: Learning Multi-level Timing and Dynamics Strategies , 2003 .

[371] A. Czyzewski,et al. Frequency based criterion for distinguishing tonal and noisy spectral components , 2010 .

[372] Anders Friberg,et al. Measurements and models of musical articulation , 2003 .

[373] Hideki Kawahara,et al. YIN, a fundamental frequency estimator for speech and music. , 2002, The Journal of the Acoustical Society of America.

[374] François Pachet,et al. Music Similarity Measures: What's the use? , 2002, ISMIR.

[375] Ton Kalker,et al. A Highly Robust Audio Fingerprinting System , 2002, ISMIR.

[376] Jonathan Foote,et al. A Similarity Measure for Automatic Audio Classification , 1997 .

[377] M. Ross,et al. Average magnitude difference function pitch extractor , 1974 .

[378] Jürgen Herre,et al. Advanced Audio Identification Using MPEG-7 Content Description , 2001 .

[379] Guy J. Brown,et al. A missing feature approach to instrument identification in polyphonic music , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03)..

[380] Peter Knees,et al. The CoMIRVA Toolkit for Visualizing Music-Related Data , 2007, EuroVis.

[381] Marc Leman,et al. Prediction of Musical Affect Using a Combination of Acoustic Structural Cues , 2005 .

[382] Malcolm Slaney,et al. An Efficient Implementation of the Patterson-Holdsworth Auditory Filter Bank , 1997 .

[383] Emanuele Pollastri,et al. Musical Instrument Timbres Classification with Spectral Features , 2001, 2001 IEEE Fourth Workshop on Multimedia Signal Processing (Cat. No.01TH8564).

[384] Mathieu Lagrange,et al. Estimating the Instantaneous Frequency of Sinusoidal Components Using Phase-Based Methods , 2007 .

[385] Tsuhan Chen,et al. Audio feature extraction and analysis for scene classification , 1997, Proceedings of First Signal Processing Society Workshop on Multimedia Signal Processing.

[386] Rebecca B. MacLeod,et al. Influences of Dynamic Level and Pitch Register on the Vibrato Rates and Widths of Violin and Viola Players , 2008 .

[387] R. Jackendoff,et al. A Generative Theory of Tonal Music , 1985 .

[388] Eitan Ornoy,et al. An empirical study of intonation in performances of J. S. Bach's Sarabandes: temperament, 'melodic charge' and 'melodic intonation' , 2007 .

[389] Peter Desain,et al. A (De)Composable Theory of Rhythm Perception , 1992 .

[390] Matthew J. Dovey. Analysis of Rachmaninoff's Piano Performances Using Inductive Logic Programming (Extended Abstract) , 1995, ECML.

[391] Diana Deutsch,et al. Pitch circularity from tones comprising full harmonic series. , 2008, The Journal of the Acoustical Society of America.

[392] Roger B. Dannenberg,et al. A Stochastic Method of Tracking a Vocal Performer , 1997, ICMC.

[393] C. Palmer. Mapping musical thought to musical performance. , 1989, Journal of experimental psychology. Human perception and performance.

[394] X. Rodet. EFFICIENT SPECTRAL ENVELOPE ESTIMATION AND ITS APPLICATION TO PITCH SHIFTING AND ENVELOPE PRESERVATION , 2005 .

[395] Meinard Müller,et al. Towards an Efficient Algorithm for Automatic Score-to-Audio Synchronization , 2004, ISMIR.

[396] B. Repp. The Art of Inaccuracy: Why Pianists' Errors Are Difficult to Hear , 1996 .

[397] A. Aertsen,et al. Spectro-temporal receptive fields of auditory neurons in the grassfrog , 1980, Biological Cybernetics.

[398] Eric D. Scheirer,et al. Extracting Expressive Performance Information from Recorded Music , 1995 .

[399] Kunio Kashino,et al. Very quick audio searching: introducing global pruning to the Time-Series Active Search , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).

[400] a Christine,et al. Virtual Virtuosity Studies in Automatic Music Performance , 2000 .

[401] Christopher Raphael,et al. Automatic Segmentation of Acoustic Musical Signals Using Hidden Markov Models , 1999, IEEE Trans. Pattern Anal. Mach. Intell..

[402] Malcolm Slaney,et al. A Unified System for Chord Transcription and Key Extraction Using Hidden Markov Models , 2007, ISMIR.

[403] D. Kirovski,et al. Fingerprinting and forensic analysis of multimedia , 2004, MULTIMEDIA '04.

[404] Hank Heijink,et al. Matching Scores and Performances , 2001 .

[405] Roger B. Dannenberg,et al. Tracking Musical Beats in Real Time , 1990, ICMC.

[406] B. Repp. Effects of Auditory Feedback Deprivation on Expressive Piano Performance , 1999 .

[407] Jeffrey J. Scott,et al. MUSIC EMOTION RECOGNITION: A STATE OF THE ART REVIEW , 2010 .

[408] A. de Cheveigné,et al. The dependency of timbre on fundamental frequency. , 2003, The Journal of the Acoustical Society of America.

[409] John C. Platt,et al. Extracting noise-robust features from audio data , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[410] Guy J. Brown,et al. Instrument recognition in accompanied sonatas and concertos , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[411] Edward W. Large,et al. Perceiving temporal regularity in music , 2002, Cogn. Sci..

[412] Nicola Orio,et al. Score Following: State of the Art and New Developments , 2003, NIME.

[413] John Saunders,et al. Real-time discrimination of broadcast speech/music , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.

[414] K. Scherer. What are emotions? And how can they be measured? , 2005 .

[415] Simon Dixon,et al. Beat Tracking with Musical Knowledge , 2000, ECAI.

[416] S McAdams,et al. Similarity, Invariance, and Musical Variation , 2001, Annals of the New York Academy of Sciences.

[417] Christopher Raphael,et al. A Hybrid Graphical Model for Aligning Polyphonic Audio with Musical Scores , 2004, ISMIR.

[418] Robert M. Haralick,et al. Feature normalization and likelihood-based similarity measures for image retrieval , 2001, Pattern Recognit. Lett..

[419] Ichiro Fujinaga,et al. Study of Expression and Individuality in Music Performance Using Normative Data Derived from MIDI Recordings of Piano Music , 2007 .

[420] P. Smaragdis,et al. Non-negative matrix factorization for polyphonic music transcription , 2003, 2003 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (IEEE Cat. No.03TH8684).