Signal Processing Methods for Audio Classification and Music Content Analysis

Signal processing methods for audio classification and music content analysis are developed in this thesis. Audio classification is here understood as the process of assigning a discrete category label to an unknown recording. Two specific audio classification problems are considered: musical instrument recognition and context recognition. In the former, the system classifies an audio recording according to the instrument (e.g. violin, flute, or piano) that produced the sound. In the latter, the system classifies an environment, such as a car, restaurant, or library, based on its ambient audio background. In the field of music content analysis, methods are presented for music meter analysis and chorus detection. Meter analysis estimates the regular pattern of strong and weak beats in a piece of music. The goal of chorus detection is to locate the chorus segment, which is often the catchiest and most memorable part of a song. These are among the most important and readily commercially applicable content attributes that can be automatically analyzed from music signals.

For audio classification, several features and classification methods are proposed and evaluated. In musical instrument recognition, we consider methods to improve the performance of a baseline audio classification system that uses mel-frequency cepstral coefficients and their first derivatives as features, and continuous-density hidden Markov models (HMMs) for modeling the feature distributions. Two improvements to this baseline system are proposed. First, the features are transformed to a basis with maximal statistical independence using independent component analysis. Second, discriminative training is shown to further improve the recognition accuracy of the system.

For musical meter analysis, three methods are proposed. The first performs meter analysis jointly at three different time scales: at the temporally atomic tatum pulse level, at the tactus pulse level, which corresponds to the tempo of a piece, and at the musical measure level. The features obtained from an accent feature analyzer and a bank of comb filter resonators are processed by a novel probabilistic model which rep-

[187]  C.-C. Jay Kuo,et al.  Musical beat tracking via Kalman filtering and noisy measurements selection , 2008, 2008 IEEE International Symposium on Circuits and Systems.