Separation of musical sources and structure from single-channel polyphonic recordings

This thesis deals principally with the separation of pitched sources from single-channel polyphonic musical recordings. The aim is to extract from a mixture a set of pitched instruments or sources, where each source contains a set of similar-sounding events or notes, and each note is seen as comprising partial, transient and noise content. The work also has implications for separating non-pitched or percussive sounds from recordings and, more generally, for unsupervised clustering of a list of detected audio events in a recording into a meaningful set of source classes. The alignment of a symbolic score/MIDI representation with the recording constitutes a pre-processing stage. There are three main areas of contribution. Firstly, harmonic tracking algorithms and spectral filtering techniques are designed for removing harmonics from the mixture, with particular attention paid to harmonics that overlap in frequency. Secondly, studies are presented on separating transient attacks from recordings, both when they are distinguishable from other transients and when they overlap with them in time. This part also includes a method based on the proposition that the behaviours of the harmonic and noise components of a note are partially correlated; this is used to share the noise component of a mixture of pitched notes between the interfering sources. Thirdly, unsupervised clustering is applied to the task of grouping a set of separated notes from the recording into sources, where notes belonging to the same source ideally have similar features or attributes. Issues relating to feature computation, feature selection, dimensionality and dependence on a symbolic music representation are explored. Applications of this work exist in audio spatialisation, audio restoration, music content description, effects processing and elsewhere.
