Automatic singer identification in polyphonic music.

[1]  Beth Logan,et al.  Mel Frequency Cepstral Coefficients for Music Modeling , 2000, ISMIR.

[2]  Khaled H. Hamed,et al.  Time-frequency analysis , 2003 .

[3]  Dennis H. Klatt,et al.  Software for a cascade/parallel formant synthesizer , 1980 .

[4]  Alan V. Oppenheim,et al.  Discrete-Time Signal Processing , 1989 .

[5]  Gunnar Fant,et al.  The voice source in connected speech , 1997, Speech Commun..

[6]  Ingo R. Titze,et al.  Principles of voice production , 1994 .

[7]  G. H. Wakefield,et al.  To catch a chorus: using chroma-based representations for audio thumbnailing , 2001, Proceedings of the 2001 IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics (Cat. No.01TH8575).

[8]  Xavier Rodet,et al.  The CHANT Project: From the Synthesis of the Singing Voice to Synthesis in General , 1984 .

[9]  Sun-Yuan Kung,et al.  Channel distortion compensation based on the measurement of handset's frequency responses , 2001, Proceedings of 2001 International Symposium on Intelligent Multimedia, Video and Speech Processing. ISIMP 2001 (IEEE Cat. No.01EX489).

[10]  J. Sundberg Perceptual aspects of singing. , 1994, Journal of voice : official journal of the Voice Foundation.

[11]  Vladimir N. Vapnik,et al.  The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.

[12]  S. Kay Fundamentals of statistical signal processing: estimation theory , 1993 .

[13]  Christopher J. C. Burges,et al.  A Tutorial on Support Vector Machines for Pattern Recognition , 1998, Data Mining and Knowledge Discovery.

[14]  Gregory H. Wakefield,et al.  Signal Analysis of the Singing Voice: Low-Order Representations of Singer Identity , 2000, ICMC.

[15]  J.P. Campbell Jr.,et al.  Speaker recognition: a tutorial , 1997, Proc. IEEE.

[16]  Daniel Patrick Whittlesey Ellis,et al.  Prediction-driven computational auditory scene analysis , 1996 .

[17]  Barry Vercoe,et al.  Music-listening systems , 2000 .

[18]  M. Mellody,et al.  Modal distribution analysis, synthesis, and perception of a soprano's sung vowels. , 2001, Journal of voice : official journal of the Voice Foundation.

[19]  T. Moon,et al.  Mathematical Methods and Algorithms for Signal Processing , 1999 .

[20]  Olga Veksler,et al.  Fast Approximate Energy Minimization via Graph Cuts , 2001, IEEE Trans. Pattern Anal. Mach. Intell..

[21]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .

[22]  Fernando Pereira,et al.  MPEG-7: A standardised description of audiovisual content , 2000, Signal Process. Image Commun..

[23]  Petre Stoica,et al.  Introduction to spectral analysis , 1997 .

[24]  Douglas A. Reynolds,et al.  Speaker Verification Using Adapted Gaussian Mixture Models , 2000, Digit. Signal Process..

[25]  J. Sundberg,et al.  The Science of the Singing Voice , 1987 .

[26]  R. R. Clarke Transform coding of images , 1985 .

[27]  Steve Lawrence,et al.  Artist detection in music with Minnowmatch , 2001, Neural Networks for Signal Processing XI: Proceedings of the 2001 IEEE Signal Processing Society Workshop (IEEE Cat. No.01TH8584).

[28]  H. Strube Linear prediction on a warped frequency scale , 1980 .

[29]  John E. Dennis,et al.  Numerical methods for unconstrained optimization and nonlinear equations , 1983, Prentice Hall series in computational mathematics.

[30]  Robert P. W. Duin,et al.  Multiclass Linear Dimension Reduction by Weighted Pairwise Fisher Criteria , 2001, IEEE Trans. Pattern Anal. Mach. Intell..

[31]  Myungcheol Lee,et al.  Graph theory for image analysis: an approach based on the shortest spanning tree , 1986 .

[32]  Gregory H. Wakefield,et al.  A high‐resolution time–frequency representation for musical instrument signals , 1996 .

[33]  Gunnar Fant,et al.  Acoustic Theory Of Speech Production , 1960 .

[34]  R. Fisher The use of multiple measurements in taxonomic problems , 1936 .

[35]  George Tzanetakis,et al.  Multifeature audio segmentation for browsing and annotation , 1999, Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452).

[36]  Unto K. Laine,et al.  A comparison of warped and conventional linear predictive coding , 2001, IEEE Trans. Speech Audio Process..

[37]  Douglas Keislar,et al.  Content-Based Classification, Search, and Retrieval of Audio , 1996, IEEE Multim..

[38]  Daniel P. W. Ellis,et al.  Locating singing voice segments within music signals , 2001, Proceedings of the 2001 IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics (Cat. No.01TH8575).

[39]  Sadaoki Furui,et al.  Recent advances in speaker recognition , 1997, Pattern Recognit. Lett..

[40]  Stephen P. Boyd,et al.  Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.

[41]  I. Jolliffe Principal Component Analysis , 2002 .

[42]  Biing-Hwang Juang,et al.  Fundamentals of speech recognition , 1993, Prentice Hall signal processing series.

[43]  Anil K. Jain,et al.  Feature Selection: Evaluation, Application, and Small Sample Performance , 1997, IEEE Trans. Pattern Anal. Mach. Intell..

[44]  Carl W. Helstrom,et al.  Elements of signal detection and estimation , 1994 .

[45]  R. P. Ramachandran,et al.  Robust speaker recognition: a feature-based approach , 1996, IEEE Signal Processing Magazine.

[46]  Hui-Ling Lu,et al.  Glottal source modeling for singing voice synthesis , 2000, ICMC.

[47]  E. Prame Measurements of the vibrato rate of ten singers , 1994 .

[48]  B. Atal,et al.  Speech analysis and synthesis by linear prediction of the speech wave. , 1971, The Journal of the Acoustical Society of America.

[49]  Keith Dana Martin,et al.  Sound-source recognition: a theory and computational model , 1999 .

[50]  H Hermansky,et al.  Perceptual linear predictive (PLP) analysis of speech. , 1990, The Journal of the Acoustical Society of America.

[51]  Douglas A. Reynolds Experimental evaluation of features for robust speaker identification , 1994, IEEE Trans. Speech Audio Process..

[52]  Keinosuke Fukunaga,et al.  Introduction to statistical pattern recognition (2nd ed.) , 1990 .

[53]  Thomas F. Quatieri,et al.  Speech analysis/Synthesis based on a sinusoidal representation , 1986, IEEE Trans. Acoust. Speech Signal Process..

[54]  Youngmoo E. Kim A framework for parametric singing voice analysis/synthesis , 2003, 2003 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (IEEE Cat. No.03TH8684).

[55]  Douglas A. Reynolds,et al.  Robust text-independent speaker identification using Gaussian mixture speaker models , 1995, IEEE Trans. Speech Audio Process..

[56]  Perry R. Cook,et al.  SPASM, a Real-Time Vocal Tract Physical Model Controller; and Singer, the Companion Software Synthesis System , 1993 .

[57]  Stan Davis,et al.  Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Se , 1980 .

[58]  S. W. Beet,et al.  Visual representations of speech signals , 1993 .

[59]  Christopher M. Bishop,et al.  Neural networks for pattern recognition , 1995 .

[60]  P. Boersma Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound , 1993 .