Detection of speech and music based on spectral tracking
[1] Thomas Sikora,et al. How Efficient is MPEG-7 for General Sound Recognition? , 2004 .
[2] Fabrice Plante,et al. A pitch extraction reference database , 1995, EUROSPEECH.
[3] Guy J. Brown,et al. Computational auditory scene analysis , 1994, Comput. Speech Lang..
[4] Xavier Rodet,et al. Tracking of partials for additive sound synthesis using hidden Markov models , 1993, 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[5] J. E. Jackson. A User's Guide to Principal Components , 1991 .
[6] Carol Y. Espy-Wilson,et al. Knowledge-based analysis of speech mixed with sporadic environmental sounds , 1998 .
[7] Takao Kobayashi,et al. Robust pitch estimation with harmonics enhancement in noisy environments based on instantaneous frequency , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[8] Kathy Melih,et al. Source segmentation for structured audio , 2000, 2000 IEEE International Conference on Multimedia and Expo. ICME2000. Proceedings. Latest Advances in the Fast Changing World of Multimedia (Cat. No.00TH8532).
[9] Shuichi Itahashi,et al. JNAS: Japanese speech corpus for large vocabulary continuous speech recognition research , 1999 .
[10] Masaaki Honda,et al. Sinusoidal model based on instantaneous frequency attractors , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[11] Ruben Gonzalez,et al. Techniques for Improving the Accuracy of Sinusoidal Tracking , 2005, EuroIMSA.
[12] Katsuhiko Shirai,et al. Discrimination of speech, musical instruments and singing voices using the temporal patterns of sinusoidal segments in audio signals , 2005, INTERSPEECH.
[13] Masataka Goto,et al. RWC Music Database: Music genre database and musical instrument sound database , 2003, ISMIR.
[14] Malcolm Slaney,et al. Construction and evaluation of a robust multifeature speech/music discriminator , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[15] Kari Torkkola,et al. Blind Separation For Audio Signals - Are We There Yet? , 1999 .
[16] Kathy Melih,et al. Audio source type segmentation using a perceptually based representation , 1999, ISSPA '99. Proceedings of the Fifth International Symposium on Signal Processing and its Applications (IEEE Cat. No.99EX359).
[17] Masataka Goto,et al. RWC Music Database: Popular, Classical and Jazz Music Databases , 2002, ISMIR.
[18] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .
[19] Regunathan Radhakrishnan,et al. Comparing MFCC and MPEG-7 audio features for feature extraction, maximum likelihood HMM and entropic prior HMM for sports audio classification , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings (ICASSP '03).
[21] T. Taniguchi,et al. Spectral Frequency Tracking for Classifying Audio Signals , 2006, 2006 IEEE International Symposium on Signal Processing and Information Technology.
[22] Tuomas Virtanen,et al. Sound Source Separation Using Sparse Coding with Temporal Continuity Objective , 2003, ICMC.
[23] Liang Gu,et al. Robust singing detection in speech/music discriminator design , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[24] J. Edward Jackson,et al. A User's Guide to Principal Components. , 1991 .
[25] Thomas F. Quatieri,et al. Speech analysis/synthesis based on a sinusoidal representation , 1986, IEEE Trans. Acoust. Speech Signal Process..
[26] B. Moore. An Introduction to the Psychology of Hearing , 1977 .
[27] John Saunders,et al. Real-time discrimination of broadcast speech/music , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[28] R. Drullman,et al. Temporal envelope and fine structure cues for speech intelligibility , 1994, The Journal of the Acoustical Society of America.
[29] Anssi Klapuri,et al. Separation of harmonic sound sources using sinusoidal modeling , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[30] Mikio Tohyama,et al. Signal Representation Including Waveform Envelope by Clustered Line-Spectrum Modeling , 2003 .
[31] Hitoshi Isahara,et al. Spontaneous Speech Corpus of Japanese , 2000, LREC.
[32] B. Moore. An introduction to the psychology of hearing, 3rd ed. , 1989 .