Paper information - Monaural speech organization and segregation

Monaural speech organization and segregation

[1] Ehud Weinstein,et al. Signal enhancement using beamforming and nonstationarity with applications to speech , 2001, IEEE Trans. Signal Process..

[2] Jose C. Principe,et al. Neural and adaptive systems , 2000 .

[3] F. Zeng,et al. Speech recognition with altered spectral distribution of envelope cues. , 1996, The Journal of the Acoustical Society of America.

[4] Jan Van der Spiegel,et al. Acoustic-phonetic features for the automatic classification of stop consonants , 2001, IEEE Trans. Speech Audio Process..

[5] Mitchel Weintraub,et al. A theory and computational model of auditory monaural sound separation , 1985 .

[6] DeLiang Wang,et al. Separation of stop consonants , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).

[7] Guy J. Brown,et al. Computational auditory scene analysis , 1994, Comput. Speech Lang..

[8] John H. L. Hansen,et al. Constrained iterative speech enhancement with application to speech recognition , 1991, IEEE Trans. Signal Process..

[9] Harvey Fletcher,et al. Speech and hearing. , 1930 .

[10] Terrence J. Sejnowski,et al. Blind source separation of more sources than mixtures using overcomplete representations , 1999, IEEE Signal Processing Letters.

[11] A. M. Mimpen,et al. The ear as a frequency analyzer. II. , 1964, The Journal of the Acoustical Society of America.

[12] Jonathan G. Fiscus,et al. DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CD-ROM , 1993 .

[13] Daniel P. W. Ellis,et al. Decoding speech in the presence of other sources , 2005, Speech Commun..

[14] DeLiang Wang,et al. Auditory segmentation based on event detection , 2004, SAPA@INTERSPEECH.

[15] R. Plomp,et al. Effect of reducing slow temporal modulations on speech reception. , 1994, The Journal of the Acoustical Society of America.

[16] Andrew W. Fitzgibbon,et al. An Experimental Comparison of Range Image Segmentation Algorithms , 1996, IEEE Trans. Pattern Anal. Mach. Intell..

[17] Andrew J. Viterbi. Error Bounds for Convolutional Codes and an Asymptotically Optimum Decoding Algorithm , 1967 .

[18] Hideki Kawahara,et al. Dynamic sound stream formation based on continuity of spectral change , 1999, Speech Commun..

[19] Godfrey Dewey,et al. Relativ frequency of English speech sounds , 1923 .

[20] A M Ali,et al. Acoustic-phonetic features for the automatic classification of fricatives. , 2001, The Journal of the Acoustical Society of America.

[21] Martine Turgeon,et al. Rhythmic masking release: contribution of cues for perceptual organization to the cross-spectral fusion of concurrent narrow-band noises. , 2002, The Journal of the Acoustical Society of America.

[22] Vladimir Vapnik,et al. The Nature of Statistical Learning , 1995 .

[23] Guy J. Brown,et al. Computational Auditory Scene Analysis: Principles, Algorithms, and Applications , 2006 .

[24] Mototsugu Abe,et al. Auditory scene analysis based on time-frequency integration of shared FM and AM , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).

[25] Guy J. Brown,et al. A multi-pitch tracking algorithm for noisy speech , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[26] C. Darwin. Perceiving vowels in the presence of another sound: constraints on formant perception. , 1984, The Journal of the Acoustical Society of America.

[27] DeLiang Wang,et al. Monaural speech segregation based on pitch tracking and amplitude modulation , 2002, IEEE Transactions on Neural Networks.

[28] John F. Canny,et al. A Computational Approach to Edge Detection , 1986, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[29] Nathalie Virag,et al. Single channel speech enhancement based on masking properties of the human auditory system , 1999, IEEE Trans. Speech Audio Process..

[30] M. Viberg,et al. Two decades of array signal processing research: the parametric approach , 1996, IEEE Signal Process. Mag..

[31] DeLiang Wang,et al. Model-based sequential organization in cochannel speech , 2006, IEEE Transactions on Audio, Speech, and Language Processing.

[32] J. Pickles. An Introduction to the Physiology of Hearing , 1982 .

[33] DeLiang Wang,et al. Separation of fricatives and affricates , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.

[34] R. Kumaresan,et al. Model-based approach to envelope and positive instantaneous frequency estimation of signals with speech applications , 1999 .

[35] S. Nooteboom,et al. THE PROSODY OF SPEECH: MELODY AND RHYTHM , 2001 .

[36] T. W. Parsons. Separation of speech from interfering speech by means of harmonic selection , 1976 .

[37] Yonghong Yan,et al. The contribution of consonants versus vowels to word recognition in fluent speech , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.

[38] R. Plomp,et al. Effect of temporal envelope smearing on speech reception. , 1994, The Journal of the Acoustical Society of America.

[39] David K. Mellinger,et al. Event formation and separation in musical sound , 1992 .

[40] Anssi Klapuri,et al. Sound onset detection by applying psychoacoustic knowledge , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).

[41] Tony Lindeberg,et al. Scale-Space Theory in Computer Vision , 1993, Lecture Notes in Computer Science.

[42] Masashi Unoki,et al. A method of signal extraction from noisy signal based on auditory scene analysis , 1997, Speech Commun..

[43] Biing-Hwang Juang,et al. Fundamentals of speech recognition , 1993, Prentice Hall signal processing series.

[44] Guy J. Brown,et al. Separation of speech from interfering sounds based on oscillatory correlation , 1999, IEEE Trans. Neural Networks.

[45] John H. L. Hansen,et al. Discrete-Time Processing of Speech Signals , 1993 .

[46] DeLiang Wang,et al. On Ideal Binary Mask As the Computational Goal of Auditory Scene Analysis , 2005, Speech Separation by Humans and Machines.

[47] Aggelos K. Katsaggelos,et al. Sound source separation via computational auditory scene analysis-enhanced beamforming , 2002, Sensor Array and Multichannel Signal Processing Workshop Proceedings, 2002.

[48] Brian R Glasberg,et al. Derivation of auditory filter shapes from notched-noise data , 1990, Hearing Research.

[49] Richard Lippmann,et al. Speech recognition by machines and humans , 1997, Speech Commun..

[50] R. Patterson,et al. Complex Sounds and Auditory Images , 1992 .

[51] Sam T. Roweis,et al. One Microphone Source Separation , 2000, NIPS.

[52] Kuldip K. Paliwal,et al. A speech enhancement method based on Kalman filtering , 1987, ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[53] M C Killion,et al. Revised estimate of minimum audible pressure: where is the "missing 6 dB"? , 1978, The Journal of the Acoustical Society of America.

[54] Brian C. J. Moore. An Introduction to the Psychology of Hearing , 1997 .

[55] DeLiang Wang,et al. Speech segregation based on sound localization , 2001, IJCNN'01. International Joint Conference on Neural Networks. Proceedings (Cat. No.01CH37222).

[56] Guy J. Brown,et al. Techniques for handling convolutional distortion with 'missing data' automatic speech recognition , 2004, Speech Commun..

[57] Hynek Hermansky,et al. Temporal patterns (TRAPs) in ASR of noisy speech , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).

[58] Peter S Chang,et al. Exploration of Behavioral, Physiological, and Computational Approaches to Auditory Scene Analysis , 2004 .

[59] David F. Rosenthal,et al. Computational auditory scene analysis , 1998 .

[60] DeLiang Wang,et al. Auditory Segmentation Based on Onset and Offset Analysis , 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[61] Paul C. Bagshaw,et al. Enhanced pitch tracking and the processing of F0 contours for computer aided intonation teaching , 1993, EUROSPEECH.

[62] C. Darwin. Auditory grouping , 1997, Trends in Cognitive Sciences.

[63] P. N. Denbigh,et al. A sound segregation algorithm for reverberant conditions , 2001, Speech Commun..

[64] DeLiang Wang,et al. Separation of singing voice from music accompaniment for monaural recordings , 2007 .

[65] John Scott Bridle,et al. Probabilistic Interpretation of Feedforward Classification Network Outputs, with Relationships to Statistical Pattern Recognition , 1989, NATO Neurocomputing.

[66] J. Bird. Effects of a difference in fundamental frequency in separating two sentences. , 1997 .

[67] Guy J. Brown,et al. Separation of Speech by Computational Auditory Scene Analysis , 2005 .

[68] Jean Ponce,et al. Computer Vision: A Modern Approach , 2002 .

[69] Rainer Martin,et al. Noise power spectral density estimation based on optimal smoothing and minimum statistics , 2001, IEEE Trans. Speech Audio Process..

[70] Saeed Gazor,et al. An adaptive KLT approach for speech enhancement , 2001, IEEE Trans. Speech Audio Process..

[71] DeLiang Wang,et al. Unvoiced Speech Segregation , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.

[72] Oded Ghitza. Auditory models and human performance in tasks related to speech coding and speech recognition , 1994 .

[73] DeLiang Wang,et al. A schema-based model for phonemic restoration , 2005, Speech Commun..

[74] B C Wheeler,et al. A two-microphone dual delay-line approach for extraction of a speech sound in the presence of multiple interferers. , 2001, The Journal of the Acoustical Society of America.

[75] Steven Greenberg,et al. Robust speech recognition using the modulation spectrogram , 1998, Speech Commun..

[76] Terrence J. Sejnowski,et al. An Information-Maximization Approach to Blind Separation and Blind Deconvolution , 1995, Neural Computation.

[77] Wolfgang Hess,et al. Pitch Determination of Speech Signals , 1983 .

[78] R. Carlyon,et al. Comparing the fundamental frequencies of resolved and unresolved harmonics: Evidence for two pitch mechanisms? , 1994 .

[79] DeLiang Wang,et al. Speech segregation based on pitch tracking and amplitude modulation , 2001, Proceedings of the 2001 IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics (Cat. No.01TH8575).

[80] Biing-Hwang Juang,et al. On the application of hidden Markov models for enhancing noisy speech , 1989, IEEE Trans. Acoust. Speech Signal Process..

[81] Richard Lippmann,et al. A comparison of signal processing front ends for automatic word recognition , 1995, IEEE Trans. Speech Audio Process..

[82] Gerhard Schmidt,et al. An Auditory Scene Analysis Approach to Monaural Speech Segregation , 2006 .

[83] Phil D. Green,et al. Robust automatic speech recognition with missing and unreliable acoustic data , 2001, Speech Commun..

[84] Richard F. Lyon,et al. Computational models of neural auditory processing , 1984, ICASSP.

[85] Jean Rouat,et al. A pitch determination and voiced/unvoiced decision algorithm for noisy speech , 1995, Speech Commun..

[86] Martin Cooke,et al. Modelling auditory processing and organisation , 1993, Distinguished dissertations in computer science.

[87] Thorsten Joachims,et al. Making large scale SVM learning practical , 1998 .

[88] Ruth Y Litovsky,et al. The benefit of binaural hearing in a cocktail party: effect of location and type of interferer. , 2004, The Journal of the Acoustical Society of America.

[89] Azar Khurshid. PITCH ESTIMATION FOR NOISY SPEECH , 2002 .

[90] R Meddis,et al. Modeling the identification of concurrent vowels with different fundamental frequencies. , 1992, The Journal of the Acoustical Society of America.

[91] Y. Ephraim,et al. A Brief Survey of Speech Enhancement , 2003 .

[92] Hamid Sheikhzadeh,et al. HMM-based strategies for enhancement of speech signals embedded in nonstationary noise , 1998, IEEE Trans. Speech Audio Process..

[93] Sam T. Roweis,et al. Factorial models and refiltering for speech separation and denoising , 2003, INTERSPEECH.

[94] Y. Ephraim,et al. Speech enhancement using a minimum mean square error short-time spectral amplitude estimator , 1984 .

[95] R Meddis,et al. Simulation of auditory-neural transduction: further studies. , 1988, The Journal of the Acoustical Society of America.

[96] Q. Summerfield,et al. Modeling the perception of concurrent vowels: vowels with different fundamental frequencies. , 1990, The Journal of the Acoustical Society of America.

[97] R. McAulay,et al. Speech enhancement using a soft-decision noise suppression filter , 1980 .

[98] Te-Won Lee,et al. A Maximum Likelihood Approach to Single-channel Source Separation , 2003, J. Mach. Learn. Res..

[99] Daniel Patrick Whittlesey Ellis,et al. Prediction-driven computational auditory scene analysis , 1996 .

[100] John H. L. Hansen,et al. Speech enhancement using a constrained iterative sinusoidal model , 2001, IEEE Trans. Speech Audio Process..

[101] Paul Boersma,et al. Praat: doing phonetics by computer , 2003 .

[102] Roger K. Moore,et al. Hidden Markov model decomposition of speech and noise , 1990, International Conference on Acoustics, Speech, and Signal Processing.

[103] Steven Greenberg,et al. INSIGHTS INTO SPOKEN LANGUAGE GLEANED FROM PHONETIC TRANSCRIPTION OF THE SWITCHBOARD CORPUS , 1996 .

[104] Richard F. Lyon,et al. A perceptual pitch detector , 1990, International Conference on Acoustics, Speech, and Signal Processing.

[105] Tomohiro Nakatani,et al. Harmonic sound stream segregation using localization and its application to speech stream segregation , 1999, Speech Commun..

[106] Guy J. Brown,et al. Modelling the perceptual segregation of double vowels with a network of neural oscillators , 1997, Neural Networks.

[107] A. Cheveigné. Concurrent vowel identification. III. A neural model of harmonic interference cancellation , 1997 .

[108] Boualem Boashash,et al. Estimating and interpreting the instantaneous frequency of a signal. II. Algorithms and applications , 1992, Proc. IEEE.

[109] DeLiang Wang,et al. A pitch-based model for separation of reverberant speech , 2005, INTERSPEECH.

[110] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.