Monaural speech organization and segregation

[1]  Ehud Weinstein,et al.  Signal enhancement using beamforming and nonstationarity with applications to speech , 2001, IEEE Trans. Signal Process..

[2]  Jose C. Principe,et al.  Neural and adaptive systems , 2000 .

[3]  F. Zeng,et al.  Speech recognition with altered spectral distribution of envelope cues. , 1996, The Journal of the Acoustical Society of America.

[4]  Jan Van der Spiegel,et al.  Acoustic-phonetic features for the automatic classification of stop consonants , 2001, IEEE Trans. Speech Audio Process..

[5]  Mitchel Weintraub,et al.  A theory and computational model of auditory monaural sound separation , 1985 .

[6]  DeLiang Wang,et al.  Separation of stop consonants , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03)..

[7]  Guy J. Brown,et al.  Computational auditory scene analysis , 1994, Comput. Speech Lang..

[8]  John H. L. Hansen,et al.  Constrained iterative speech enhancement with application to speech recognition , 1991, IEEE Trans. Signal Process..

[9]  Harvey Fletcher,et al.  Speech and hearing. , 1930, Health services manager.

[10]  Terrence J. Sejnowski,et al.  Blind source separation of more sources than mixtures using overcomplete representations , 1999, IEEE Signal Processing Letters.

[11]  A. M. Mimpen,et al.  The ear as a frequency analyzer. II. , 1964, The Journal of the Acoustical Society of America.

[12]  Jonathan G. Fiscus,et al.  DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CD-ROM , 1993 .

[13]  Daniel P. W. Ellis,et al.  Decoding speech in the presence of other sources , 2005, Speech Commun..

[14]  DeLiang Wang,et al.  Auditory segmentation based on event detection , 2004, SAPA@INTERSPEECH.

[15]  R. Plomp,et al.  Effect of reducing slow temporal modulations on speech reception. , 1994, The Journal of the Acoustical Society of America.

[16]  Andrew W. Fitzgibbon,et al.  An Experimental Comparison of Range Image Segmentation Algorithms , 1996, IEEE Trans. Pattern Anal. Mach. Intell..

[17]  A. Viterbi  Error Bounds for Convolutional Codes and an Asymptotically Optimum Decoding Algorithm , 1967 .

[18]  Hideki Kawahara,et al.  Dynamic sound stream formation based on continuity of spectral change , 1999, Speech Commun..

[19]  Godfrey Dewey,et al.  Relativ frequency of English speech sounds , 1923 .

[20]  A M Ali,et al.  Acoustic-phonetic features for the automatic classification of fricatives. , 2001, The Journal of the Acoustical Society of America.

[21]  Martine Turgeon,et al.  Rhythmic masking release: contribution of cues for perceptual organization to the cross-spectral fusion of concurrent narrow-band noises. , 2002, The Journal of the Acoustical Society of America.

[22]  Vladimir Vapnik,et al.  The Nature of Statistical Learning , 1995 .

[23]  Guy J. Brown,et al.  Computational Auditory Scene Analysis: Principles, Algorithms, and Applications , 2006 .

[24]  Mototsugu Abe,et al.  Auditory scene analysis based on time-frequency integration of shared FM and AM , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).

[25]  Guy J. Brown,et al.  A multi-pitch tracking algorithm for noisy speech , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[26]  C. Darwin  Perceiving vowels in the presence of another sound: constraints on formant perception. , 1984, The Journal of the Acoustical Society of America.

[27]  DeLiang Wang,et al.  Monaural speech segregation based on pitch tracking and amplitude modulation , 2002, IEEE Transactions on Neural Networks.

[28]  John F. Canny,et al.  A Computational Approach to Edge Detection , 1986, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[29]  Nathalie Virag,et al.  Single channel speech enhancement based on masking properties of the human auditory system , 1999, IEEE Trans. Speech Audio Process..

[30]  M. Viberg,et al.  Two decades of array signal processing research: the parametric approach , 1996, IEEE Signal Process. Mag..

[31]  DeLiang Wang,et al.  Model-based sequential organization in cochannel speech , 2006, IEEE Transactions on Audio, Speech, and Language Processing.

[32]  J. Pickles  An Introduction to the Physiology of Hearing , 1982 .

[33]  DeLiang Wang,et al.  Separation of fricatives and affricates , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..

[34]  R. Kumaresan,et al.  Model-based approach to envelope and positive instantaneous frequency estimation of signals with speech applications , 1999 .

[35]  S. Nooteboom,et al.  The prosody of speech: melody and rhythm , 2001 .

[36]  T. W. Parsons  Separation of speech from interfering speech by means of harmonic selection , 1976 .

[37]  Yonghong Yan,et al.  The contribution of consonants versus vowels to word recognition in fluent speech , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.

[38]  R. Plomp,et al.  Effect of temporal envelope smearing on speech reception. , 1994, The Journal of the Acoustical Society of America.

[39]  David K. Mellinger,et al.  Event formation and separation in musical sound , 1992 .

[40]  Anssi Klapuri,et al.  Sound onset detection by applying psychoacoustic knowledge , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).

[41]  Tony Lindeberg,et al.  Scale-Space Theory in Computer Vision , 1993, Lecture Notes in Computer Science.

[42]  Masashi Unoki,et al.  A method of signal extraction from noisy signal based on auditory scene analysis , 1997, Speech Commun..

[43]  Biing-Hwang Juang,et al.  Fundamentals of speech recognition , 1993, Prentice Hall signal processing series.

[44]  Guy J. Brown,et al.  Separation of speech from interfering sounds based on oscillatory correlation , 1999, IEEE Trans. Neural Networks.

[45]  John H. L. Hansen,et al.  Discrete-Time Processing of Speech Signals , 1993 .

[46]  DeLiang Wang,et al.  On Ideal Binary Mask As the Computational Goal of Auditory Scene Analysis , 2005, Speech Separation by Humans and Machines.

[47]  Aggelos K. Katsaggelos,et al.  Sound source separation via computational auditory scene analysis-enhanced beamforming , 2002, Sensor Array and Multichannel Signal Processing Workshop Proceedings, 2002.

[48]  Brian R Glasberg,et al.  Derivation of auditory filter shapes from notched-noise data , 1990, Hearing Research.

[49]  Richard Lippmann,et al.  Speech recognition by machines and humans , 1997, Speech Commun..

[50]  R. Patterson,et al.  Complex Sounds and Auditory Images , 1992 .

[51]  Sam T. Roweis,et al.  One Microphone Source Separation , 2000, NIPS.

[52]  Kuldip K. Paliwal,et al.  A speech enhancement method based on Kalman filtering , 1987, ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[53]  M C Killion,et al.  Revised estimate of minimum audible pressure: where is the "missing 6 dB"? , 1978, The Journal of the Acoustical Society of America.

[54]  E. Owens,et al.  An Introduction to the Psychology of Hearing , 1997 .

[55]  DeLiang Wang,et al.  Speech segregation based on sound localization , 2001, IJCNN'01. International Joint Conference on Neural Networks. Proceedings (Cat. No.01CH37222).

[56]  Guy J. Brown,et al.  Techniques for handling convolutional distortion with 'missing data' automatic speech recognition , 2004, Speech Commun..

[57]  Hynek Hermansky,et al.  Temporal patterns (TRAPs) in ASR of noisy speech , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).

[58]  Peter S Chang,et al.  Exploration of Behavioral, Physiological, and Computational Approaches to Auditory Scene Analysis , 2004 .

[59]  David F. Rosenthal,et al.  Computational auditory scene analysis , 1998 .

[60]  DeLiang Wang,et al.  Auditory Segmentation Based on Onset and Offset Analysis , 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[61]  Paul C. Bagshaw,et al.  Enhanced pitch tracking and the processing of F0 contours for computer aided intonation teaching , 1993, EUROSPEECH.

[62]  C. Darwin  Auditory grouping , 1997, Trends in Cognitive Sciences.

[63]  P. N. Denbigh,et al.  A sound segregation algorithm for reverberant conditions , 2001, Speech Commun..

[64]  DeLiang Wang,et al.  Separation of singing voice from music accompaniment for monaural recordings , 2007 .

[65]  John Scott Bridle,et al.  Probabilistic Interpretation of Feedforward Classification Network Outputs, with Relationships to Statistical Pattern Recognition , 1989, NATO Neurocomputing.

[66]  J. Bird  Effects of a difference in fundamental frequency in separating two sentences. , 1997 .

[67]  Guy J. Brown,et al.  Separation of Speech by Computational Auditory Scene Analysis , 2005 .

[68]  Jean Ponce,et al.  Computer Vision: A Modern Approach , 2002 .

[69]  Rainer Martin,et al.  Noise power spectral density estimation based on optimal smoothing and minimum statistics , 2001, IEEE Trans. Speech Audio Process..

[70]  Saeed Gazor,et al.  An adaptive KLT approach for speech enhancement , 2001, IEEE Trans. Speech Audio Process..

[71]  DeLiang Wang,et al.  Unvoiced Speech Segregation , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.

[72]  Oded Ghitza  Auditory models and human performance in tasks related to speech coding and speech recognition , 1994 .

[73]  DeLiang Wang,et al.  A schema-based model for phonemic restoration , 2005, Speech Commun..

[74]  B C Wheeler,et al.  A two-microphone dual delay-line approach for extraction of a speech sound in the presence of multiple interferers. , 2001, The Journal of the Acoustical Society of America.

[75]  Steven Greenberg,et al.  Robust speech recognition using the modulation spectrogram , 1998, Speech Commun..

[76]  Terrence J. Sejnowski,et al.  An Information-Maximization Approach to Blind Separation and Blind Deconvolution , 1995, Neural Computation.

[77]  Wolfgang Hess,et al.  Pitch Determination of Speech Signals , 1983 .

[78]  R. Carlyon,et al.  Comparing the fundamental frequencies of resolved and unresolved harmonics: Evidence for two pitch mechanisms? , 1994 .

[79]  DeLiang Wang,et al.  Speech segregation based on pitch tracking and amplitude modulation , 2001, Proceedings of the 2001 IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics (Cat. No.01TH8575).

[80]  Biing-Hwang Juang,et al.  On the application of hidden Markov models for enhancing noisy speech , 1989, IEEE Trans. Acoust. Speech Signal Process..

[81]  Richard Lippmann,et al.  A comparison of signal processing front ends for automatic word recognition , 1995, IEEE Trans. Speech Audio Process..

[82]  Gerhard Schmidt,et al.  An Auditory Scene Analysis Approach to Monaural Speech Segregation , 2006 .

[83]  Phil D. Green,et al.  Robust automatic speech recognition with missing and unreliable acoustic data , 2001, Speech Commun..

[84]  Richard F. Lyon,et al.  Computational models of neural auditory processing , 1984, ICASSP.

[85]  Jean Rouat,et al.  A pitch determination and voiced/unvoiced decision algorithm for noisy speech , 1995, Speech Commun..

[86]  Martin Cooke,et al.  Modelling auditory processing and organisation , 1993, Distinguished dissertations in computer science.

[87]  Thorsten Joachims,et al.  Making large scale SVM learning practical , 1998 .

[88]  Ruth Y Litovsky,et al.  The benefit of binaural hearing in a cocktail party: effect of location and type of interferer. , 2004, The Journal of the Acoustical Society of America.

[89]  Azar Khurshid  Pitch estimation for noisy speech , 2002 .

[90]  R Meddis,et al.  Modeling the identification of concurrent vowels with different fundamental frequencies. , 1992, The Journal of the Acoustical Society of America.

[91]  Y. Ephraim,et al.  A Brief Survey of Speech Enhancement , 2003 .

[92]  Hamid Sheikhzadeh,et al.  HMM-based strategies for enhancement of speech signals embedded in nonstationary noise , 1998, IEEE Trans. Speech Audio Process..

[93]  Sam T. Roweis,et al.  Factorial models and refiltering for speech separation and denoising , 2003, INTERSPEECH.

[94]  Y. Ephraim,et al.  Speech enhancement using a minimum mean square error short-time spectral amplitude estimator , 1984 .

[95]  R Meddis,et al.  Simulation of auditory-neural transduction: further studies. , 1988, The Journal of the Acoustical Society of America.

[96]  Q. Summerfield,et al.  Modeling the perception of concurrent vowels: vowels with different fundamental frequencies. , 1990, The Journal of the Acoustical Society of America.

[97]  R. McAulay,et al.  Speech enhancement using a soft-decision noise suppression filter , 1980 .

[98]  Te-Won Lee,et al.  A Maximum Likelihood Approach to Single-channel Source Separation , 2003, J. Mach. Learn. Res..

[99]  Daniel Patrick Whittlesey Ellis,et al.  Prediction-driven computational auditory scene analysis , 1996 .

[100]  John H. L. Hansen,et al.  Speech enhancement using a constrained iterative sinusoidal model , 2001, IEEE Trans. Speech Audio Process..

[101]  Paul Boersma,et al.  Praat: doing phonetics by computer , 2003 .

[102]  Roger K. Moore,et al.  Hidden Markov model decomposition of speech and noise , 1990, International Conference on Acoustics, Speech, and Signal Processing.

[103]  Steven Greenberg,et al.  Insights into spoken language gleaned from phonetic transcription of the Switchboard corpus , 1996 .

[104]  Richard F. Lyon,et al.  A perceptual pitch detector , 1990, International Conference on Acoustics, Speech, and Signal Processing.

[105]  Tomohiro Nakatani,et al.  Harmonic sound stream segregation using localization and its application to speech stream segregation , 1999, Speech Commun..

[106]  Guy J. Brown,et al.  Modelling the perceptual segregation of double vowels with a network of neural oscillators , 1997, Neural Networks.

[107]  A. Cheveigné  Concurrent vowel identification. III. A neural model of harmonic interference cancellation , 1997 .

[108]  Boualem Boashash,et al.  Estimating and interpreting the instantaneous frequency of a signal. II. Algorithms and applications , 1992, Proc. IEEE.

[109]  DeLiang Wang,et al.  A pitch-based model for separation of reverberant speech , 2005, INTERSPEECH.

[110]  Lawrence R. Rabiner,et al.  A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.