Algorithms for Improving Audible Quality and Recognition Accuracy of Noisy Speech

[1]  Ehud Weinstein,et al.  Signal enhancement using beamforming and nonstationarity with applications to speech , 2001, IEEE Trans. Signal Process..

[2]  G. W. Elko,et al.  An adaptive close-talking microphone array , 2001, Proceedings of the 2001 IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics.

[3]  Alejandro Acero,et al.  Acoustical and environmental robustness in automatic speech recognition , 1991 .

[4]  Guy J. Brown,et al.  Speech and crosstalk detection in multichannel audio , 2005, IEEE Transactions on Speech and Audio Processing.

[5]  Min-Seok Choi,et al.  An improved estimation of a priori speech absence probability for speech enhancement: in perspective of speech perception , 2005, Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05).

[6]  Bhaskar D. Rao,et al.  All-pole modeling of speech based on the minimum variance distortionless response spectrum , 2000, Conference Record of the Thirty-First Asilomar Conference on Signals, Systems and Computers.

[7]  Veronique Stouten,et al.  Robust Automatic Speech Recognition in Time-Varying Environments (Robuuste automatische spraakherkenning in een tijdsvariërende omgeving) , 2006 .

[8]  Chin-Hui Lee,et al.  Automatic recognition of keywords in unconstrained speech using hidden Markov models , 1990, IEEE Trans. Acoust. Speech Signal Process..

[9]  Simon J. Godsill,et al.  Efficient Alternatives to the Ephraim and Malah Suppression Rule for Audio Signal Enhancement , 2003, EURASIP J. Adv. Signal Process..

[10]  L. R. Rabiner,et al.  On the application of energy contours to the recognition of connected word sequences , 1984, AT&T Bell Laboratories Technical Journal.

[11]  S. Furui,et al.  Cepstral analysis technique for automatic speaker verification , 1981 .

[12]  Francesco Piazza,et al.  Keyword spotting based system for conversation fostering in tabletop scenarios: Preliminary evaluation , 2009, 2009 2nd Conference on Human System Interactions.

[13]  A.V. Oppenheim,et al.  Enhancement and bandwidth compression of noisy speech , 1979, Proceedings of the IEEE.

[14]  B. Atal Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification , 1974, The Journal of the Acoustical Society of America.

[15]  S. Gannot,et al.  Speech enhancement based on the general transfer function GSC and postfiltering , 2004, IEEE Trans. Speech Audio Process..

[16]  J. Makhoul Spectral analysis of speech by linear prediction , 1973 .

[17]  José L. Pérez-Córdoba,et al.  Histogram equalization of speech representation for robust speech recognition , 2005, IEEE Transactions on Speech and Audio Processing.

[18]  R. Redner,et al.  Mixture densities, maximum likelihood, and the EM algorithm , 1984 .

[19]  Hermann Ney,et al.  Quantile based histogram equalization for noise robust speech recognition , 2001, INTERSPEECH.

[20]  Stan Davis,et al.  Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences , 1980 .

[21]  Philipos C. Loizou,et al.  Speech Enhancement: Theory and Practice , 2007 .

[22]  Douglas D. O'Shaughnessy Speech Communications: Human and Machine , 2012 .

[23]  B.D. Van Veen,et al.  Beamforming: a versatile approach to spatial filtering , 1988, IEEE ASSP Magazine.

[24]  S. Boll,et al.  Suppression of acoustic noise in speech using spectral subtraction , 1979 .

[25]  I. Cohen,et al.  Noise estimation by minima controlled recursive averaging for robust speech enhancement , 2002, IEEE Signal Processing Letters.

[26]  Bin Chen,et al.  A Laplacian-based MMSE estimator for speech enhancement , 2007, Speech Commun..

[27]  G. Duclos New York 1987 , 2000 .

[28]  Israel Cohen,et al.  Relaxed statistical model for speech enhancement and a priori SNR estimation , 2005, IEEE Transactions on Speech and Audio Processing.

[29]  Rainer Martin,et al.  MMSE estimation of magnitude-squared DFT coefficients with superGaussian priors , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03).

[30]  Li Deng,et al.  HMM adaptation using vector taylor series for noisy speech recognition , 2000, INTERSPEECH.

[31]  Walter Kellermann,et al.  Computationally efficient frequency-domain combination of acoustic echo cancellation and robust adaptive beamforming , 2001, INTERSPEECH.

[32]  Hugo Fastl,et al.  Psychoacoustics: Facts and Models , 1990 .

[33]  Francesco Piazza,et al.  Robust speech recognition using feature-domain multi-channel bayesian estimators , 2010, Proceedings of 2010 IEEE International Symposium on Circuits and Systems.

[34]  Li Deng,et al.  A Bayesian approach to speech feature enhancement using the dynamic cepstral prior , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[35]  Paul R. White,et al.  Speech spectral amplitude estimators using optimally shaped Gamma and Chi priors , 2009, Speech Commun..

[36]  Marc Moonen,et al.  Design of broadband beamformers robust against microphone position errors , 2003 .

[37]  Olivier Cappé,et al.  Elimination of the musical noise phenomenon with the Ephraim and Malah noise suppressor , 1994, IEEE Trans. Speech Audio Process..

[38]  Y. Gong A method of joint compensation of additive and convolutive distortions for speaker-independent speech recognition , 2005, IEEE Transactions on Speech and Audio Processing.

[39]  Andrew J. Viterbi,et al.  Error bounds for convolutional codes and an asymptotically optimum decoding algorithm , 1967, IEEE Trans. Inf. Theory.

[40]  L. J. Griffiths,et al.  An alternative approach to linearly constrained adaptive beamforming , 1982 .

[41]  Jacob Benesty,et al.  Speech Enhancement , 2010 .

[42]  John McDonough,et al.  Distant Speech Recognition , 2009 .

[43]  Norbert Wiener,et al.  Extrapolation, Interpolation, and Smoothing of Stationary Time Series, with Engineering Applications , 1949 .

[44]  Jont B. Allen,et al.  Image method for efficiently simulating small‐room acoustics , 1976 .

[45]  Joseph Lipka,et al.  A Table of Integrals , 2010 .

[46]  Kuldip K. Paliwal,et al.  A Comparative Study of Filter Bank Spacing for Speech Recognition , 2003 .

[47]  David Malah,et al.  Speech enhancement using a minimum mean-square error log-spectral amplitude estimator , 1984, IEEE Trans. Acoust. Speech Signal Process..

[48]  Tariq S. Durrani,et al.  A Novel Psychoacoustically Motivated Multichannel Speech Enhancement System , 2007, COST 2102 Workshop.

[49]  Simon J. Godsill,et al.  Towards a perceptually optimal spectral amplitude estimator for audio signal enhancement , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[50]  F. Piazza,et al.  A Multichannel Noise Reduction Front-End Based on Psychoacoustics for Robust Speech Recognition in Highly Noisy Environments , 2008, 2008 Hands-Free Speech Communication and Microphone Arrays.

[51]  Janienke Sturm,et al.  Influencing social dynamics in meetings through a peripheral display , 2007, ICMI '07.

[52]  Maurizio Omologo,et al.  Microphone array based speech recognition with different talker-array positions , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[53]  Rainer Martin,et al.  Noise power spectral density estimation based on optimal smoothing and minimum statistics , 2001, IEEE Trans. Speech Audio Process..

[54]  Akihiko Sugiyama,et al.  A robust adaptive beamformer for microphone arrays with a blocking matrix using constrained adaptive filters , 1999, IEEE Trans. Signal Process..

[55]  Michael S. Brandstein,et al.  Microphone Arrays - Signal Processing Techniques and Applications , 2001, Microphone Arrays.

[56]  Rainer Martin,et al.  Speech enhancement based on minimum mean-square error estimation and supergaussian priors , 2005, IEEE Transactions on Speech and Audio Processing.

[57]  Mark J. F. Gales,et al.  An improved approach to the hidden Markov model decomposition of speech and noise , 1992, ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[58]  H Hermansky,et al.  Perceptual linear predictive (PLP) analysis of speech , 1990, The Journal of the Acoustical Society of America.

[59]  Yifan Gong,et al.  A unified framework of HMM adaptation with joint compensation of additive and convolutive distortions , 2009, Computer Speech and Language.

[60]  Francesco Piazza,et al.  Comparative Evaluation of Single-Channel MMSE-Based Noise Reduction Schemes for Speech Recognition , 2010, J. Electr. Comput. Eng..

[61]  Brian C J Moore,et al.  Asymmetry of masking between complex tones and noise: the role of temporal structure and peripheral compression , 2002, The Journal of the Acoustical Society of America.

[62]  Richard Heusdens,et al.  A STUDY OF THE DISTRIBUTION OF TIME-DOMAIN SPEECH SAMPLES AND DISCRETE FOURIER COEFFICIENTS , 2005 .

[63]  Mei-Yuh Hwang,et al.  Shared-distribution hidden Markov models for speech recognition , 1993, IEEE Trans. Speech Audio Process..

[64]  S. Gazor,et al.  Speech probability distribution , 2003, IEEE Signal Processing Letters.

[65]  Mark J. F. Gales,et al.  Model-based techniques for noise robust speech recognition , 1995 .

[66]  Lin-Shan Lee,et al.  Higher Order Cepstral Moment Normalization for Improved Robust Speech Recognition , 2009, IEEE Transactions on Audio, Speech, and Language Processing.

[67]  Chong Kwan Un,et al.  Speech recognition in noisy environments using first-order vector Taylor series , 1998, Speech Commun..

[68]  S. Molau,et al.  Feature space normalization in adverse acoustic conditions , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03).

[69]  Ehud Weinstein,et al.  System identification using nonstationary signals , 1996, IEEE Trans. Signal Process..

[70]  Satoshi Nakamura,et al.  Multichannel Bin-Wise Robust Frequency-Domain Adaptive Filtering and Its Application to Adaptive Beamforming , 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[71]  R. Gray,et al.  Distortion measures for speech processing , 1980 .

[72]  Li Deng,et al.  Enhancement of log Mel power spectra of speech using a phase-sensitive model of the acoustic environment and sequential estimation of the corrupting noise , 2004, IEEE Transactions on Speech and Audio Processing.

[73]  Tran Huy Dat,et al.  Generalized gamma modeling of speech and its online estimation for speech enhancement , 2005, Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05).

[74]  Yifan Gong,et al.  Robust Speech Recognition Using a Cepstral Minimum-Mean-Square-Error-Motivated Noise Suppressor , 2008, IEEE Transactions on Audio, Speech, and Language Processing.

[75]  Yi Hu,et al.  Subjective comparison and evaluation of speech enhancement algorithms , 2007, Speech Commun..

[76]  Rainer Martin,et al.  MAP Estimators for Speech Enhancement Under Normal and Rayleigh Inverse Gaussian Distributions , 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[77]  Biing-Hwang Juang,et al.  Fundamentals of speech recognition , 1993, Prentice Hall signal processing series.

[78]  Paul Lamere,et al.  Sphinx-4: a flexible open source framework for speech recognition , 2004 .

[79]  Francesco Piazza,et al.  NU-Tech: Implementing DSP Algorithms in a Plug-in Based Software Platform for Real time Audio Applications , 2005 .

[80]  Walter Bender,et al.  The Impact of Increased Awareness While Face-to-Face , 2007, Hum. Comput. Interact..

[81]  A. Oppenheim,et al.  Unequal bandwidth spectral analysis using digital frequency warping , 1974 .

[82]  Richard P. Lippmann,et al.  Two-stage discriminant analysis for improved isolated-word recognition , 1987, ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[83]  Alan V. Oppenheim,et al.  Discrete-Time Signal Processing , 1989 .

[84]  Israel Cohen,et al.  Noise spectrum estimation in adverse environments: improved minima controlled recursive averaging , 2003, IEEE Trans. Speech Audio Process..

[85]  Richard M. Stern,et al.  Microphone array processing for robust speech recognition , 2003 .

[86]  D. Brillinger Time series - data analysis and theory , 1981, Classics in applied mathematics.

[87]  Edward Courtney,et al.  2 = 4 M , 1993 .

[88]  Antonio Rubio,et al.  Histogram Equalization for Robust Speech Recognition , 2008 .

[89]  Massimo Zancanaro,et al.  Fostering conversation after the museum visit: a WOZ study for a shared interface , 2008, AVI '08.

[90]  Victor Zue,et al.  A segment-based wordspotter using phonetic filler models , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[91]  R. McAulay,et al.  Speech enhancement using a soft-decision noise suppression filter , 1980 .

[92]  J. Capon High-resolution frequency-wavenumber spectrum analysis , 1969 .

[93]  Biing-Hwang Juang,et al.  Signal bias removal by maximum likelihood estimation for robust telephone speech recognition , 1996, IEEE Trans. Speech Audio Process..

[94]  Renato De Mori,et al.  Automatic speech recognition with a modified Ephraim-Malah rule , 2006, IEEE Signal Processing Letters.

[95]  Henry Cox,et al.  Robust adaptive beamforming , 1987, IEEE Trans. Acoust. Speech Signal Process..

[96]  I. Cohen,et al.  Multichannel signal detection based on the transient beam-to-reference ratio , 2003, IEEE Signal Processing Letters.

[97]  Pedro J. Moreno,et al.  Speech recognition in noisy environments , 1996 .

[98]  Peter Vary,et al.  Speech Enhancement by MAP Spectral Amplitude Estimation Using a Super-Gaussian Speech Model , 2005, EURASIP J. Adv. Signal Process..

[99]  Jesper Jensen,et al.  Minimum Mean-Square Error Estimation of Discrete Fourier Coefficients With Generalized Gamma Priors , 2007, IEEE Transactions on Audio, Speech, and Language Processing.