Paper information - Algorithms for Improving Audible Quality and Recognition Accuracy of Noisy Speech - 字舞流文

Algorithms for Improving Audible Quality and Recognition Accuracy of Noisy Speech

Simone Cifani | S. Cifani

[1] Ehud Weinstein,et al. Signal enhancement using beamforming and nonstationarity with applications to speech , 2001, IEEE Trans. Signal Process..

[2] G. W. Elko,et al. An adaptive close-talking microphone array , 2001, Proceedings of the 2001 IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics (Cat. No.01TH8575).

[3] Alejandro Acero,et al. Acoustical and environmental robustness in automatic speech recognition , 1991 .

[4] Guy J. Brown,et al. Speech and crosstalk detection in multichannel audio , 2005, IEEE Transactions on Speech and Audio Processing.

[5] Min-Seok Choi,et al. An improved estimation of a priori speech absence probability for speech enhancement: in perspective of speech perception , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..

[6] Bhaskar D. Rao,et al. All-pole modeling of speech based on the minimum variance distortionless response spectrum , 2000, Conference Record of the Thirty-First Asilomar Conference on Signals, Systems and Computers (Cat. No.97CB36136).

[7] Veronique Stouten,et al. Robust Automatic Speech Recognition in Time-Varying Environments (Robuuste automatische spraakherkenning in een tijdsvariërende omgeving) , 2006 .

[8] Chin-Hui Lee,et al. Automatic recognition of keywords in unconstrained speech using hidden Markov models , 1990, IEEE Trans. Acoust. Speech Signal Process..

[9] Simon J. Godsill,et al. Efficient Alternatives to the Ephraim and Malah Suppression Rule for Audio Signal Enhancement , 2003, EURASIP J. Adv. Signal Process..

[10] L. R. Rabiner,et al. On the application of energy contours to the recognition of connected word sequences , 1984, AT&T Bell Laboratories Technical Journal.

[11] S. Furui,et al. Cepstral analysis technique for automatic speaker verification , 1981 .

[12] Francesco Piazza,et al. Keyword spotting based system for conversation fostering in tabletop scenarios: Preliminary evaluation , 2009, 2009 2nd Conference on Human System Interactions.

[13] A.V. Oppenheim,et al. Enhancement and bandwidth compression of noisy speech , 1979, Proceedings of the IEEE.

[14] B. Atal. Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification. , 1974, The Journal of the Acoustical Society of America.

[15] S. Gannot,et al. Speech enhancement based on the general transfer function GSC and postfiltering , 2004, IEEE Trans. Speech Audio Process..

[16] J. Makhoul. Spectral analysis of speech by linear prediction , 1973 .

[17] José L. Pérez-Córdoba,et al. Histogram equalization of speech representation for robust speech recognition , 2005, IEEE Transactions on Speech and Audio Processing.

[18] R. Redner,et al. Mixture densities, maximum likelihood, and the EM algorithm , 1984 .

[19] Hermann Ney,et al. Quantile based histogram equalization for noise robust speech recognition , 2001, INTERSPEECH.

[20] Stan Davis,et al. Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences , 1980 .

[21] Philipos C. Loizou,et al. Speech Enhancement: Theory and Practice , 2007 .

[22] Douglas D. O'Shaughnessy. Speech Communications: Human and Machine , 2012 .

[23] B.D. Van Veen,et al. Beamforming: a versatile approach to spatial filtering , 1988, IEEE ASSP Magazine.

[24] S. Boll,et al. Suppression of acoustic noise in speech using spectral subtraction , 1979 .

[25] I. Cohen,et al. Noise estimation by minima controlled recursive averaging for robust speech enhancement , 2002, IEEE Signal Processing Letters.

[26] Bin Chen,et al. A Laplacian-based MMSE estimator for speech enhancement , 2007, Speech Commun..

[27] G. Duclos. New York 1987 , 2000 .

[28] Israel Cohen,et al. Relaxed statistical model for speech enhancement and a priori SNR estimation , 2005, IEEE Transactions on Speech and Audio Processing.

[29] Rainer Martin,et al. MMSE estimation of magnitude-squared DFT coefficients with superGaussian priors , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03)..

[30] Li Deng,et al. HMM adaptation using vector taylor series for noisy speech recognition , 2000, INTERSPEECH.

[31] Walter Kellermann,et al. Computationally efficient frequency-domain combination of acoustic echo cancellation and robust adaptive beamforming , 2001, INTERSPEECH.

[32] Hugo Fastl,et al. Psychoacoustics: Facts and Models , 1990 .

[33] Francesco Piazza,et al. Robust speech recognition using feature-domain multi-channel bayesian estimators , 2010, Proceedings of 2010 IEEE International Symposium on Circuits and Systems.

[34] Li Deng,et al. A Bayesian approach to speech feature enhancement using the dynamic cepstral prior , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[35] Paul R. White,et al. Speech spectral amplitude estimators using optimally shaped Gamma and Chi priors , 2009, Speech Commun..

[36] Marc Moonen,et al. Design of broadband beamformers robust against microphone position errors , 2003 .

[37] Olivier Cappé,et al. Elimination of the musical noise phenomenon with the Ephraim and Malah noise suppressor , 1994, IEEE Trans. Speech Audio Process..

[38] Y. Gong. A method of joint compensation of additive and convolutive distortions for speaker-independent speech recognition , 2005, IEEE Transactions on Speech and Audio Processing.

[39] Andrew J. Viterbi,et al. Error bounds for convolutional codes and an asymptotically optimum decoding algorithm , 1967, IEEE Trans. Inf. Theory.

[40] L. J. Griffiths,et al. An alternative approach to linearly constrained adaptive beamforming , 1982 .

[41] Jacob Benesty,et al. Speech Enhancement , 2010 .

[42] John McDonough,et al. Distant Speech Recognition , 2009 .

[43] Norbert Wiener,et al. Extrapolation, Interpolation, and Smoothing of Stationary Time Series, with Engineering Applications , 1949 .

[44] Jont B. Allen,et al. Image method for efficiently simulating small‐room acoustics , 1976 .

[45] Joseph Lipka,et al. A Table of Integrals , 2010 .

[46] Kuldip K. Paliwal,et al. A Comparative Study of Filter Bank Spacing for Speech Recognition , 2003 .

[47] David Malah,et al. Speech enhancement using a minimum mean-square error log-spectral amplitude estimator , 1984, IEEE Trans. Acoust. Speech Signal Process..

[48] Tariq S. Durrani,et al. A Novel Psychoacoustically Motivated Multichannel Speech Enhancement System , 2007, COST 2102 Workshop.

[49] Simon J. Godsill,et al. Towards a perceptually optimal spectral amplitude estimator for audio signal enhancement , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).

[50] F. Piazza,et al. A Multichannel Noise Reduction Front-End Based on Psychoacoustics for Robust Speech Recognition in Highly Noisy Environments , 2008, 2008 Hands-Free Speech Communication and Microphone Arrays.

[51] Janienke Sturm,et al. Influencing social dynamics in meetings through a peripheral display , 2007, ICMI '07.

[52] Maurizio Omologo,et al. Microphone array based speech recognition with different talker-array positions , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[53] Rainer Martin,et al. Noise power spectral density estimation based on optimal smoothing and minimum statistics , 2001, IEEE Trans. Speech Audio Process..

[54] Akihiko Sugiyama,et al. A robust adaptive beamformer for microphone arrays with a blocking matrix using constrained adaptive filters , 1999, IEEE Trans. Signal Process..

[55] Michael S. Brandstein,et al. Microphone Arrays - Signal Processing Techniques and Applications , 2001, Microphone Arrays.

[56] Rainer Martin,et al. Speech enhancement based on minimum mean-square error estimation and supergaussian priors , 2005, IEEE Transactions on Speech and Audio Processing.

[57] Mark J. F. Gales,et al. An improved approach to the hidden Markov model decomposition of speech and noise , 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[58] H Hermansky,et al. Perceptual linear predictive (PLP) analysis of speech. , 1990, The Journal of the Acoustical Society of America.

[59] Yifan Gong,et al. A unified framework of HMM adaptation with joint compensation of additive and convolutive distortions , 2009, Computer Speech and Language.

[60] Francesco Piazza,et al. Comparative Evaluation of Single-Channel MMSE-Based Noise Reduction Schemes for Speech Recognition , 2010, J. Electr. Comput. Eng..

[61] Brian C J Moore,et al. Asymmetry of masking between complex tones and noise: the role of temporal structure and peripheral compression. , 2002, The Journal of the Acoustical Society of America.

[62] Richard Heusdens,et al. A STUDY OF THE DISTRIBUTION OF TIME-DOMAIN SPEECH SAMPLES AND DISCRETE FOURIER COEFFICIENTS , 2005 .

[63] Mei-Yuh Hwang,et al. Shared-distribution hidden Markov models for speech recognition , 1993, IEEE Trans. Speech Audio Process..

[64] S. Gazor,et al. Speech probability distribution , 2003, IEEE Signal Processing Letters.

[65] Mark J. F. Gales,et al. Model-based techniques for noise robust speech recognition , 1995 .

[66] Lin-Shan Lee,et al. Higher Order Cepstral Moment Normalization for Improved Robust Speech Recognition , 2009, IEEE Trans. Speech Audio Process..

[67] Chong Kwan Un,et al. Speech recognition in noisy environments using first-order vector Taylor series , 1998, Speech Commun..

[68] S. Molau,et al. Feature space normalization in adverse acoustic conditions , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03)..

[69] Ehud Weinstein,et al. System identification using nonstationary signals , 1996, IEEE Trans. Signal Process..

[70] Satoshi Nakamura,et al. Multichannel Bin-Wise Robust Frequency-Domain Adaptive Filtering and Its Application to Adaptive Beamforming , 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[71] R. Gray,et al. Distortion measures for speech processing , 1980 .

[72] Li Deng,et al. Enhancement of log Mel power spectra of speech using a phase-sensitive model of the acoustic environment and sequential estimation of the corrupting noise , 2004, IEEE Transactions on Speech and Audio Processing.

[73] Tran Huy Dat,et al. Generalized gamma modeling of speech and its online estimation for speech enhancement , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..

[74] Yifan Gong,et al. Robust Speech Recognition Using a Cepstral Minimum-Mean-Square-Error-Motivated Noise Suppressor , 2008, IEEE Transactions on Audio, Speech, and Language Processing.

[75] Yi Hu,et al. Subjective comparison and evaluation of speech enhancement algorithms , 2007, Speech Commun..

[76] Rainer Martin,et al. MAP Estimators for Speech Enhancement Under Normal and Rayleigh Inverse Gaussian Distributions , 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[77] Biing-Hwang Juang,et al. Fundamentals of speech recognition , 1993, Prentice Hall signal processing series.

[78] Paul Lamere,et al. Sphinx-4: a flexible open source framework for speech recognition , 2004 .

[79] Francesco Piazza,et al. NU-Tech: Implementing DSP Algorithms in a Plug-in Based Software Platform for Real time Audio Applications , 2005 .

[80] Walter Bender,et al. The Impact of Increased Awareness While Face-to-Face , 2007, Hum. Comput. Interact..

[81] A. Oppenheim,et al. Unequal bandwidth spectral analysis using digital frequency warping , 1974 .

[82] Richard P. Lippmann,et al. Two-stage discriminant analysis for improved isolated-word recognition , 1987, ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[83] Alan V. Oppenheim,et al. Discrete-Time Signal Processing , 1989 .

[84] Israel Cohen,et al. Noise spectrum estimation in adverse environments: improved minima controlled recursive averaging , 2003, IEEE Trans. Speech Audio Process..

[85] Richard M. Stern,et al. Microphone array processing for robust speech recognition , 2003 .

[86] D. Brillinger. Time series - data analysis and theory , 1981, Classics in applied mathematics.

[87] Edward Courtney,et al. 2 = 4 M , 1993 .

[88] Antonio Rubio,et al. Histogram Equalization for Robust Speech Recognition , 2008 .

[89] Massimo Zancanaro,et al. Fostering conversation after the museum visit: a WOZ study for a shared interface , 2008, AVI '08.

[90] Victor Zue,et al. A segment-based wordspotter using phonetic filler models , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[91] R. McAulay,et al. Speech enhancement using a soft-decision noise suppression filter , 1980 .

[92] J. Capon. High-resolution frequency-wavenumber spectrum analysis , 1969 .

[93] Biing-Hwang Juang,et al. Signal bias removal by maximum likelihood estimation for robust telephone speech recognition , 1996, IEEE Trans. Speech Audio Process..

[94] Renato De Mori,et al. Automatic speech recognition with a modified Ephraim-Malah rule , 2006, IEEE Signal Processing Letters.

[95] Henry Cox,et al. Robust adaptive beamforming , 2005, IEEE Trans. Acoust. Speech Signal Process..

[96] I. Cohen,et al. Multichannel signal detection based on the transient beam-to-reference ratio , 2003, IEEE Signal Processing Letters.

[97] Pedro J. Moreno,et al. Speech recognition in noisy environments , 1996 .

[98] Peter Vary,et al. Speech Enhancement by MAP Spectral Amplitude Estimation Using a Super-Gaussian Speech Model , 2005, EURASIP J. Adv. Signal Process..

[99] Jesper Jensen,et al. Minimum Mean-Square Error Estimation of Discrete Fourier Coefficients With Generalized Gamma Priors , 2007, IEEE Transactions on Audio, Speech, and Language Processing.