Towards An Intelligent Fuzzy Based Multimodal Two Stage Speech Enhancement System
[1] Ben P. Milner,et al. Enhancing audio speech using visual speech features , 2009, INTERSPEECH.
[2] Brian C J Moore,et al. Evaluation of the noise reduction system in a commercial digital hearing aid: Evaluación del sistema de reducción de ruido en un auxiliar auditivo digital comercial , 2003, International journal of audiology.
[3] Xinge You,et al. A local region based approach to lip tracking , 2012, Pattern Recognit..
[4] Aapo Hyvärinen,et al. A Fast Fixed-Point Algorithm for Independent Component Analysis of Complex Valued Signals , 2000, Int. J. Neural Syst..
[5] Christian Jutten,et al. Separation of Audio-Visual Speech Sources: A New Approach Exploiting the Audio-Visual Coherence of Speech Stimuli , 2002, EURASIP J. Adv. Signal Process..
[6] Chalapathy Neti,et al. Noisy audio feature enhancement using audio-visual speech data , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[7] Demetri Terzopoulos,et al. Snakes: Active contour models , 1988, International Journal of Computer Vision.
[8] Adel El-Hennawy,et al. Speech recognition using a wavelet transform to establish fuzzy inference system through subtractive clustering and neural network (ANFIS) , 2008, ICONS 2008.
[9] Marios M. Polycarpou,et al. Fuzzy Logic based Switching and Tuning Supervisor for a Multi-variable Multiple Controller , 2007, 2007 IEEE International Fuzzy Systems Conference.
[10] Todd A. Ricketts,et al. Making Sense of Directional Microphone Hearing Aids , 1999 .
[11] Simon Haykin,et al. The Cocktail Party Problem , 2005, Neural Computation.
[12] Maurice Milgram,et al. Multi features models for robust lip tracking , 2008, 2008 10th International Conference on Control, Automation, Robotics and Vision.
[13] Tariq S. Durrani,et al. A Novel Psychoacoustically Motivated Multichannel Speech Enhancement System , 2007, COST 2102 Workshop.
[14] Francis K. Kuk,et al. Improving hearing aid performance in noise: Challenges and strategies , 2002 .
[15] A. Murat Tekalp,et al. Audiovisual Synchronization and Fusion Using Canonical Correlation Analysis , 2007, IEEE Transactions on Multimedia.
[16] Paul A. Lynn,et al. Signal Processing of Speech (Macmillan New Electronics) , 1993 .
[17] W. H. Sumby,et al. Visual contribution to speech intelligibility in noise , 1954 .
[18] J L Schwartz,et al. Audio-visual enhancement of speech in noise. , 2001, The Journal of the Acoustical Society of America.
[19] L. Girin,et al. Fusion of auditory and visual information for noisy speech enhancement: a preliminary study of vowel transitions , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[20] Daniel Freedman,et al. Contour Tracking in Clutter: A Subset Approach , 2004, International Journal of Computer Vision.
[21] Zhihong Zeng,et al. Audio-visual affect recognition through multi-stream fused HMM for HCI , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[22] Conrad Sanderson,et al. Biometric Person Recognition: Face, Speech and Fusion , 2008 .
[23] Leslie S. Smith,et al. Robust sound onset detection using leaky integrate-and-fire neurons with depressing synapses , 2004, IEEE Transactions on Neural Networks.
[24] Christian Jutten,et al. Visual voice activity detection as a help for speech source separation from convolutive mixtures , 2007, Speech Commun..
[25] Benjamin Schrauwen,et al. An overview of reservoir computing: theory, applications and implementations , 2007, ESANN.
[26] Lotfi A. Zadeh,et al. Fuzzy Sets , 1965, Inf. Control..
[27] Ruth A Bentler,et al. Hearing-in-Noise: comparison of listeners with normal and (aided) impaired hearing. , 2004, Journal of the American Academy of Audiology.
[28] Geoffrey E. Hinton,et al. Learning representations by back-propagating errors , 1986, Nature.
[29] Christopher V. Alvino,et al. Geometric source separation: merging convolutive source separation with geometric beamforming , 2001, Neural Networks for Signal Processing XI: Proceedings of the 2001 IEEE Signal Processing Society Workshop (IEEE Cat. No.01TH8584).
[30] E. Oja,et al. Independent Component Analysis , 2001 .
[31] Christian Jutten,et al. Blind separation of sources, part I: An adaptive algorithm based on neuromimetic architecture , 1991, Signal Process..
[32] Jon Barker,et al. Energetic and Informational Masking Effects in an Audiovisual Speech Recognition System , 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[33] William K. Pratt,et al. Scene Adaptive Coder , 1984, IEEE Trans. Commun..
[34] Donald J. Schum,et al. Noise‐reduction circuitry in hearing aids: (2) Goals and current strategies , 2003 .
[35] Ioannis Pitas,et al. Rule-based face detection in frontal views , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[36] Michael Lindenbaum,et al. Sequential Karhunen-Loeve basis extraction and its application to images , 1998, Proceedings 1998 International Conference on Image Processing. ICIP98 (Cat. No.98CB36269).
[37] Jenq-Neng Hwang,et al. Lipreading from color video , 1997, IEEE Trans. Image Process..
[38] Aytekin Bagis,et al. Determining fuzzy membership functions with tabu search - an application to control , 2003, Fuzzy Sets Syst..
[39] Christopher G. Harris,et al. A Combined Corner and Edge Detector , 1988, Alvey Vision Conference.
[40] Christian Jutten,et al. Mixing Audiovisual Speech Processing and Blind Source Separation for the Extraction of Speech Signals From Convolutive Mixtures , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[41] Timothy F. Cootes,et al. Active Appearance Models , 2001, IEEE Trans. Pattern Anal. Mach. Intell..
[42] Paulo J. G. Lisboa,et al. The Use of Artificial Neural Networks in Decision Support in Cancer: a Systematic Review , 2005 .
[43] L. J. Griffiths,et al. An alternative approach to linearly constrained adaptive beamforming , 1982 .
[44] Chalapathy Neti,et al. Audio-visual speech enhancement with AVCDCN (audio-visual codebook dependent cepstral normalization) , 2002, Sensor Array and Multichannel Signal Processing Workshop Proceedings, 2002.
[45] Anna Esposito,et al. Designing a Fast Neuro-fuzzy System for Speech Noise Cancellation , 2000, MICAI.
[46] Juergen Luettin,et al. Visual speech recognition using active shape models and hidden Markov models , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[47] Kah Kay Sung,et al. Learning and example selection for object and pattern detection , 1995 .
[48] E. D. Adrian,et al. The Basis of Sensation , 1928, The Indian Medical Gazette.
[49] Ning Ma,et al. Recent advances in speech fragment decoding techniques , 2006, INTERSPEECH.
[50] Jiri Matas,et al. XM2VTSDB: The Extended M2VTS Database , 1999 .
[51] Jochen J. Steil,et al. Tutorial: Perspectives on Learning with RNNs , 2002 .
[52] S. Rosen. Temporal information in speech: acoustic, auditory and linguistic aspects. , 1992, Philosophical transactions of the Royal Society of London. Series B, Biological sciences.
[53] J. Friedman. Special Invited Paper-Additive logistic regression: A statistical view of boosting , 2000 .
[54] Mary T Cord,et al. Relationship between laboratory measures of directional advantage and everyday success with directional microphone hearing aids. , 2004, Journal of the American Academy of Audiology.
[55] King Chung,et al. Challenges and Recent Developments in Hearing Aids: Part I. Speech Understanding in Noise, Microphone Technologies and Noise Reduction Algorithms , 2004, Trends in amplification.
[56] Engin Avci,et al. Speech recognition using a wavelet packet adaptive network based fuzzy inference system , 2006, Expert Syst. Appl..
[57] Norbert Wiener,et al. Extrapolation, Interpolation, and Smoothing of Stationary Time Series, with Engineering Applications , 1949 .
[58] Andries P. Hekstra,et al. Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[59] Zoubin Ghahramani,et al. An Introduction to Hidden Markov Models and Bayesian Networks , 2001, Int. J. Pattern Recognit. Artif. Intell..
[60] Jung-Hsien Chiang,et al. Handwritten word recognition with character and inter-character neural networks , 1997, IEEE Trans. Syst. Man Cybern. Part B.
[61] Dibyendu Ghoshal,et al. Extraction of time invariant lips based on Morphological Operation and Corner Detection Method , 2012 .
[62] H. Lane,et al. The Lombard Sign and the Role of Hearing in Speech , 1971 .
[63] Yang Lu,et al. A geometric approach to spectral subtraction , 2008, Speech Commun..
[64] Jean-Philippe Thiran,et al. The BANCA Database and Evaluation Protocol , 2003, AVBPA.
[65] Simon Haykin,et al. Neural Networks: A Comprehensive Foundation , 1998 .
[66] Ehud Weinstein,et al. Signal enhancement using beamforming and nonstationarity with applications to speech , 2001, IEEE Trans. Signal Process..
[67] Abeer Alwan,et al. On the Relationship between Face Movements, Tongue Movements, and Speech Acoustics , 2002, EURASIP J. Adv. Signal Process..
[68] Thomas S. Huang,et al. Human face detection in a complex background , 1994, Pattern Recognit..
[69] R. E. Carlson,et al. Monotone Piecewise Cubic Interpolation , 1980 .
[70] Yi Hu,et al. Evaluation of objective measures for speech enhancement , 2006, INTERSPEECH.
[71] Mary T Cord,et al. Performance of directional microphone hearing aids in everyday life. , 2002, Journal of the American Academy of Audiology.
[72] Wolfgang Maass,et al. Networks of spiking neurons: the third generation of neural network models , 1997 .
[73] Hani Yehia,et al. Quantitative association of vocal-tract and facial behavior , 1998, Speech Commun..
[74] Francis Kuk,et al. Performance of a fully adaptive directional microphone to signals presented from various azimuths. , 2005, Journal of the American Academy of Audiology.
[75] Russell M. Mersereau,et al. Lip feature extraction towards an automatic speechreading system , 2000, Proceedings 2000 International Conference on Image Processing (Cat. No.00CH37101).
[76] Li Deng,et al. High-performance robust speech recognition using stereo training data , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[77] Stan Z. Li,et al. Jensen-Shannon boosting learning for object recognition , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[78] H. McGurk,et al. Hearing lips and seeing voices , 1976, Nature.
[79] Wolfgang Maass,et al. Movement Generation with Circuits of Spiking Neurons , 2005, Neural Computation.
[80] Christian Jutten,et al. Developing an audio-visual speech source separation algorithm , 2004, Speech Commun..
[81] Shu Hung Leung,et al. Automatic lip contour extraction from color images , 2004, Pattern Recognit..
[82] Giridharan Iyengar,et al. Robust detection of visual ROI for automatic speechreading , 2001, 2001 IEEE Fourth Workshop on Multimedia Signal Processing (Cat. No.01TH8564).
[83] Harry Shum,et al. Statistical Learning of Multi-view Face Detection , 2002, ECCV.
[84] H. Hotelling. Relations Between Two Sets of Variates , 1936 .
[85] Raghu Krishnapuram,et al. A robust approach to image enhancement based on fuzzy logic , 1997, IEEE Trans. Image Process..
[86] Jon Barker,et al. Audio-visual speech fragment decoding , 2007, AVSP.
[87] Saeed Bagheri Shouraki,et al. Recognition of human speech phonemes using a novel fuzzy approach , 2007, Appl. Soft Comput..
[88] W. Dreschler,et al. Clinical evaluation of a full-digital in-the-ear hearing instrument. , 1999, Audiology : official organ of the International Society of Audiology.
[89] Gunnar Rätsch,et al. An Introduction to Boosting and Leveraging , 2002, Machine Learning Summer School.
[90] Yi Hu,et al. Evaluation of Objective Quality Measures for Speech Enhancement , 2008, IEEE Transactions on Audio, Speech, and Language Processing.
[91] Maurice Milgram,et al. Semi Adaptive Appearance Models for lip tracking , 2009, 2009 16th IEEE International Conference on Image Processing (ICIP).
[92] Shuichi Sakamoto,et al. A two‐stage binaural speech enhancement approach for hearing aids with preserving binaural benefits in noisy environments , 2008 .
[93] Amir Hussain,et al. A novel multiple-controller incorporating a radial basis function neural network based generalized learning model , 2006, Neurocomputing.
[94] Alan L. Yuille,et al. Feature extraction from faces using deformable templates , 1992, International Journal of Computer Vision.
[95] Allen R. Tannenbaum,et al. Localizing Region-Based Active Contours , 2008, IEEE Transactions on Image Processing.
[96] T. Houtgast,et al. A review of the MTF concept in room acoustics and its use for estimating speech intelligibility in auditoria , 1985 .
[97] A. Murat Tekalp,et al. Lip feature extraction based on audio-visual correlation , 2005, 2005 13th European Signal Processing Conference.
[98] John R. Hershey,et al. Audio-Visual Sound Separation Via Hidden Markov Models , 2001, NIPS.
[99] J.N. Gowdy,et al. CUAVE: A new audio-visual database for multimodal human-computer interface research , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[100] Miao Yu,et al. A Multimodal Approach to Blind Source Separation of Moving Sources , 2010, IEEE Journal of Selected Topics in Signal Processing.
[101] Kazuo Tanaka,et al. Switching control of an R/C hovercraft: stabilization and smooth switching , 2001, IEEE Trans. Syst. Man Cybern. Part B.
[102] Junfeng Li,et al. Two-stage binaural speech enhancement with Wiener filter for high-quality speech communication , 2011, Speech Commun..
[103] Fabien Ringeval,et al. Maximising Audiovisual Correlation with Automatic Lip Tracking and Vowel Based Segmentation , 2009, COST 2101/2102 Conference.
[104] Sumit Kumar,et al. IMPROVED HYBRID MODEL OF HMM/GMM FOR SPEECH RECOGNITION , 2008 .
[105] Yuting Su,et al. Robust Sea-Sky-Line Detection Based on Horizontal Projection and Hough Transformation , 2009, 2009 2nd International Congress on Image and Signal Processing.
[106] Günther Palm,et al. Spotting laughter in natural multiparty conversations: A comparison of automatic online and offline approaches using audiovisual data , 2012, TIIS.
[107] W. T. Nelson,et al. A speech corpus for multitalker communications research. , 2000, The Journal of the Acoustical Society of America.
[108] Zhengyou Zhang,et al. A Survey of Recent Advances in Face Detection , 2010 .
[109] Albert S. Bregman,et al. Auditory scene analysis : hearing in complex environments , 1993 .
[110] Paula P. Henry,et al. Evaluation of an adaptive, directional-microphone hearing aid: Evaluación de un auxiliar auditivo de micrófono direccional adaptable , 2002, International journal of audiology.
[111] Henry Markram,et al. Real-Time Computing Without Stable States: A New Framework for Neural Computation Based on Perturbations , 2002, Neural Computation.
[112] E. C. Cherry. Some Experiments on the Recognition of Speech, with One and with Two Ears , 1953 .
[113] C. Jutten,et al. Using a Visual Voice Activity Detector to Regularize the Permutations in Blind Separation of Convolutive Speech Mixtures , 2007, 2007 15th International Conference on Digital Signal Processing.
[114] Chalapathy Neti,et al. Joint audio-visual speech processing for recognition and enhancement , 2003, AVSP.
[115] Richard M. Stern,et al. Environmental robustness in automatic speech recognition , 1990, International Conference on Acoustics, Speech, and Signal Processing.
[116] Franklin C. Crow,et al. Summed-area tables for texture mapping , 1984, SIGGRAPH.
[117] Narendra Ahuja,et al. Detecting Faces in Images: A Survey , 2002, IEEE Trans. Pattern Anal. Mach. Intell..
[118] Jon Barker,et al. An audio-visual corpus for speech perception and automatic speech recognition. , 2006, The Journal of the Acoustical Society of America.
[119] Alice Caplier,et al. New color transformation for lips segmentation , 2001, 2001 IEEE Fourth Workshop on Multimedia Signal Processing (Cat. No.01TH8564).
[120] Daniel P. W. Ellis,et al. Decoding speech in the presence of other sources , 2005, Speech Commun..
[121] Rainer Lienhart,et al. An extended set of Haar-like features for rapid object detection , 2002, Proceedings. International Conference on Image Processing.
[122] Jon Barker,et al. Estimation of speech acoustics from visual speech features: A comparison of linear and non-linear models , 1999, AVSP.
[123] John H. L. Hansen,et al. An effective quality evaluation protocol for speech enhancement algorithms , 1998, ICSLP.
[124] Aude Billard,et al. On Learning, Representing, and Generalizing a Task in a Humanoid Robot , 2007, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[125] Ronald W. Schafer,et al. Digital Processing of Speech Signals , 1978 .
[126] R. Zelinski,et al. A microphone array with adaptive post-filtering for noise reduction in reverberant rooms , 1988, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.
[127] Alan Wee-Chung Liew,et al. Lip contour extraction from color images using a deformable model , 2002, Pattern Recognit..
[128] Ben P. Milner,et al. Maximising audio-visual speech correlation , 2007, AVSP.
[129] Amir Hussain,et al. Intelligibility improvements using binaural diverse sub-band processing applied to speech corrupted with automobile noise , 2001 .
[130] Albert S. Bregman,et al. The Auditory Scene. (Book Reviews: Auditory Scene Analysis. The Perceptual Organization of Sound.) , 1990 .
[131] Ming Liu,et al. AVICAR: audio-visual speech corpus in a car environment , 2004, INTERSPEECH.
[132] James M. Rehg,et al. On the Design of Cascades of Boosted Ensembles for Face Detection , 2008, International Journal of Computer Vision.