A Survey on Techniques for Enhancing Speech

Speech enhancement is used in almost all modern communication systems. When speech is transmitted, its quality may degrade because of interference in the environment it passes through. Interference that can affect speech quality in transit includes additive acoustic noise, acoustic reverberation, and white Gaussian noise. This paper focuses on techniques from the literature for enhancing the speech signal. The methods covered include the Wiener filter, statistical methods, the subspace method, and spectral subtraction, including its basic form. The paper discusses these methods along with their advantages and disadvantages. It also reviews studies by other researchers on machine learning techniques, such as neural networks, deep neural networks, and convolutional neural networks, and on optimization techniques that have been used for speech enhancement.
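As a rough illustration of the simplest method named above, the following is a minimal sketch of basic magnitude spectral subtraction. It is not taken from the paper: the function name `spectral_subtraction`, the non-overlapping rectangular frames, the `frame_len` and `floor` parameters, and the availability of a separate noise-only recording are all assumptions made for the example.

```python
import numpy as np

def spectral_subtraction(noisy, noise_only, frame_len=256, floor=0.01):
    """Basic magnitude spectral subtraction on non-overlapping frames.

    Illustrative sketch only: practical systems normally use overlapping,
    windowed frames and track the noise estimate over time.
    """
    # Average noise magnitude spectrum, estimated from a noise-only segment.
    n_noise = len(noise_only) // frame_len
    noise_frames = noise_only[: n_noise * frame_len].reshape(n_noise, frame_len)
    noise_mag = np.abs(np.fft.rfft(noise_frames, axis=1)).mean(axis=0)

    # Split the noisy signal into frames and move to the frequency domain.
    n_frames = len(noisy) // frame_len
    frames = noisy[: n_frames * frame_len].reshape(n_frames, frame_len)
    spec = np.fft.rfft(frames, axis=1)
    mag, phase = np.abs(spec), np.angle(spec)

    # Subtract the noise estimate; keep a small spectral floor so that
    # no magnitude becomes negative.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)

    # Resynthesize each frame using the noisy phase.
    enhanced = np.fft.irfft(clean_mag * np.exp(1j * phase), n=frame_len, axis=1)
    return enhanced.reshape(-1)
```

Reusing the noisy phase is the usual simplification in this family of methods; the spectral floor is one common way to limit the isolated residual peaks often described as musical noise.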
