Formant shifting for speech intelligibility improvement in car noise environment

In this paper, we propose a novel approach to improving the intelligibility of speech in in-car applications. Speech produced in noisy environments is subject to the Lombard effect, which comprises a number of voice transformations relative to speech produced in quiet environments. To improve the intelligibility of in-car speech (radio, message alerts, ...), we propose to modify the original speech signal by incorporating one of the important Lombard effects, namely the shift of the lower formant center frequencies away from the competing noise regions. The proposed approach relies on traditional Linear Prediction analysis and overlap-add synthesis. We explore several modification strategies and evaluate the merit of each using both objective and subjective tests. In particular, we show that speech intelligibility in car noise improves significantly for a majority of listeners.
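The processing chain named in the abstract (Linear Prediction analysis, modification of the lower formants, overlap-add synthesis) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no parameters, so the LP order, frame length, shift amount, frequency limits and the rule "move every LP pole pair below `max_hz` upward by a fixed `shift_hz`" are all assumptions chosen for illustration, as are the function names. Shifting upward is itself an assumption, motivated by car noise being concentrated at low frequencies.

```python
import numpy as np

def lpc(frame, order):
    """Autocorrelation-method LP coefficients a[0..order], with a[0] = 1."""
    n = len(frame)
    r = np.correlate(frame, frame, mode="full")[n - 1:n + order].copy()
    r[0] = r[0] * (1.0 + 1e-9) + 1e-12  # small regularisation (assumption)
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.concatenate(([1.0], np.linalg.solve(R, -r[1:order + 1])))

def shift_low_formants(a, fs, shift_hz, max_hz, min_hz=50.0):
    """Rotate LP pole pairs lying in (min_hz, max_hz) by shift_hz.

    Hypothetical modification rule: pole radii (bandwidths) are kept,
    only pole angles (center frequencies) change.
    """
    poles = np.roots(a)
    freqs = np.angle(poles) * fs / (2 * np.pi)
    target = (np.abs(freqs) > min_hz) & (np.abs(freqs) < max_hz)
    delta = np.sign(freqs) * (2 * np.pi * shift_hz / fs) * target
    new_poles = np.abs(poles) * np.exp(1j * (np.angle(poles) + delta))
    # Conjugate pairs move symmetrically, so the polynomial stays real.
    return np.real(np.poly(new_poles))

def allpole(e, a):
    """Synthesis filter 1/A(z) applied to e, starting from zero state."""
    y = np.zeros(len(e))
    p = len(a) - 1
    for n in range(len(e)):
        k = min(p, n)
        y[n] = e[n] - np.dot(a[1:k + 1], y[n - k:n][::-1])
    return y

def shift_formants_ola(x, fs, shift_hz=100.0, max_hz=1500.0,
                       order=12, frame_len=256):
    """Frame-wise LP analysis, pole shifting and overlap-add resynthesis."""
    hop = frame_len // 2
    # Periodic Hann window: copies at 50% overlap sum to one.
    win = np.hanning(frame_len + 1)[:frame_len]
    out = np.zeros(len(x))
    for start in range(0, len(x) - frame_len + 1, hop):
        frame = x[start:start + frame_len] * win
        a = lpc(frame, order)
        residual = np.convolve(frame, a)[:frame_len]  # inverse filter A(z)
        a_new = shift_low_formants(a, fs, shift_hz, max_hz)
        out[start:start + frame_len] += allpole(residual, a_new)
    return out
```

With `shift_hz=0.0` the synthesis filter equals the analysis filter, so the chain returns the input unchanged wherever two windows overlap, which is a convenient sanity check. Resetting the filter state in every frame is a simplification of this sketch; a practical system would also need to keep the shifted formants ordered and away from one another.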
