Subjective voice quality evaluation of artificial bandwidth extension: comparing different audio bandwidths and speech codecs

Artificial bandwidth extension (ABE) methods have been developed to improve the quality and intelligibility of telephone speech. In many previous studies, however, the evaluation of ABE has not fully reflected its use in mobile communication (e.g., evaluation with clean speech without coding). In this study, the subjective quality of ABE was evaluated with absolute category rating (ACR) tests involving both clean and noisy speech, two cutoff frequencies of highpass filtering, and input encoded at different bit rates. Three ABE methods were evaluated: two for narrowband-to-wideband extension and one for wideband-to-superwideband extension. Several speech codecs with different audio bandwidths were included in the tests. Narrowband-to-wideband ABE methods were found to significantly improve speech quality when no background noise was present, and the mean quality scores increased slightly but not significantly for noisy speech. Wideband-to-superwideband ABE also showed significant improvement in certain conditions with no background noise. ABE did not cause a significant decrease in the mean scores in any of the tests.
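As an illustration of how ACR ratings are typically summarized, the sketch below computes a mean opinion score and an approximate 95% confidence interval for two test conditions. It is a minimal example under stated assumptions, not the analysis used in this study: the function name `mos_with_ci`, the 1 to 5 rating scale, the normal-approximation interval, and the rating values are all hypothetical.

```python
import math
from statistics import mean, stdev


def mos_with_ci(ratings, z=1.96):
    """Return (mean score, CI half-width) for ACR ratings on a 1-5 scale.

    Assumption: a normal-approximation interval with z = 1.96 (about 95%).
    The study's actual significance testing is not described in the abstract.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("ACR ratings must be integers from 1 to 5")
    half_width = z * stdev(ratings) / math.sqrt(len(ratings))
    return mean(ratings), half_width


# Hypothetical listener ratings for one narrowband condition,
# without and with ABE; these are not data from the study.
narrowband = [3, 3, 2, 4, 3, 3, 2, 3]
with_abe = [4, 3, 4, 4, 3, 4, 3, 4]

for label, ratings in (("narrowband", narrowband), ("with ABE", with_abe)):
    score, half = mos_with_ci(ratings)
    print(f"{label}: mean score {score:.2f} +/- {half:.2f}")
```

Non-overlapping intervals suggest a difference between conditions, but they are only a rough guide; a formal comparison would use a test suited to the listening-test design.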
