A subjective listening test of six different artificial bandwidth extension approaches in English, Chinese, German, and Korean

Studies on artificial bandwidth extension (ABE) lack internationally coordinated subjective tests that compare multiple methods across multiple languages. Here we present the design of absolute category rating listening tests evaluating 12 ABE variants of six approaches in four languages: American English, Chinese, German, and Korean. Because evaluating all 12 variants in a single session would have exceeded the recommended test length, the variants were distributed across two separate listening tests per language. The paper focuses on the listening test design, which aimed at merging the subjective scores of both tests and thus allows a joint analysis of all ABE variants under test. A language-dependent analysis, evaluating the ABE variants relative to the underlying coded narrowband speech condition, showed statistically significant improvement in English, German, and Korean for some ABE solutions.
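As an illustration of what a joint analysis of two absolute category rating tests can look like, the following minimal sketch pools the ratings of both tests per condition and computes a mean opinion score (MOS) with an approximate 95% confidence interval. It is not the merging procedure used in the paper, which the abstract does not specify. The condition names, the ratings, and the helper functions `pool_tests` and `mos_with_ci` are hypothetical, and the sketch assumes that both tests use the same 1 to 5 scale and share at least one condition (here the coded narrowband condition "NB").

```python
from math import sqrt
from statistics import mean, stdev


def mos_with_ci(ratings, z=1.96):
    """Return (MOS, half-width of an approximate 95% confidence interval).

    Uses a normal approximation; this is an assumption of the sketch.
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two ratings per condition")
    return mean(ratings), z * stdev(ratings) / sqrt(n)


def pool_tests(test_a, test_b):
    """Merge two dicts that map a condition name to a list of ACR ratings."""
    pooled = {}
    for test in (test_a, test_b):
        for condition, ratings in test.items():
            pooled.setdefault(condition, []).extend(ratings)
    return pooled


# Hypothetical ratings; "NB" appears in both tests, each ABE variant in one.
test_a = {"NB": [2, 3, 2, 3], "ABE-1": [3, 4, 3, 4]}
test_b = {"NB": [3, 2, 3, 2], "ABE-2": [4, 4, 3, 3]}

pooled = pool_tests(test_a, test_b)
results = {cond: mos_with_ci(r) for cond, r in pooled.items()}

for cond, (mos, ci) in sorted(results.items()):
    print(f"{cond}: MOS = {mos:.2f} +/- {ci:.2f}")
```

Simple pooling of this kind is only defensible if the shared conditions receive comparable scores in both tests; otherwise some form of score alignment between the tests would be needed before a joint analysis.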
