Improving Quality of Service in Baseband Speech Communication

Speech is the most important communication modality for human interaction. Automatic speech recognition and speech synthesis have further extended the relevance of speech to man-machine interaction ...
