Paper Information - Personalized Task Load Prediction in Speech Communication

Personalized Task Load Prediction in Speech Communication

Estimating the quality of remote speech communication is a complex task influenced by the speaker, transmission channel, and listener. For example, degraded transmission quality can increase listeners' cognitive load, which can in turn influence the overall perceived quality of the conversation. This paper presents a framework that isolates quality-dependent changes and controls for most external influencing factors, such as personal preference, in a simulated conversational environment. The statistical analysis finds significant relationships between stimulus quality and the listener's valence and personality (agreeableness and openness) and, similarly, between the perceived task load during the listening task and the listener's personality and frustration intolerance. The machine learning model for task load prediction improves the correlation coefficient from 0.48 to 0.76 when listeners' individuality is considered. The proposed evaluation framework and results pave the way for personalized audio quality assessment that includes speakers' and listeners' individuality beyond conventional channel modeling.

Karl El Hajal | M. Cernak | Sebastian Möller | R. Spang
