Investigating speech features and automatic measurement of cognitive load

The ability to measure cognitive load in real time is extremely useful for improving the efficiency of interfaces and content delivery, especially as interfaces and content grow more complex in multimedia environments. Speech is well suited to measuring cognitive load because it is non-intrusive and easy to collect. In this paper, we investigated the patterns of prosodic features and confirmed that they are relevant to cognitive load. We also explored several classification techniques to capture those patterns. Gaussian Mixture Model (GMM), Support Vector Machine (SVM), and hybrid SVM-GMM classifiers were investigated with MFCC and pitch features. Individual systems and a fusion-based system were evaluated on two task scenarios: reading comprehension and the Stroop test. The SVM-GMM system achieved the highest performance on both tasks, raising three-level classification accuracy to 75.6% and 82.2%, respectively.
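The abstract does not say how the fusion-based system combines the individual classifiers. As a minimal sketch of one common option, assuming score-level fusion by a weighted sum of z-normalized scores, the following illustrates the idea. The function names, weights, level labels, and score values are all hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev

# Three load levels, matching the three-level task in the abstract.
# The label names themselves are an assumption.
LEVELS = ("low", "medium", "high")


def z_normalize(scores):
    """Scale one system's per-level scores to zero mean and unit variance."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # All scores equal: the system expresses no preference.
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


def fuse(system_scores, weights):
    """Weighted sum of normalized scores; returns (predicted level, fused scores)."""
    if len(system_scores) != len(weights):
        raise ValueError("one weight per system is required")
    fused = [0.0] * len(LEVELS)
    for scores, w in zip(system_scores, weights):
        for i, s in enumerate(z_normalize(scores)):
            fused[i] += w * s
    best = max(range(len(LEVELS)), key=lambda i: fused[i])
    return LEVELS[best], fused


# Hypothetical per-level scores for one utterance (illustrative values only).
gmm_scores = [-42.0, -40.5, -41.0]  # e.g. log-likelihoods from a GMM system
svm_scores = [-0.3, 0.1, 0.9]       # e.g. decision values from an SVM system
label, fused = fuse([gmm_scores, svm_scores], weights=[0.5, 0.5])
print(label, [round(f, 3) for f in fused])
```

Normalizing before summing matters because GMM log-likelihoods and SVM decision values live on different scales; without it, the system with the larger numeric range would dominate the fused decision.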
