High-level speech event analysis for cognitive load classification

Cognitive load (CL) refers to the load imposed on an individual’s cognitive system when performing a given task, and is usually associated with the limitations of human working memory. Cognitive overload, which occurs when too much information has to be processed, induces stress, fatigue, reduced decision-making ability and perceptual narrowing. Like many physiological measures, speech features have been investigated as a nonintrusive way to find reliable indicators of CL levels. In this paper, we investigate high-level speech events detected automatically with the CMU-Sphinx speech recognition toolkit. Temporal events (speech onset latency, event start timecodes, pause and phone segments) were extracted from the speech transcriptions (phoneme, word, silent pause, filled pause, breathing). Seven audio feature sets related to these speech events were designed and assessed. Three-class SVM classifiers (Low, Medium and High levels) were developed and evaluated on the CLSE (Cognitive Load with Speech and EGG) database provided for the Interspeech 2014 Cognitive Load Sub-Challenge. These experiments show an improvement of 1.5 % in Unweighted Average Recall (UAR) on the Test set over the official baseline.
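Since results are reported as Unweighted Average Recall, it may help to recall how that metric is defined: the recall of each class is computed separately and the per-class values are averaged with equal weight, so a majority class cannot dominate the score. The sketch below is a minimal illustration in Python; the function name and the toy labels are assumptions made for this example and are not taken from the paper or the challenge data.

```python
from collections import Counter


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; every class counts equally, whatever its size."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Number of reference instances per class.
    totals = Counter(y_true)
    # Number of correctly predicted instances per class.
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    # Recall per class, then a plain (unweighted) mean over classes.
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Toy, made-up labels for three load levels (not real challenge data).
y_true = ["low"] * 6 + ["medium"] * 2 + ["high"] * 2
y_pred = ["low"] * 6 + ["medium", "low"] + ["low", "low"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.2f}, accuracy = {accuracy:.2f}")
```

In this toy case the classifier favours the majority class, so plain accuracy looks better than UAR; this is the reason an unweighted measure is preferred when class sizes are uneven.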
