Glottal features for speech-based cognitive load classification

Cognitive load measurement is important when designing adaptive interfaces that optimize the performance of users working on tasks with a high mental load. Recent research on automatic speech-based measurement systems indicates that cognitive load information is more prominent in the frequency region below 1 kHz. This study investigates the effects of cognitive load on three glottal parameters (open quotient, normalized amplitude quotient and speed quotient) and proposes a system that employs these parameters as features for cognitive load classification. Analysis of the glottal parameter distributions suggests that an increase in cognitive load is associated with a creakier voice quality. Additionally, three-class classification results show that score-level fusion of a system based on the glottal features with a system based on the baseline features (MFCCs, pitch, intensity and shifted delta cepstra) improves accuracy from the 79% baseline to 84%.
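The abstract does not state how the three glottal parameters are computed, so the following is only a minimal sketch using their common time-domain definitions: open quotient (OQ) as open-phase duration over the pitch period, speed quotient (SQ) as opening-phase duration over closing-phase duration, and normalized amplitude quotient (NAQ) as the peak-to-peak flow amplitude divided by the product of the magnitude of the flow derivative's negative peak and the pitch period. The function names, the synthetic triangular pulse and the hand-supplied timing instants are assumptions made for illustration; in practice the glottal flow would be estimated from speech by inverse filtering and the instants detected automatically.

```python
import numpy as np


def glottal_parameters(flow, fs, t_open, t_peak, t_close):
    """Compute OQ, SQ and NAQ for ONE glottal cycle.

    flow    : glottal flow samples covering exactly one pitch period
    fs      : sampling rate in Hz
    t_open  : glottal opening instant, seconds from cycle start
    t_peak  : instant of maximum flow, seconds from cycle start
    t_close : glottal closure instant, seconds from cycle start
    (All argument names are hypothetical, chosen for this sketch.)
    """
    flow = np.asarray(flow, dtype=float)
    period = len(flow) / fs

    # Open quotient: fraction of the period during which the glottis is open.
    oq = (t_close - t_open) / period

    # Speed quotient: opening-phase duration over closing-phase duration.
    sq = (t_peak - t_open) / (t_close - t_peak)

    # Normalized amplitude quotient: f_ac / (d_min * T), where f_ac is the
    # peak-to-peak flow and d_min is the magnitude of the most negative
    # value of the flow derivative (approximated by a first difference).
    f_ac = flow.max() - flow.min()
    d_min = -np.min(np.diff(flow)) * fs
    naq = f_ac / (d_min * period)

    return {"OQ": oq, "SQ": sq, "NAQ": naq}


def triangular_pulse(n_closed=40, n_opening=40, n_closing=20):
    """Synthetic one-cycle pulse (an assumption, not real speech data)."""
    closed = np.zeros(n_closed)
    opening = np.arange(n_opening) / n_opening          # rises from 0 towards 1
    closing = 1.0 - np.arange(n_closing) / n_closing    # falls from 1 towards 0
    return np.concatenate([closed, opening, closing])


fs = 10_000  # Hz, so the 100-sample pulse lasts 10 ms
pulse = triangular_pulse()
params = glottal_parameters(pulse, fs, t_open=0.004, t_peak=0.008, t_close=0.010)
print(params)
```

For this idealized pulse the NAQ reduces to the closing-phase duration divided by the period, which is a convenient sanity check. Shorter open phases and more abrupt closures lower OQ and NAQ, which is the direction usually associated with pressed or creaky phonation.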
