Effects of time windowing for extraction of expression from Japanese speech: Higuchi's fractal dimension

Irregularity and complexity in time-series data can be quantified by the fractal dimension (FD), which conveys useful information and correlates strongly with physiological signals such as speech. The time-dependent FD (TDFD) has proven to be an effective method in various physiological studies. However, the TDFD result depends on the time windowing, i.e., the window length and type, which can yield different features. Therefore, in this paper, the effects of time windowing on the TDFD were investigated for window lengths spanning 2^8 to 2^12 points and four window functions. The FD was computed using Higuchi's method. In the experiments, speech data were recorded from 15 Japanese utterances spoken with 4 different ways of expression (accosting, wholehearted, normal, and uninterested). The obtained results are useful for selecting the time windowing of the TDFD method so that different intonations can be extracted effectively.
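As an illustration of the approach described above, the following Python sketch computes Higuchi's fractal dimension and applies it frame by frame to obtain a time-dependent FD. The frame length, hop size, sampling assumptions, and the use of a Hamming taper are illustrative choices, not the paper's exact configuration.

```python
import numpy as np

def higuchi_fd(x, k_max=8):
    """Estimate Higuchi's fractal dimension of a 1-D signal x."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    ks = np.arange(1, k_max + 1)
    lengths = np.empty(len(ks))
    for i, k in enumerate(ks):
        lmk = []
        for m in range(k):
            idx = np.arange(m, n, k)               # subsampled indices m, m+k, m+2k, ...
            if len(idx) < 2:
                continue
            diff_sum = np.abs(np.diff(x[idx])).sum()
            norm = (n - 1) / ((len(idx) - 1) * k)  # Higuchi's normalization factor
            lmk.append(diff_sum * norm / k)
        lengths[i] = np.mean(lmk)
    # FD is the slope of log L(k) versus log(1/k)
    slope, _ = np.polyfit(np.log(1.0 / ks), np.log(lengths), 1)
    return slope

def tdfd(signal, frame_len=2**10, hop=2**9, window=np.hamming):
    """Time-dependent FD: Higuchi FD of successive tapered frames (illustrative)."""
    taper = window(frame_len)
    fds = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * taper
        fds.append(higuchi_fd(frame))
    return np.array(fds)
```

At a 16 kHz sampling rate (an assumption here), a 2^10-point frame corresponds to 64 ms; sweeping frame_len over 2^8 to 2^12 points and substituting np.hanning, np.blackman, or np.bartlett for the window argument gives the kind of window-length and window-function comparison the abstract describes.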
