Automatic Personality Estimation

This chapter describes how personality cues can be estimated from speech using an automated system. To enable a machine to arrive at a decision or estimate about a speaker's personality, a comprehensive processing chain needs to be developed. First, the recordings are preprocessed and segmented into meaningful chunks. Next, promising acoustic and prosodic cues are extracted from the signal. Because which feature candidates turn out to be 'promising' can be expected to vary between traits, this work incorporates a generic, fully data-driven feature selection scheme, namely the information gain ratio (IGR) ranking algorithm. Using the selected features, discriminative models are trained for classification and regression tasks with support vector machines. The whole learning scheme is evaluated by cross-validation using an appropriate evaluation metric. Results are finally reported for three data subsets, by classification and prediction success as well as by individual trait scores, by means of confusion matrices, iterative accuracy plots, and charts giving insight into the composition of the feature space.
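To make the processing chain concrete, the following is a minimal, hypothetical sketch of its machine-learning stages: ranking candidate features by information gain ratio on discretized values, keeping the top-ranked features, and evaluating an SVM classifier by cross-validation. It assumes the acoustic/prosodic feature extraction has already been done (e.g. with Praat or a similar toolkit) and stands in for it with a plain feature matrix; the equal-width binning, RBF kernel, and parameter values are illustrative assumptions, not the configuration used in this work.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score


def entropy(labels):
    """Shannon entropy (in bits) of a discrete label vector."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))


def information_gain_ratio(feature, labels, n_bins=10):
    """IGR of one continuous feature with respect to the class labels.

    The feature is discretized into equal-width bins; the information gain
    is normalized by the intrinsic (split) information of that binning.
    """
    bins = np.digitize(feature, np.histogram_bin_edges(feature, bins=n_bins))
    h_y = entropy(labels)
    # conditional entropy H(Y | binned feature)
    h_y_given_x = 0.0
    for b in np.unique(bins):
        mask = bins == b
        h_y_given_x += mask.mean() * entropy(labels[mask])
    split_info = entropy(bins)
    gain = h_y - h_y_given_x
    return gain / split_info if split_info > 0 else 0.0


def rank_features(X, y, n_bins=10):
    """Return feature indices sorted by descending IGR, plus the scores."""
    scores = np.array([information_gain_ratio(X[:, j], y, n_bins)
                       for j in range(X.shape[1])])
    return np.argsort(scores)[::-1], scores


# Illustrative usage on random data standing in for extracted speech features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))        # 200 utterances, 50 candidate features
y = rng.integers(0, 2, size=200)      # binary trait label, e.g. high/low on one trait

order, scores = rank_features(X, y)
top_k = order[:10]                    # keep the 10 most promising features

svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
acc = cross_val_score(svm, X[:, top_k], y, cv=10, scoring="accuracy")
print(f"10-fold CV accuracy with top-10 IGR features: {acc.mean():.3f}")
```

Note that in a careful evaluation the IGR ranking would be repeated inside each cross-validation fold rather than computed once on the full data, so that feature selection does not leak information into the test folds.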
