
Automatic Detection of Speaker Attributes Based on Utterance Text

In this paper, we present models for detecting various attributes of a speaker based on uttered text alone. These attributes include whether the speaker is speaking his/her native language, the speaker’s age and gender, and the regional information reported by the speakers. We explore various lexical features as well as features inspired by Linguistic Inquiry and Word Count (LIWC) and the Dictionary of Affect in Language (DAL). Overall, the results suggest that when audio data is not available, we can build high-quality statistical models to detect these speaker attributes by exploring effective feature sets drawn only from uttered text and by combining multiple classification algorithms, with performance comparable to systems that can exploit the audio data.

Wen Wang | Andreas Kathol | Harry Bratt
