Combining Acoustic-Prosodic, Lexical, and Phonotactic Features for Automatic Deception Detection

Improving methods of automatic deception detection is an important goal of many researchers from a variety of disciplines, including psychology, computational linguistics, and criminology. We present a system to automatically identify deceptive utterances using acoustic-prosodic, lexical, syntactic, and phonotactic features. We train and test our system on the Interspeech 2016 ComParE challenge corpus, and find that our combined features result in performance well above the challenge baseline on the development data. We also perform feature ranking experiments to evaluate the usefulness of each of our feature sets. Finally, we conduct a cross-corpus evaluation by training on another deception corpus and testing on the ComParE corpus.
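As a rough illustration of combining several feature sets and then ranking them, here is a minimal sketch. It assumes early fusion by concatenating per-utterance feature vectors, and it ranks features by a simple standardized mean difference between deceptive and truthful utterances. The abstract does not say which ranking method or classifier the system uses, so the function names, feature names, and toy values below are all hypothetical.

```python
from statistics import mean, pstdev


def combine_feature_sets(feature_sets):
    """Concatenate several named feature sets into one matrix (early fusion).

    feature_sets maps a set name to {feature name: list of per-utterance values}.
    Returns (column names, rows), with one row per utterance.
    """
    names, columns = [], []
    for set_name, table in feature_sets.items():
        for feat_name, values in table.items():
            names.append(f"{set_name}:{feat_name}")
            columns.append(values)
    rows = [list(row) for row in zip(*columns)]  # transpose to utterance rows
    return names, rows


def rank_features(names, rows, labels):
    """Rank features by |mean(deceptive) - mean(truthful)| / overall std.

    This score is an illustrative stand-in, not the paper's ranking method.
    labels: 1 = deceptive, 0 = truthful.
    """
    scored = []
    for j, name in enumerate(names):
        column = [row[j] for row in rows]
        deceptive = [v for v, y in zip(column, labels) if y == 1]
        truthful = [v for v, y in zip(column, labels) if y == 0]
        spread = pstdev(column) or 1.0  # avoid dividing by zero on constant features
        scored.append((abs(mean(deceptive) - mean(truthful)) / spread, name))
    return sorted(scored, reverse=True)


# Toy data (hypothetical values): four utterances, three feature sets.
feature_sets = {
    "acoustic": {"f0_mean": [210.0, 190.0, 220.0, 185.0]},
    "lexical": {"filler_count": [1.0, 2.0, 1.0, 2.0]},
    "phonotactic": {"lm_score": [0.5, 0.4, 0.4, 0.5]},
}
labels = [1, 0, 1, 0]

names, rows = combine_feature_sets(feature_sets)
ranking = rank_features(names, rows, labels)
for score, name in ranking:
    print(f"{name}\t{score:.3f}")
```

For the cross-corpus setting mentioned in the abstract, the same column layout would have to be extracted from both corpora, so that a model trained on one corpus can be applied to the other.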
