Computational Linguistic Models of Deceptive Opinion Spam

[1]  P. Ekman,et al.  Nonverbal leakage and clues to deception. , 1969, Psychiatry.

[2]  M. Spence Job Market Signaling , 1973 .

[3]  J. R. Landis,et al.  The measurement of observer agreement for categorical data. , 1977, Biometrics.

[4]  D. O. Sears College sophomores in the laboratory: Influences of a narrow data base on social psychology's view of human nature. , 1986 .

[5]  Donal E. Carlston,et al.  Negativity and extremity biases in impression formation: A review of explanations. , 1989 .

[6]  L. Joseph,et al.  Bayesian estimation of disease prevalence and the parameters of diagnostic tests in the absence of a gold standard. , 1995, American journal of epidemiology.

[7]  Stephen Porter,et al.  The language of deceit: An investigation of the verbal clues to deception in the interrogation context , 1996 .

[8]  P. Ekman,et al.  The ability to detect deceit generalizes across different types of high-stake lies. , 1997, Journal of personality and social psychology.

[9]  Andrei Z. Broder,et al.  On the resemblance and containment of documents , 1997, Proceedings. Compression and Complexity of SEQUENCES 1997.

[10]  Marcia K. Johnson,et al.  False memories and confabulation , 1998, Trends in Cognitive Sciences.

[11]  Thorsten Joachims,et al.  Text Categorization with Support Vector Machines: Learning with Many Relevant Features , 1998, ECML.

[12]  Mark A. deTurck,et al.  The Behavioral Correlates of Sanctioned and Unsanctioned Deceptive Communication , 1998 .

[13]  Thorsten Joachims,et al.  Making large-scale support vector machine learning practical , 1999 .

[14]  Harris Drucker,et al.  Support vector machines for spam categorization , 1999, IEEE Trans. Neural Networks.

[15]  Stanley F. Chen,et al.  An Empirical Study of Smoothing Techniques for Language Modeling , 1996, ACL.

[16]  Jason D. M. Rennie,et al.  Improving Multiclass Text Classification with the Support Vector Machine , 2001 .

[17]  Michael I. Jordan,et al.  On Discriminative vs. Generative Classifiers: A comparison of logistic regression and naive Bayes , 2001, NIPS.

[18]  W O Johnson,et al.  Screening without a "gold standard": the Hui-Walter paradigm revisited. , 2001, American journal of epidemiology.

[19]  Andreas Stolcke,et al.  SRILM - an extensible language modeling toolkit , 2002, INTERSPEECH.

[20]  Fabrizio Sebastiani,et al.  Machine learning in automated text categorization , 2001, CSUR.

[21]  Marco Saerens,et al.  Adjusting the Outputs of a Classifier to New a Priori Probabilities: A Simple Procedure , 2002, Neural Computation.

[22]  Shlomo Argamon,et al.  Automatically Categorizing Written Texts by Author Gender , 2002, Lit. Linguistic Comput..

[23]  Geoffrey Leech,et al.  Grammatical word class variation within the British National Corpus sampler , 2002 .

[24]  James J. Lindsay,et al.  Cues to deception. , 2003, Psychological bulletin.

[25]  J. Pennebaker,et al.  Lying Words: Predicting Deception from Linguistic Styles , 2003, Personality & social psychology bulletin.

[26]  Dale Schuurmans,et al.  Language independent authorship attribution using character level language models , 2003, Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - EACL '03.

[27]  Dan Klein,et al.  Accurate Unlexicalized Parsing , 2003, ACL.

[28]  J. Pennebaker,et al.  Psychological aspects of natural language use: our words, our selves. , 2003, Annual review of psychology.

[29]  Dale Schuurmans,et al.  Combining Naive Bayes and n-Gram Language Models for Text Classification , 2003, ECIR.

[30]  Jeffrey T. Hancock,et al.  Deception and design: the impact of communication technology on lying behavior , 2004, CHI.

[31]  Dwayne D. Gremler,et al.  Electronic word-of-mouth via consumer-opinion platforms: What motivates consumers to articulate themselves on the Internet? , 2004 .

[32]  Ryan M. Rifkin,et al.  In Defense of One-Vs-All Classification , 2004, J. Mach. Learn. Res..

[33]  Hector Garcia-Molina,et al.  Combating Web Spam with TrustRank , 2004, VLDB.

[34]  Jay F. Nunamaker,et al.  A Comparison of Classification Methods for Predicting Deception in Computer-Mediated Communication , 2004, J. Manag. Inf. Syst..

[35]  R. Rigby,et al.  Generalized additive models for location, scale and shape , 2005 .

[36]  Moshe Koppel,et al.  Determining an author's native language by mining a text for errors , 2005, KDD '05.

[37]  Gilad Mishne,et al.  Blocking Blog Spam with Language Model Disagreement , 2005, AIRWeb.

[38]  Richard Simon,et al.  Bias in error estimation when using cross-validation for model selection , 2006, BMC Bioinformatics.

[39]  Rich Caruana,et al.  Predicting good probabilities with supervised learning , 2005, ICML.

[40]  Marcia K. Johnson,et al.  Reality Monitoring , 2005 .

[41]  Soo-Min Kim,et al.  Automatically Assessing Review Helpfulness , 2006, EMNLP.

[42]  B. DePaulo,et al.  Accuracy of Deception Judgments , 2006, Personality and social psychology review : an official journal of the Society for Personality and Social Psychology, Inc.

[43]  Yee Whye Teh,et al.  A Hierarchical Bayesian Language Model Based On Pitman-Yor Processes , 2006, ACL.

[44]  Luca Becchetti,et al.  A reference collection for web spam , 2006, SIGF.

[45]  Marc Najork,et al.  Detecting spam web pages through content analysis , 2006, WWW '06.

[46]  Jeffrey T. Hancock,et al.  On Lying and Being Lied To: A Linguistic Analysis of Deception in Computer-Mediated Communication , 2007 .

[47]  A. Vrij,et al.  Cues to Deception and Ability to Detect Lies as a Function of Police Interview Styles , 2007, Law and human behavior.

[48]  Cindy K. Chung,et al.  The development and psychometric properties of LIWC2007 , 2007 .

[49]  Max Mühlhäuser,et al.  Automatically Assessing the Post Quality in Online Discussions on Software , 2007, ACL.

[50]  Marilyn A. Walker,et al.  Using Linguistic Cues for the Automatic Recognition of Personality in Conversation and Text , 2007, J. Artif. Intell. Res..

[51]  Lillian Lee,et al.  Opinion Mining and Sentiment Analysis , 2008, Found. Trends Inf. Retr..

[52]  Douglas Aberdeen,et al.  The War Against Spam: A report from the front line , 2007 .

[53]  A. Vrij Detecting Lies and Deceit: Pitfalls and Opportunities , 2008 .

[54]  Bing Liu,et al.  Opinion spam and analysis , 2008, WSDM '08.

[55]  Ling Liu,et al.  Do online reviews affect product sales? The role of reviewer characteristics and temporal effects , 2008, Inf. Technol. Manag..

[56]  M. de Rijke,et al.  Credibility Improves Topical Blog Post Retrieval , 2008, ACL.

[57]  Dongsong Zhang,et al.  A Statistical Language Modeling Approach to Online Deception Detection , 2008, IEEE Transactions on Knowledge and Data Engineering.

[58]  Barry Smyth,et al.  Learning to recommend helpful hotel reviews , 2009, RecSys '09.

[59]  Carlo Strapparava,et al.  The Lie Detector: Explorations in the Automatic Recognition of Deceptive Language , 2009, ACL.

[60]  Kyung Hyan Yoo,et al.  Comparison of Deceptive and Truthful Travel Reviews , 2009, ENTER.

[61]  Jon M. Kleinberg,et al.  How Opinions are Received by Online Communities: A Case Study on Amazon.com Helpfulness Votes , 2009, WWW '09.

[62]  A. Vrij,et al.  Outsmarting the Liars: The Benefit of Asking Unanticipated Questions , 2009, Law and human behavior.

[63]  Filippo Menczer,et al.  Modeling Statistical Properties of Written Text , 2009, PloS one.

[64]  Cheol Park,et al.  Information direction, website reputation and eWOM effect: A moderating role of product type , 2009 .

[65]  Erik Qualman Socialnomics: How Social Media Transforms the Way We Live and Do Business , 2009 .

[66]  Mira Lee,et al.  Effects of Valence and Extremity of eWOM on Attitude toward the Brand and Website , 2009 .

[67]  Eric Gilbert,et al.  Widespread Worry and the Stock Market , 2010, ICWSM.

[68]  Jeremy P. Birnholtz,et al.  "on my way": deceptive texting and interpersonal awareness narratives , 2010, CSCW '10.

[69]  J. Pete Blair,et al.  (In)accuracy at Detecting True and False Confessions and Denials: An Initial Test of a Projected Motive Model of Veracity Judgments , 2010 .

[70]  Barbara Poblete,et al.  Twitter under crisis: can we trust what we RT? , 2010, SOMA '10.

[71]  Eni Mustafaraj,et al.  From Obscurity to Prominence in Minutes: Political Speech and Real-Time Search , 2010 .

[72]  Adriana Kovashka,et al.  Authorship Attribution Using Probabilistic Context-Free Grammars , 2010, ACL.

[73]  Bill Tomlinson,et al.  Who are the crowdworkers?: shifting demographics in mechanical turk , 2010, CHI Extended Abstracts.

[74]  Andrew Olney,et al.  An Exploration of Off Topic Conversation , 2010, NAACL.

[75]  Derek Greene,et al.  Merging multiple criteria to identify suspicious reviews , 2010, RecSys '10.

[76]  Isabell M. Welpe,et al.  Predicting Elections with Twitter: What 140 Characters Reveal about Political Sentiment , 2010, ICWSM.

[77]  Brendan T. O'Connor,et al.  From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series , 2010, ICWSM.

[78]  Rada Mihalcea,et al.  Amazon Mechanical Turk for Subjectivity Word Sense Disambiguation , 2010, Mturk@HLT-NAACL.

[79]  Brendan T. O'Connor,et al.  A Latent Variable Model for Geographic Lexical Variation , 2010, EMNLP.

[80]  T. Levine,et al.  Content in Context Improves Deception Detection Accuracy , 2010 .

[81]  Bernardo A. Huberman,et al.  Predicting the Future with Social Media , 2010, Web Intelligence.

[82]  Derek Greene,et al.  Distortion as a validation criterion in the identification of suspicious reviews , 2010, SOMA '10.

[83]  George Forman,et al.  Apples-to-apples in cross-validation studies: pitfalls in classifier performance measurement , 2010, SKDD.

[84]  David Yarowsky,et al.  Classifying latent user attributes in twitter , 2010, SMUC '10.

[85]  Ee-Peng Lim,et al.  Detecting product review spammers using rating behaviors , 2010, CIKM.

[86]  Arjun Mukherjee,et al.  Improving Gender Classification of Blog Authors , 2010, EMNLP.

[87]  Qiang Yang,et al.  A Survey on Transfer Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.

[88]  Panagiotis G. Ipeirotis Demographics of Mechanical Turk , 2010 .

[89]  Aron Culotta,et al.  Towards detecting influenza epidemics by analyzing Twitter messages , 2010, SOMA '10.

[90]  X. Zhang,et al.  Impact of Online Consumer Reviews on Sales: The Moderating Role of Product and Consumer Characteristics , 2010 .

[91]  Ana-Maria Popescu,et al.  A Machine Learning Approach to Twitter User Classification , 2011, ICWSM.

[92]  Carolyn Penstein Rosé,et al.  Author Age Prediction from Text using Linear Regression , 2011, LaTeCH@ACL.

[93]  Eric P. Xing,et al.  Sparse Additive Generative Models of Text , 2011, ICML.

[94]  Claire Cardie,et al.  Multi-aspect Sentiment Analysis with Topic Models , 2011, 2011 IEEE 11th International Conference on Data Mining Workshops.

[95]  Barbara Poblete,et al.  Information credibility on twitter , 2011, WWW.

[96]  Yejin Choi,et al.  Domain Independent Authorship Attribution without Domain Adaptation , 2011, RANLP.

[97]  Yejin Choi,et al.  Gender Attribution: Tracing Stylometric Evidence Beyond Topic and Genre , 2011, CoNLL.

[98]  Jacob Ratkiewicz,et al.  Detecting and Tracking Political Abuse in Social Media , 2011, ICWSM.

[99]  John D. Burger,et al.  Discriminating Gender on Twitter , 2011, EMNLP.

[100]  Jacob Ratkiewicz,et al.  Predicting the Political Alignment of Twitter Users , 2011, 2011 IEEE Third Int'l Conference on Privacy, Security, Risk and Trust and 2011 IEEE Third Int'l Conference on Social Computing.

[101]  Eric P. Xing,et al.  Discovering Sociolinguistic Associations with Structured Sparsity , 2011, ACL.

[102]  Dragomir R. Radev,et al.  Rumor has it: Identifying Misinformation in Microblogs , 2011, EMNLP.

[103]  Claire Cardie,et al.  Finding Deceptive Opinion Spam by Any Stretch of the Imagination , 2011, ACL.

[104]  Yi Yang,et al.  Learning to Identify Review Spam , 2011, IJCAI.

[105]  Johan Bollen,et al.  Twitter mood predicts the stock market , 2010, J. Comput. Sci..

[106]  Mark Dredze,et al.  You Are What You Tweet: Analyzing Twitter for Public Health , 2011, ICWSM.

[107]  Yejin Choi,et al.  Distributional Footprints of Deceptive Product Reviews , 2012, ICWSM.

[108]  Jeffrey T. Hancock,et al.  What Lies Beneath: The Linguistic Traces of Deception in Online Dating Profiles , 2012 .

[109]  Subhash C. Kak,et al.  A Survey of Prediction Using Social Media , 2012, ArXiv.

[110]  Julia Hirschberg,et al.  Detecting Hate Speech on the World Wide Web , 2012 .

[111]  Beng Soo Ong,et al.  The Perceived Influence of User Reviews in the Hospitality Industry , 2012 .

[112]  Michael L. Anderson,et al.  Learning from the Crowd: Regression Discontinuity Estimates of the Effects of an Online Review Database , 2012 .

[113]  Claire Cardie,et al.  Estimating the prevalence of deception in online review communities , 2012, WWW.

[114]  Alexander J. Smola,et al.  Discovering geographical topics in the twitter stream , 2012, WWW.

[115]  Arjun Mukherjee,et al.  Spotting fake reviewer groups in consumer reviews , 2012, WWW.

[116]  Mung Chiang,et al.  Why watching movie tweets won't tell the whole story? , 2012, WOSN '12.

[117]  Munmun De Choudhury,et al.  Not All Moods Are Created Equal! Exploring Human Emotional States in Social Media , 2012, ICWSM.

[118]  Aristides Gionis,et al.  Correlating financial time series with micro-blogging activity , 2012, WSDM '12.

[119]  Yejin Choi,et al.  Characterizing Stylistic Elements in Syntactic Structure , 2012, EMNLP.

[120]  Dina Mayzlin,et al.  Promotional Reviews: An Empirical Investigation of Online Review Manipulation , 2012 .

[121]  Scott Counts,et al.  Tweeting is believing?: understanding microblog credibility perceptions , 2012, CSCW.

[122]  Daniel Gayo-Avello,et al.  "I Wanted to Predict Elections with Twitter and all I got was this Lousy Paper" - A Balanced Survey on Election Prediction using Twitter Data , 2012, ArXiv.

[123]  Carolyn Penstein Rosé,et al.  Detecting offensive tweets via topical feature discovery over a large scale twitter corpus , 2012, CIKM.

[124]  Yejin Choi,et al.  Syntactic Stylometry for Deception Detection , 2012, ACL.

[125]  Adam J. Berinsky,et al.  Evaluating Online Labor Markets for Experimental Research: Amazon.com's Mechanical Turk , 2012, Political Analysis.

[126]  Claire Cardie,et al.  Negative Deceptive Opinion Spam , 2013, NAACL.

[127]  Matthias Hagen,et al.  Overview of the 1st international competition on plagiarism detection , 2009 .

[128]  J. Burgoon,et al.  Interpersonal Deception Theory , 2015 .