Detecting Fraudulent Behavior on Crowdfunding Platforms: The Role of Linguistic and Content-Based Cues in Static and Dynamic Contexts

Abstract: Crowdfunding platforms let founders collect funding to realize their projects. With the advent of these platforms, the risk of fraud has risen: fraudulent founders provide inaccurate information or feign interest in a project. In this study, we propose deception detection support mechanisms to address this novel type of Internet fraud. We analyze a sample of fraudulent and nonfraudulent projects published on a leading crowdfunding platform. We examine whether analyzing dynamic communication during the funding period helps identify fraudulent behavior, beyond analyzing only the static information related to the project. We also investigate whether content-based cues and linguistic cues are valuable for fraud detection. The selection of cues and the subsequent feature engineering are based on theories from communication, psychology, and computational linguistics. Our results should be helpful to the stakeholders of crowdfunding platforms and to researchers of fraud detection.
