Fake News Detection Using One-Class Classification

Fake news has attracted the attention of the general public because of the influence it can exert on important societal activities, such as elections. Efforts have been made to detect it, but they usually rely on human fact-checking, which can be costly and time-consuming. Computational approaches have typically relied on supervised learning, in which a model is trained on both fake and true news samples. Such an approach allows a large amount of news to be classified in a short time, but it demands datasets labelled with positive and negative instances. Our work proposes to detect fake news by training a model with only fake samples in the training dataset, through One-Class Classification (OCC). We compare a novel algorithm, called DCDistanceOCC, with others published in the literature and obtain similar or even better results. The case study is the Brazilian political scenario on Twitter and WhatsApp, starting with the 2018 general elections. These two platforms were fertile ground for the proliferation of fake news. We also evaluated the models on another dataset available in the literature. To the best of our knowledge, this is the first paper to identify fake news using an OCC approach and also the first to provide Portuguese-language WhatsApp and Twitter datasets containing fake news.
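
To make the one-class setting concrete, the sketch below trains a classifier on fake samples only and then flags new texts as in-class (fake-like) or outliers. It is not DCDistanceOCC: it is a generic centroid-distance one-class classifier, used here only as a stand-in. The class name `CentroidOCC`, the bag-of-words features, the `quantile` threshold rule, and the toy texts are all assumptions made for illustration.

```python
import math
from collections import Counter


def vectorize(text):
    """Lower-cased bag-of-words term counts (a deliberately minimal featurisation)."""
    return dict(Counter(text.lower().split()))


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


class CentroidOCC:
    """Hypothetical one-class classifier: fit on one class, threshold on distance."""

    def __init__(self, quantile=0.9):
        # Fraction of training distances that should fall inside the boundary.
        self.quantile = quantile

    def fit(self, texts):
        vectors = [vectorize(t) for t in texts]
        # The centroid is the mean term-count vector of the training class.
        centroid = Counter()
        for vec in vectors:
            for word, value in vec.items():
                centroid[word] += value / len(vectors)
        self.centroid_ = dict(centroid)
        # The decision threshold is an empirical quantile of training distances.
        dists = sorted(cosine_distance(v, self.centroid_) for v in vectors)
        idx = min(len(dists) - 1, int(self.quantile * len(dists)))
        self.threshold_ = dists[idx]
        return self

    def predict(self, texts):
        # +1: close enough to the training class; -1: outlier.
        return [
            1 if cosine_distance(vectorize(t), self.centroid_) <= self.threshold_ else -1
            for t in texts
        ]


# Toy, made-up training texts standing in for fake news samples only.
fake_train = [
    "secret plan to cancel the election revealed",
    "candidate secret plan revealed by anonymous source",
    "election will be cancelled says anonymous source",
    "shocking secret about the candidate revealed",
]

model = CentroidOCC().fit(fake_train)
print(model.predict(["anonymous source reveals secret election plan",
                     "weather forecast sunny tomorrow"]))
```

No true-news sample is needed at training time; the decision boundary is derived entirely from how far the fake samples lie from their own centroid. A real system would replace the word counts with stronger text features and the centroid rule with one of the OCC algorithms compared in the paper.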
