A Multi-Aspect Classification Ensemble Approach for Profiling Fake News Spreaders on Twitter

In this work, we attempt to distinguish authors of fake news from authors of real news as part of the Profiling Fake News Spreaders on Twitter task at PAN. We propose a set of eight language features to represent tweets. These representations are then fed to an ensemble classification model that identifies fake news spreaders on Twitter. The approach is limited to English.
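To make the multi-aspect ensemble idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes three hypothetical feature views (the abstract does not list the eight features), a toy nearest-centroid classifier per view, and a simple majority vote across views; the example tweets and labels are invented for illustration only.

```python
from collections import Counter

# Hypothetical feature views: each maps one author's tweets to a feature vector.
# These are illustrative stand-ins, not the eight features proposed in the paper.
def avg_words_per_tweet(tweets):
    return [sum(len(t.split()) for t in tweets) / len(tweets)]

def exclamations_per_tweet(tweets):
    return [sum(t.count("!") for t in tweets) / len(tweets)]

def uppercase_ratio(tweets):
    letters = [c for t in tweets for c in t if c.isalpha()]
    return [sum(c.isupper() for c in letters) / max(len(letters), 1)]

class NearestCentroid:
    """Toy base classifier: predicts the label whose mean vector is closest."""
    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, lab in zip(X, y) if lab == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_one(self, x):
        def sq_dist(label):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[label]))
        return min(self.centroids, key=sq_dist)

class MultiAspectEnsemble:
    """Trains one classifier per feature view and combines them by majority vote."""
    def __init__(self, views):
        self.views = views      # dict: view name -> feature extractor
        self.models = {}

    def fit(self, authors, labels):
        for name, extract in self.views.items():
            X = [extract(tweets) for tweets in authors]
            self.models[name] = NearestCentroid().fit(X, labels)
        return self

    def predict(self, tweets):
        votes = Counter(
            self.models[name].predict_one(extract(tweets))
            for name, extract in self.views.items()
        )
        return votes.most_common(1)[0][0]

# Invented toy data: 1 = fake news spreader, 0 = not a spreader.
authors = [
    ["SHOCKING!!! You won't believe this!", "They are HIDING the truth!!!"],
    ["BREAKING!!! Share before deleted!", "WAKE UP people!!!"],
    ["The council approved the budget on Monday.", "Rain is expected later this week."],
    ["New study published in a peer reviewed journal today.",
     "The museum reopens next month after renovation."],
]
labels = [1, 1, 0, 0]

ensemble = MultiAspectEnsemble({
    "length": avg_words_per_tweet,
    "exclamations": exclamations_per_tweet,
    "uppercase": uppercase_ratio,
}).fit(authors, labels)
```

Using an odd number of views avoids tied votes in the binary case. A weighted or probability-averaging combination would be an equally plausible choice; the abstract does not say which combination rule the ensemble uses.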
