Fake News Detection on Fake.Br Using Hierarchical Attention Networks

Automatic fake news detection is a challenging problem in natural language processing, and contributions in this field can have considerable social impact. This article examines the Hierarchical Attention Network (HAN) as a method for automatic fake news detection. We evaluate the proposed models on the Brazilian Portuguese fake news parallel corpus Fake.Br, using both its original full texts and its truncated version. We run the HAN with word embedding sizes ranging from 100 to 600, both keeping and removing stop words. The method achieves 97% accuracy on full texts using GloVe word embeddings of size 600. On truncated texts, however, it performs similarly (90% accuracy) to the baseline established by the simple machine learning methods in the work that originally presented Fake.Br (89% accuracy). Overall, keeping or removing stop words and varying the word embedding size have only a negligible effect on the results.
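
To make the hierarchical idea concrete, the following is a minimal sketch of the two-level attention pooling that gives the HAN its name: word vectors are pooled into sentence vectors, and sentence vectors are pooled into a document vector. It is not the implementation evaluated in this article. It assumes a simplified dot-product scoring against a context vector and omits the recurrent encoders and the learned projection of the full architecture; the function names (`attention_pool`, `encode_document`) and the toy vectors are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, context):
    """Pool `vectors` into one vector using attention weights.

    Simplifying assumption: the score of each vector is its dot product
    with `context`. A full HAN would first pass each vector through a
    recurrent encoder and a learned tanh projection.
    """
    scores = [sum(v_i * c_i for v_i, c_i in zip(v, context)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


def encode_document(document, word_context, sentence_context):
    """Encode a document given as a list of sentences of word vectors.

    Level 1: attention over the words of each sentence.
    Level 2: attention over the resulting sentence vectors.
    """
    sentence_vectors = [
        attention_pool(sentence, word_context)[0] for sentence in document
    ]
    return attention_pool(sentence_vectors, sentence_context)


# Toy document: two sentences with 2-dimensional word vectors (made-up values).
toy_document = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 1.0]],
]
doc_vector, sentence_weights = encode_document(
    toy_document, word_context=[1.0, 0.0], sentence_context=[0.0, 1.0]
)
```

In a classifier, `doc_vector` would feed a final layer that predicts whether the text is fake or true, and `sentence_weights` indicates how much each sentence contributed to that document representation.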