Study of hoax news detection using a naïve Bayes classifier in the Indonesian language

The internet is now a well-known source of information in many forms, including online news articles, and most people search for news online. Articles spread across websites may be either authentic or fake; fake news articles are commonly called hoax news. Hoax news can leave readers feeling burdened or provoked, or can even cause them losses. This research proposes an automatic hoax news detector for news articles in the Indonesian language. It uses its own dataset of 250 pages of hoax and valid news articles. Three reviewers classified the articles manually, and the final tag for each article was obtained by a vote among the three reviewers. Training and testing sets were drawn at random three times and classified using the php-ml component library. The highest average results were obtained with a 70% training set and a 30% testing set: accuracy of 78.6%, hoax precision of 67.1%, valid precision of 91.6%, hoax recall of 89.4%, and valid recall of 71.4%. The dataset is openly available so that future research can replicate the experiment, compare results, and use it for baseline testing.
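
As an illustration of how the reviewer voting and the per-class figures reported above could be computed, the following is a minimal Python sketch. It is not the study's code (the study used the php-ml library); the function names, the toy votes, and the toy predictions are all hypothetical and serve only to show the arithmetic behind majority-vote tagging, accuracy, precision, and recall.

```python
from collections import Counter

def majority_label(votes):
    """Return the label chosen by most reviewers.

    With three reviewers and two labels ("hoax", "valid") a tie cannot occur.
    """
    return Counter(votes).most_common(1)[0][0]

def accuracy(y_true, y_pred):
    """Fraction of articles whose predicted label matches the voted label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def precision_recall(y_true, y_pred, label):
    """Precision and recall for one class (e.g. "hoax" or "valid")."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical toy data: three reviewer votes per article (not from the dataset).
votes = [
    ("hoax", "hoax", "valid"),
    ("valid", "valid", "valid"),
    ("hoax", "valid", "valid"),
    ("hoax", "hoax", "hoax"),
]
gold = [majority_label(v) for v in votes]

# Hypothetical classifier output for the same four articles.
predicted = ["hoax", "hoax", "valid", "hoax"]

acc = accuracy(gold, predicted)
hoax_precision, hoax_recall = precision_recall(gold, predicted, "hoax")
valid_precision, valid_recall = precision_recall(gold, predicted, "valid")
```

In this toy case one valid article is misclassified as hoax, which lowers hoax precision and valid recall while leaving hoax recall and valid precision at 1.0. The reported results show the same qualitative pattern: hoax recall and valid precision are higher than hoax precision and valid recall.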