Hoaxes are a current problem that troubles the public and causes unrest in many fields, from politics, culture, and security and order to economics. The problem is closely tied to the rapid growth of social media use: thousands of pieces of information, not all of them valid, spread on social media every day, so users are likely to be exposed to hoaxes. The hoax detection system in this study was designed with an unsupervised learning approach, so it requires no training data. The system uses the TextRank algorithm for keyword extraction and cosine similarity to measure document similarity. The extracted keywords are submitted to a search engine to retrieve content related to the user's input, and the similarity between the input and each retrieved item is then calculated. If the related content tends to come from trusted media, the input is potentially factual; if it tends to come from unreliable media, the input is potentially a hoax. The system was evaluated with a confusion matrix on 20 news items, 10 labeled as true and 10 as false. It classified 13 items as false and 7 as true, and 15 of the 20 classifications matched the original labels, giving an accuracy of 75%.
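The similarity and decision steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: keyword extraction and the search step are omitted, the term-frequency vectors, the `classify` rule (summing similarity by source type), and all function and parameter names are assumptions made for illustration, and only the final accuracy figures (15 of 20) come from the abstract.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the term-frequency vectors of two texts."""
    freq_a, freq_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(freq_a[t] * freq_b[t] for t in freq_a.keys() & freq_b.keys())
    norm_a = math.sqrt(sum(v * v for v in freq_a.values()))
    norm_b = math.sqrt(sum(v * v for v in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


def classify(query, results, trusted_domains):
    """Hypothetical decision rule (an assumption, not taken from the paper).

    `results` is a list of (domain, text) pairs such as a search engine
    might return. Similarity to the query is summed separately for trusted
    and other sources; the larger total decides the label, and a tie is
    labeled "hoax".
    """
    trusted = sum(cosine_similarity(query, text)
                  for domain, text in results if domain in trusted_domains)
    other = sum(cosine_similarity(query, text)
                for domain, text in results if domain not in trusted_domains)
    return "fact" if trusted > other else "hoax"


def accuracy(matching, total):
    """Share of classifications that match the original labels."""
    return matching / total


# Figures reported in the abstract: 15 of 20 classifications matched.
reported_accuracy = accuracy(15, 20)
```

With the reported figures, `accuracy(15, 20)` evaluates to 0.75, which agrees with the 75% stated in the abstract.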