Suicide-Related Text Classification With Prism Algorithm

Social media platforms continuously generate raw but valuable user data. This data becomes more valuable when it is mined with approaches such as machine learning. It can also potentially help save lives, especially those of vulnerable social media users, as several studies have shown a correlation between social media and suicide. In this study, we aim to contribute to research on suicide-related communication on social media. We measured the performance of five machine learning algorithms (Prism, Decision Tree, Naïve Bayes, Random Forest and Support Vector Machine) in classifying suicide-related text from Twitter. The Prism algorithm outperformed the other algorithms, with an F-measure of 0.84 for the target classes (Suicide and Flippant). To the best of our knowledge, this is the highest performance achieved so far in classifying suicide-related text from social media.
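For readers unfamiliar with Prism, the following is a minimal sketch of the separate-and-conquer rule induction it is generally described as performing: for each class, greedily add the attribute-value test with the highest precision for that class until the rule is pure, then remove the covered instances and repeat. It is not the study's actual pipeline. The function names, the two categorical features (`first_person`, `humor_marker`), the `Other` class and the five toy rows are all hypothetical and chosen only for illustration; only the `Suicide` and `Flippant` class names come from the abstract.

```python
def matches(row, conditions):
    """True if the row satisfies every attribute == value test in the rule."""
    return all(row[attr] == value for attr, value in conditions.items())


def induce_prism_rules(rows, target):
    """Induce (conditions, class) rules from categorical rows, Prism-style."""
    attributes = [a for a in rows[0] if a != target]
    rules = []
    for cls in sorted({r[target] for r in rows}):
        remaining = list(rows)  # each class starts from the full training set
        while any(r[target] == cls for r in remaining):
            covered, conditions = remaining, {}
            # Specialise the rule until it is pure or no attributes are left.
            while (any(r[target] != cls for r in covered)
                   and len(conditions) < len(attributes)):
                best = None
                for attr in attributes:
                    if attr in conditions:
                        continue
                    for value in sorted({r[attr] for r in covered}):
                        subset = [r for r in covered if r[attr] == value]
                        hits = sum(r[target] == cls for r in subset)
                        # Rank by precision for cls, then by coverage.
                        score = (hits / len(subset), hits)
                        if best is None or score > best[0]:
                            best = (score, attr, value, subset)
                _, attr, value, covered = best
                conditions[attr] = value
            rules.append((conditions, cls))
            # Separate: drop everything the new rule covers, then continue.
            remaining = [r for r in remaining if not matches(r, conditions)]
    return rules


def classify(rules, instance):
    """Return the class of the first rule that fires, or None if none does."""
    for conditions, cls in rules:
        if matches(instance, conditions):
            return cls
    return None


# Hypothetical toy data: illustrative only, not taken from the study.
rows = [
    {"first_person": "yes", "humor_marker": "no", "label": "Suicide"},
    {"first_person": "yes", "humor_marker": "no", "label": "Suicide"},
    {"first_person": "yes", "humor_marker": "yes", "label": "Flippant"},
    {"first_person": "no", "humor_marker": "yes", "label": "Flippant"},
    {"first_person": "no", "humor_marker": "no", "label": "Other"},
]
rules = induce_prism_rules(rows, "label")
for conditions, cls in rules:
    print(conditions, "->", cls)
```

Because each rule is a plain conjunction of attribute-value tests, the induced model can be read and audited directly, which is one commonly cited reason to consider rule learners for sensitive classification tasks.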
