A Rule-Based System for Monitoring of Microblogging Disease Reports

Real-time microblogging messages are a promising data source for early warning systems that track outbreaks of epidemic diseases. Systems that monitor microblogs might detect disease outbreaks in communities faster than traditional public health services. Building such a system requires a message classification approach that can distinguish messages concerning diseases from unrelated ones. Existing machine learning classification approaches have difficulties with this task because little historical data is available to learn from and because the messages are short. In this paper, we demonstrate our rule-based approach to the classification of disease reports. Our system is built on the extraction of disease-related named entities. It identifies the types of the recognized named entities using existing knowledge bases, and these types help it decide whether a message is a disease report. We combine this approach with further text processing techniques such as term frequency calculation. Our experimental results show that the presented approach can classify disease report messages with acceptable precision and recall.
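The idea of classifying a message from the types of its recognized entities, backed by a term frequency check, can be illustrated with a minimal sketch. This is not the authors' rule set: the dictionary `TOY_KNOWLEDGE_BASE` is a hypothetical stand-in for a real knowledge base lookup, and the function names, the two rules, and the `min_symptom_terms` threshold are assumptions made only for illustration.

```python
import re
from collections import Counter

# Hypothetical stand-in for a knowledge base: surface form -> entity type.
# A real system would query an external knowledge base instead.
TOY_KNOWLEDGE_BASE = {
    "flu": "Disease",
    "influenza": "Disease",
    "measles": "Disease",
    "fever": "Symptom",
    "cough": "Symptom",
}


def tokenize(message):
    """Lowercase the message and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", message.lower())


def typed_entities(tokens):
    """Return (token, type) pairs for tokens found in the toy knowledge base."""
    return [(t, TOY_KNOWLEDGE_BASE[t]) for t in tokens if t in TOY_KNOWLEDGE_BASE]


def is_disease_report(message, min_symptom_terms=2):
    """Apply two illustrative rules (assumed, not taken from the paper)."""
    entities = typed_entities(tokenize(message))
    type_counts = Counter(entity_type for _, entity_type in entities)
    # Rule 1: any entity typed as a disease marks the message as a report.
    if type_counts["Disease"] >= 1:
        return True
    # Rule 2: otherwise require enough symptom terms (term frequency check).
    return type_counts["Symptom"] >= min_symptom_terms


if __name__ == "__main__":
    for text in [
        "Half my office is out with the flu this week",
        "Fever and a bad cough since Monday",
        "Great concert last night, total fever pitch",
    ]:
        print(is_disease_report(text), "-", text)
```

A single symptom word on its own is deliberately not enough in this sketch, since short messages often use such words figuratively; a realistic system would need entity disambiguation rather than exact token matching.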
