Socio-Political Event Extraction Using a Rule-Based Approach

News reports are currently among the most studied data sources in information extraction. Event descriptions from these sources can be controversial or complementary, and they reflect relationships between the participating entities. The aim of the present work is to test a group of predefined patterns and rules for obtaining automatically filled scenario templates for socio-political events, taking protests as a case study, and to apply clustering algorithms. At this stage, the information is extracted from Russian news titles. The results of the pattern quality assessment and of the clustering are presented.
