Event extraction approach for Web 2.0

Event extraction is a central task in information extraction, and its importance keeps growing with the explosion of textual data available on the Web, the emergence of Web 2.0, and the move towards the Semantic Web. We therefore propose a generic approach for extracting events from text and analyzing them. The approach comprises an event extraction algorithm with polynomial complexity O(n^5) and a new similarity measure between events, which we use to group similar events. We also present a semantic map of events, and we validate the first component of the approach through the development of the “EventEC” system.
