Applying semantic knowledge to the automatic processing of temporal expressions and events in natural language

This paper addresses the automatic recognition and classification of temporal expressions and events in human language. Efficacy in these tasks is crucial if the broader task of temporal information processing is to be performed successfully. We analyze whether applying semantic knowledge to these tasks improves the performance of current approaches. To this end, we present and evaluate a data-driven approach as part of a system, TIPSem. Our approach uses lexical semantics and semantic roles as additional information to extend classical approaches, which are principally based on morphosyntax. The results obtained for English show that semantic knowledge aids temporal expression and event recognition, achieving error reductions of 59% and 21%, respectively, while its contribution to classification is limited. From the analysis of the results, it may be concluded that applying semantic knowledge leads to more general models and aids the recognition of temporal entities that are ambiguous at shallower levels of language analysis. We also found that lexical semantics and semantic roles have complementary advantages, and that it is useful to combine them. Finally, we carried out the same analysis for Spanish, and the results show comparable advantages. This supports the hypothesis that applying the proposed semantic knowledge may be useful for different languages.
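
The error-reduction figures above can be read with a small worked sketch. It assumes that "error" means one minus an evaluation score such as F1, and that the reduction is measured relative to the baseline error; the abstract does not state the exact definition, and the function name and the scores below are hypothetical, chosen only for illustration.

```python
def relative_error_reduction(baseline_score: float, improved_score: float) -> float:
    """Return the fraction of the baseline error removed by the improved system.

    Assumption: error = 1 - score, with scores in [0, 1] (e.g. F1).
    """
    baseline_error = 1.0 - baseline_score
    improved_error = 1.0 - improved_score
    if baseline_error <= 0.0:
        raise ValueError("baseline has no error left to reduce")
    return (baseline_error - improved_error) / baseline_error


# Hypothetical scores, not taken from the paper: a baseline at 0.80 and a
# semantically enriched model at 0.90 halve the remaining error.
reduction = relative_error_reduction(0.80, 0.90)
print(f"{reduction:.0%}")
```

Under this reading, a large relative error reduction can correspond to a modest absolute gain when the baseline score is already high.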
