Extraction of Salient Dates for Building Thematic Timelines

We present an approach for detecting salient (important) dates in texts in order to automatically build event timelines from a search query, such as the name of an event or a person. The work was carried out on a corpus of English newswire texts provided by Agence France Presse (AFP). To extract the salient dates that warrant inclusion in an event timeline, we first recognize and normalize temporal expressions in the texts, then use a machine-learning approach to select the dates that are salient for a particular topic. For the time being, we focus only on extracting the dates, not the events to which they relate.
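The two-stage pipeline described above can be illustrated with a minimal sketch. It is not the system presented here: the regular expressions cover only a few English date forms, and a plain frequency count over query-matching documents stands in for the machine-learning step, as an assumption made for illustration. The names `normalize_dates` and `rank_dates` are hypothetical.

```python
import re
from collections import Counter
from datetime import date, timedelta

MONTH_NAMES = ["january", "february", "march", "april", "may", "june", "july",
               "august", "september", "october", "november", "december"]
MONTHS = {name: number for number, name in enumerate(MONTH_NAMES, start=1)}

# Absolute expressions such as "March 11, 2011".
ABSOLUTE_RE = re.compile(
    r"\b(" + "|".join(MONTH_NAMES) + r")\s+(\d{1,2}),\s+(\d{4})\b",
    re.IGNORECASE,
)
# Relative expressions, resolved against the document creation date.
RELATIVE_OFFSETS = {"yesterday": -1, "today": 0, "tomorrow": 1}
RELATIVE_RE = re.compile(r"\b(yesterday|today|tomorrow)\b", re.IGNORECASE)


def normalize_dates(text, creation_date):
    """Recognize a few temporal expressions and normalize them to ISO dates."""
    found = []
    for month, day, year in ABSOLUTE_RE.findall(text):
        try:
            found.append(date(int(year), MONTHS[month.lower()], int(day)).isoformat())
        except ValueError:
            continue  # skip impossible dates such as "February 30, 2011"
    for word in RELATIVE_RE.findall(text):
        offset = timedelta(days=RELATIVE_OFFSETS[word.lower()])
        found.append((creation_date + offset).isoformat())
    return found


def rank_dates(documents, query):
    """Rank normalized dates by how often they occur in documents matching the query.

    `documents` is a list of (creation_date, text) pairs. Frequency is only a
    stand-in (assumption) for a learned salience classifier.
    """
    counts = Counter()
    for creation_date, text in documents:
        if query.lower() in text.lower():
            counts.update(normalize_dates(text, creation_date))
    return counts.most_common()
```

Calling `rank_dates` with a list of `(creation_date, text)` pairs and a query string returns the normalized dates sorted by how many times they are mentioned in the matching documents. A learned model would replace the counter with features computed for each candidate date.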
