Temporal Document Retrieval Model for Business News Archives

Temporal expressions in business news, such as "last week" or "at the end of this month," carry important information about the time context of a news document and have proved useful for document retrieval. We found that about 10% of these expressions are difficult to project onto the calendar because their bounds are uncertain. This paper introduces a novel approach to representing such expressions. We conduct a user study to measure the degree of uncertainty for selected temporal expressions and propose a method for representing that uncertainty with fuzzy numbers. The classical Vector Space Model is then extended to the Temporal Document Retrieval Model (TDRM), which incorporates the proposed fuzzy representations of temporal expressions.
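To make the idea of a fuzzy temporal expression concrete, the sketch below models an expression such as "last week" as a trapezoidal fuzzy interval over day offsets from the publication date. Everything in it is an illustrative assumption rather than the paper's definition: the trapezoidal shape, the class name `TrapezoidalFuzzyInterval`, the function `match_degree`, and the example bounds are all hypothetical, and the abstract does not say how TDRM combines such a degree with term weights.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrapezoidalFuzzyInterval:
    """Hypothetical fuzzy time interval on a day axis.

    Days are offsets from the document's publication date (0 = that day).
    Membership rises from 0 at `a` to 1 at `b`, stays at 1 until `c`,
    and falls back to 0 at `d`. Requires a <= b <= c <= d.
    """
    a: float
    b: float
    c: float
    d: float

    def __post_init__(self):
        if not (self.a <= self.b <= self.c <= self.d):
            raise ValueError("expected a <= b <= c <= d")

    def membership(self, day: float) -> float:
        """Degree (0..1) to which `day` belongs to the expression."""
        if day < self.a or day > self.d:
            return 0.0
        if self.b <= day <= self.c:
            return 1.0
        if day < self.b:
            # Rising edge: here a <= day < b, so b - a > 0.
            return (day - self.a) / (self.b - self.a)
        # Falling edge: here c < day <= d, so d - c > 0.
        return (self.d - day) / (self.d - self.c)


def match_degree(expr: TrapezoidalFuzzyInterval, query_start: int, query_end: int) -> float:
    """Assumed matching rule: the highest membership reached on any
    whole day in the inclusive query range [query_start, query_end]."""
    return max(expr.membership(day) for day in range(query_start, query_end + 1))


# Made-up bounds for "last week": certainly 3 to 7 days ago,
# possibly as early as 10 days ago or as late as yesterday.
last_week = TrapezoidalFuzzyInterval(a=-10, b=-7, c=-3, d=-1)
print(last_week.membership(-5))        # inside the core of the interval
print(last_week.membership(-8.5))      # on the uncertain rising edge
print(match_degree(last_week, -2, 0))  # query covering the last two days
```

In a real system the four bounds would come from empirical data, such as the user study mentioned above, rather than from hand-picked values.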
