Finding Base Time-Line of a News Article

An event description carries little information unless it is tied to the time at which the event occurred. This matters even more for news articles published on-line, since a news article is essentially a set of text-based descriptions of events. The time-line of the article as a whole, and of each individual event within it, is therefore central to how informative the article is. We introduce a novel approach that finds the actual time-line of a news article, whenever it is available, and tags the article with this temporal information. The first step is to establish a temporal baseline for the entire article. We define the temporal baseline as the date (and possibly time) at which the article was first published, as stated in the article itself. Without a precise and correct temporal baseline, no further temporal processing of individual events is possible. We approach the problem of accurately finding the temporal baseline with a support-vector-based classification method, and find that a proper choice of training parameters for the support-vector classifier yields high accuracy. We describe the data collection, training, and testing phases and report the accuracy of our method on news articles from 26 different Websites. These results indicate that our approach can find the temporal baseline of a news article very accurately.
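
To make the classification framing concrete, the following is a minimal sketch of how date-like strings in an article might be collected as candidates for the temporal baseline before a classifier scores them. It is an illustration under stated assumptions, not the method described above: the date pattern, the function name `candidate_dates`, and the two features (`rel_pos`, `has_cue`) are hypothetical, and the abstract does not say which features or date formats were actually used.

```python
import re
from datetime import datetime

# Hypothetical pattern: only matches dates written like "March 5, 2004".
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")
DATE_RE = re.compile(r"\b(?:" + MONTHS + r") \d{1,2}, \d{4}\b")

# Hypothetical cue words that may precede a publication date.
CUES = ("published", "posted", "updated")


def candidate_dates(text):
    """Return one feature record per date-like string found in text."""
    records = []
    length = max(len(text), 1)
    for match in DATE_RE.finditer(text):
        try:
            parsed = datetime.strptime(match.group(0), "%B %d, %Y").date()
        except ValueError:
            # Skip impossible dates such as "February 31, 2004".
            continue
        # Look at the 30 characters before the match for a cue word.
        context = text[max(0, match.start() - 30):match.start()].lower()
        records.append({
            "date": parsed,
            "rel_pos": match.start() / length,  # 0.0 means start of article
            "has_cue": any(cue in context for cue in CUES),
        })
    return records


article = ("Published: March 5, 2004\n"
           "The council met on February 27, 2004 to discuss the budget.")
for record in candidate_dates(article):
    print(record)
```

Each record could then be turned into a numeric feature vector and labeled as baseline or non-baseline to train a support-vector classifier; that training step is omitted here because the abstract gives no details about it.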
