Metadata plays a crucial role in making documents easier to organise and archive. News metadata includes the DateLine, ByLine, HeadLine and many other fields. We found that the HeadLine is useful for inferring the theme of a news article. For financial news articles in particular, the HeadLine is especially helpful for locating sentences that explain major events such as significant changes in stock prices. In this paper we explore a support vector based learning approach to automatically extract the HeadLine metadata. We find that the classification accuracy for HeadLines improves if DateLines are identified first. We then use the extracted HeadLines to drive keyword pattern matching that finds the sentences responsible for the story theme. Using this theme and a simple language model, it is possible to locate explanatory sentences for a significant price change.
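The keyword-matching step can be illustrated with a minimal sketch. It is not the method used in the paper, whose patterns are not specified here; it assumes simple token overlap between the headline and each sentence. The stopword list, the function names `keywords` and `theme_sentences`, and the `min_overlap` threshold are all hypothetical choices made for illustration.

```python
import re

# Hypothetical stopword list, for illustration only (not from the paper).
STOPWORDS = {"a", "an", "the", "of", "in", "on", "to", "for",
             "and", "as", "after", "by", "at", "its"}


def keywords(text):
    """Return the set of lowercased alphanumeric tokens that are not stopwords."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower())
            if t not in STOPWORDS}


def theme_sentences(headline, sentences, min_overlap=2):
    """Return (overlap, sentence) pairs for sentences that share at least
    min_overlap keywords with the headline, highest overlap first.

    Assumption: plain keyword overlap stands in for the pattern matching
    mentioned in the abstract.
    """
    head = keywords(headline)
    scored = [(len(head & keywords(s)), s) for s in sentences]
    hits = [(n, s) for n, s in scored if n >= min_overlap]
    # sorted() is stable, so ties keep their original document order.
    return sorted(hits, key=lambda pair: pair[0], reverse=True)
```

Given a headline and the sentences of an article, `theme_sentences(headline, sentences)` returns the candidate theme sentences. A fuller pipeline would likely add stemming so that, for example, "fall" and "fell" or "share" and "shares" can match.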