Query Expansion and Text Mining for ChronoSeeker - Search Engine for Future/Past Events -

This paper introduces ChronoSeeker, a search engine for future and past events that can help users develop long-term strategies for their organizations. To provide on-demand searches, we tackled two technical issues: (1) organizing efficient event searches and (2) filtering out noise from search results. For efficient event searches, our system employs query expansion with typical expressions related to event information, such as year expressions, temporal modifiers, and context terms. To filter out noise, we use a machine-learning technique that classifies candidates as event information or non-event information, using heuristic features and lexical patterns derived from a text-mining approach. Our experiment revealed that filtering achieved an 85% F-measure, and that query expansion could collect dozens more events than searches without expansion.
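The query expansion idea can be illustrated with a minimal sketch. It assumes that an expanded query is simply the user's topic combined with a year expression plus either a temporal modifier or a context term; the function name `expand_query`, the query string layout, and the sample modifiers and terms are hypothetical and are not taken from the paper.

```python
from itertools import product


def expand_query(topic, years, temporal_modifiers, context_terms):
    """Return expanded query strings for one topic.

    Assumption: each expanded query pairs the topic with a year expression
    and either a temporal modifier (as a quoted phrase) or a context term.
    The actual expansion rules used by ChronoSeeker may differ.
    """
    queries = []
    # Temporal modifier + year, quoted so a search engine treats it as a phrase.
    for year, modifier in product(years, temporal_modifiers):
        queries.append(f'{topic} "{modifier} {year}"')
    # Year + context term that tends to co-occur with event descriptions.
    for year, term in product(years, context_terms):
        queries.append(f"{topic} {year} {term}")
    return queries


if __name__ == "__main__":
    # Hypothetical inputs chosen only for illustration.
    for query in expand_query("electric cars", [2030], ["by", "in"], ["will"]):
        print(query)
```

Each expanded query would be sent to the underlying search engine, and the pooled results would then go through the noise filter described above.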
