Information Retrieval with Time Series Query

We study a novel information retrieval problem in which the query is a time series covering a given time period, and the task is to retrieve documents from a text collection of the same period that contain topics correlated with the query time series. This problem arises in many text mining applications where text data must be analyzed to discover potentially causal topics. To solve it, we propose and study multiple retrieval algorithms that share the general idea of ranking text documents by how well their terms correlate with the query time series. Experimental results show that the proposed retrieval algorithms can effectively help users find documents relevant to time series queries, which in turn can help users analyze the variation patterns of the time series.
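
The general idea of ranking documents by how well their terms correlate with the query time series can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not specify the correlation measure or the scoring function, so the use of Pearson correlation, the averaging of term correlations per document, and all names (`pearson`, `term_series`, `rank_documents`, the toy `docs` and `query`) are assumptions made for illustration only.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def term_series(docs, n_periods):
    """Count how often each term occurs in each time period.

    `docs` is a list of (period_index, tokens) pairs (a hypothetical input format).
    """
    series = defaultdict(lambda: [0] * n_periods)
    for period, tokens in docs:
        for tok in tokens:
            series[tok][period] += 1
    return series


def rank_documents(docs, query):
    """Rank documents by the mean correlation of their terms with the query series.

    Assumed scoring rule: each term gets the Pearson correlation between its
    per-period frequency and the query; a document's score is the average over
    its tokens. Returns (doc_index, score) pairs, best first.
    """
    series = term_series(docs, len(query))
    corr = {term: pearson(counts, query) for term, counts in series.items()}
    scored = []
    for idx, (_, tokens) in enumerate(docs):
        score = sum(corr[t] for t in tokens) / len(tokens) if tokens else 0.0
        scored.append((idx, score))
    # Sort by descending score, breaking ties by document index.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


# Toy data: a query series that rises over four periods, and five short documents.
query = [1.0, 2.0, 3.0, 4.0]
docs = [
    (0, ["weather"]),
    (1, ["merger", "weather"]),
    (2, ["merger", "merger"]),
    (3, ["merger", "merger", "merger"]),
    (3, ["weather"]),
]
ranking = rank_documents(docs, query)
for doc_index, score in ranking:
    print(doc_index, round(score, 3))
```

In this toy collection the frequency of "merger" rises with the query, so documents dominated by that term rank first, while documents containing only "weather", whose frequency does not track the query, rank last. A realistic system would likely need smoothing, term weighting, and handling of lagged correlation, none of which this sketch attempts.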
