On the SPOT: Question Answering over Temporally Enhanced Structured Data

Web content increasingly reflects the current state of the physical and social world, manifested both in traditional news media sources and in user-generated publishing sites such as Twitter, Foursquare, and Facebook. At the same time, web searching increasingly reflects problems grounded in the real world. As a result of this blending of the web with the real world, we observe that the web, in both its composition and its use, has incorporated many of the dynamics of the real world. Few of the problems associated with searching dynamic collections, such as defining time-sensitive relevance, understanding user query behavior over time, and understanding why certain web content changes, are well understood. We believe that, just as static collections often benefit from modeling topics, dynamic collections will likely benefit from temporal modeling of events and of time-sensitive user interests and intents, which have rarely been addressed in the literature. There have been preliminary efforts in the research and industrial communities to address algorithms, architectures, evaluation methodologies, and metrics. We aim to bring together practitioners and researchers to discuss their recent breakthroughs and the challenges of addressing time-aware information access, from both the algorithmic and the architectural perspectives. This workshop is a successor to the SIGIR 2012 Workshop on Time Aware Information Access (#TAIA2012). Whereas the 2012 edition was the first to bring together a broad set of academic and industrial researchers around the topic of time-aware information access, this workshop focuses specifically on the many time-aware benchmarking activities that are ongoing in 2013.
