Combining Strategies for XML Retrieval

This paper describes Peking University's approaches to the Ad Hoc, Data Centric, and Relevance Feedback tracks. In the Ad Hoc track, results were submitted for four tasks: Efficiency, Restricted Focused, Relevance In Context, and Restricted Relevance In Context. To evaluate the relevance of documents to a given query, multiple strategies, including Two-Step retrieval, MAXLCA query results, BM25, distribution measurements, and a learning-to-optimize method, are combined to form a more effective search engine. In the Data Centric track, to obtain a set of closely related nodes that are collectively relevant to a given keyword query, we promote three factors: correlation, explicitness, and distinctiveness. In the Relevance Feedback track, to obtain useful information from feedback, our implementation employs two techniques: a revised Rocchio algorithm and criterion weight adjustment.
