Spatio‐temporal pseudo relevance feedback for scientific data retrieval

We consider the problem of searching for scientific data in vast, heterogeneous scientific data repositories. This problem is challenging because scientific data contain relatively little text compared with other search targets such as web pages. On the other hand, the metadata of scientific data contain other characteristic information, such as spatio-temporal information. Although this information could be used to improve search performance, many widely adopted scientific data search engines use it exclusively for narrowing down search results. In this paper, we propose a novel query generation method based on pseudo relevance feedback that uses spatial, temporal, and text information. The proposed method generates new spatio-temporal queries from the initial search results. These queries are then used to rerank the search results so that more relevant results rank higher. The experimental results show that the proposed method outperforms a baseline method when the search targets do not have rich text information. © 2016 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.
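The reranking idea described above can be illustrated with a minimal Python sketch. The abstract does not say how the spatio-temporal queries are built or scored, so everything specific here is an assumption made for illustration: the `Record` fields, the use of the centroid of the top-k initial results as the generated query, the exponential decay over distance, and the linear weight `alpha` are all hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass
from statistics import mean


@dataclass
class Record:
    """Hypothetical dataset record with an initial text score and spatio-temporal metadata."""
    id: str
    text_score: float  # score from the initial text search
    lat: float
    lon: float
    year: float


def rerank(results, k=2, alpha=0.5, space_scale=10.0, time_scale=5.0):
    """Rerank results with a spatio-temporal query derived from the top-k hits.

    Assumption: the top-k initial results are treated as relevant (pseudo
    relevance feedback), and their centroid in space and time serves as the
    generated query.
    """
    top = sorted(results, key=lambda r: r.text_score, reverse=True)[:k]
    q_lat = mean(r.lat for r in top)
    q_lon = mean(r.lon for r in top)
    q_year = mean(r.year for r in top)

    def st_score(r):
        # Similarity decays exponentially with distance from the generated query.
        d_space = math.hypot(r.lat - q_lat, r.lon - q_lon)
        d_time = abs(r.year - q_year)
        return math.exp(-d_space / space_scale) * math.exp(-d_time / time_scale)

    def combined(r):
        # Linear mix of the text score and the spatio-temporal score.
        return alpha * r.text_score + (1 - alpha) * st_score(r)

    return sorted(results, key=combined, reverse=True)


results = [
    Record("a", 0.9, 35.0, 135.0, 2010),
    Record("b", 0.8, 36.0, 136.0, 2011),
    Record("c", 0.5, -30.0, 20.0, 1980),   # better text match, but far away
    Record("d", 0.4, 35.5, 135.5, 2010),   # weak text match, close to the top hits
]
reranked = rerank(results)
print([r.id for r in reranked])
```

In this toy data, record `d` has the weakest text score but lies close to the top hits in space and time, so it moves ahead of record `c`. This mirrors the case the abstract highlights, where the search targets carry little text information.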
