Uniform Storage Model-based Update Scheme of On-line Information Retrieval System

To improve the retrieval performance of on-line information retrieval systems, this paper proposes an efficient index update scheme that provides better skipping and improves both space and time efficiency without inserting any additional auxiliary information. A link-based uniform storage model (USM) manages both short and long postings lists. A USM-based update scheme distinguishes long postings lists from short ones: short lists are merged immediately, while long lists are merged with an improved Y-limited contiguous multiple merge scheme, which balances the trade-off between time and space efficiency. The proposed update scheme covers updates at both the index level and the inverted list level. Detailed experimental results and comparisons with existing schemes show that, on average, the proposed scheme substantially reduces space cost, conjunctive Boolean query time, and the cost of on-line index construction.
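The short/long distinction described above can be illustrated with a toy sketch. It is not the paper's algorithm: the abstract does not define the USM layout or the details of the Y-limited contiguous multiple merge, so the class names, the `long_threshold` cutoff, and the `max_blocks` limit (a loose stand-in for Y) are all assumptions made for illustration. The sketch only shows the general idea that short lists are rewritten in place on every update, while long lists accumulate linked blocks and are consolidated once the number of blocks exceeds a limit.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PostingsList:
    # Hypothetical stand-in for a link-based list: a chain of contiguous blocks.
    blocks: List[List[int]] = field(default_factory=list)

    def size(self) -> int:
        return sum(len(block) for block in self.blocks)

    def postings(self) -> List[int]:
        # Follow the chain of blocks in order and flatten it.
        return [doc_id for block in self.blocks for doc_id in block]


class ToyIndex:
    def __init__(self, long_threshold: int = 4, max_blocks: int = 2) -> None:
        self.long_threshold = long_threshold  # assumed short/long cutoff
        self.max_blocks = max_blocks          # assumed stand-in for the limit Y
        self.lists: Dict[str, PostingsList] = {}

    def update(self, term: str, new_doc_ids: List[int]) -> None:
        plist = self.lists.setdefault(term, PostingsList())
        if plist.size() + len(new_doc_ids) <= self.long_threshold:
            # Short list: merge immediately into one contiguous block.
            plist.blocks = [sorted(set(plist.postings() + list(new_doc_ids)))]
        else:
            # Long list: append a new linked block instead of rewriting the list.
            plist.blocks.append(sorted(new_doc_ids))
            if len(plist.blocks) > self.max_blocks:
                # Too many blocks: consolidate them into one contiguous block.
                plist.blocks = [sorted(set(plist.postings()))]


index = ToyIndex(long_threshold=4, max_blocks=2)
index.update("cat", [1, 2])   # short: merged in place
index.update("cat", [3])      # still short: merged in place
index.update("cat", [4, 5])   # now long: second block is linked on
index.update("cat", [6])      # third block exceeds the limit: consolidated
```

A smaller block limit keeps lists more contiguous, which favours query time, at the price of more frequent rewrites during updates; a larger limit does the opposite. That is the kind of trade-off the abstract says the long-list merge scheme is meant to balance.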
