Exploiting structure for information retrieval

Structured elements are pervasive in digital libraries, product catalogs, scientific data collections, and on the Internet. One of our research aims is to investigate how the additional structure of a collection can be used to improve retrieval effectiveness. This paper reports on our experiments with three kinds of structure: manually assigned keywords in domain-specific collections, URL and link structure on the Internet, and XML structure in annotated scientific collections.
