Content and structure summarisation of XML documents for effective information access

Digital libraries and other information providers make extensive use of the XML standard when publishing information. One benefit of XML is that it exposes the logical structure of documents. Overviews of the logical structure, as well as of the content, of XML documents can be used to provide effective access to the information stored within digital library (DL) systems. In this paper, we describe three steps of exploratory research into the use of automatic summarisation of XML documents for providing effective information access: we investigate the usefulness of summarising the content of XML document elements, we examine the summarisation of the structure of XML documents by means of query-dependent tables of contents, and we describe our current work on estimating query-independent element features that can be used for generating generic summaries of document structure.
