XML retrieval: what to retrieve?

The fundamental difference between standard information retrieval and XML retrieval is the unit of retrieval. In traditional IR, the unit of retrieval is fixed: it is the complete document. In XML retrieval, every XML element in a document is a retrievable unit. This makes XML retrieval more difficult: besides being relevant, a retrieved unit should be neither too large nor too small. The research presented here, a comparative analysis of two approaches to XML retrieval, aims to shed light on which XML elements should be retrieved. The experimental evaluation uses data from the Initiative for the Evaluation of XML retrieval (INEX 2002).