XML Information Retrieval from Spoken Word Archives

In this paper, the XML information retrieval system PF/Tijah is applied to retrieval tasks on large spoken document collections. The example setting is the English CLEF-2006 CL-SR collection, together with the given English topics and self-produced Dutch topics. The main finding is that queries can easily be adapted to use different kinds and combinations of metadata. Furthermore, simple ways of combining different kinds of metadata are shown to be beneficial in terms of mean average precision.

Djoerd Hiemstra | Franciska de Jong | Roeland Ordelman | Robin Aly | Laurens van der Werff
