Twenty-One: a baseline for multilingual multimedia retrieval

This paper gives a short overview of the ideas underpinning the demonstrator developed within the EU-funded project Twenty-One. The system supports the disclosure of information in a heterogeneous document environment that includes documents of different types and languages. As part of the off-line document processing integrated in the system, noun phrases are extracted to build a phrase-based index. These noun phrases are the starting point both for the generation of a fuzzy phrase index and for the translation step needed to realise cross-language retrieval functionality.
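
To make the idea of a phrase-based index with a fuzzy counterpart concrete, the following is a minimal sketch and not the Twenty-One implementation. It assumes that noun phrases have already been extracted for each document, and it assumes character-trigram overlap (Jaccard similarity) as the fuzzy matching criterion; the abstract does not specify how fuzzy matching is done. The names `PhraseIndex`, `trigrams`, `exact` and `fuzzy`, the example documents and the threshold value are all hypothetical.

```python
from collections import defaultdict


def trigrams(phrase):
    """Return the set of character trigrams of a padded, lower-cased phrase."""
    padded = "  " + phrase.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


class PhraseIndex:
    """Toy phrase index with an exact and a trigram-based fuzzy lookup.

    Assumption: the trigram/Jaccard scheme is illustrative only.
    """

    def __init__(self):
        self.postings = defaultdict(set)     # phrase -> document ids
        self.trigram_map = defaultdict(set)  # trigram -> indexed phrases

    def add(self, doc_id, phrases):
        """Index the (already extracted) noun phrases of one document."""
        for phrase in phrases:
            key = phrase.lower()
            self.postings[key].add(doc_id)
            for gram in trigrams(key):
                self.trigram_map[gram].add(key)

    def exact(self, phrase):
        """Return the ids of documents containing exactly this phrase."""
        return set(self.postings.get(phrase.lower(), set()))

    def fuzzy(self, phrase, threshold=0.5):
        """Return (phrase, score) pairs with Jaccard overlap >= threshold."""
        query = trigrams(phrase)
        candidates = set()
        for gram in query:
            candidates |= self.trigram_map.get(gram, set())
        scored = []
        for cand in candidates:
            grams = trigrams(cand)
            score = len(query & grams) / len(query | grams)
            if score >= threshold:
                scored.append((cand, score))
        # Best match first; ties broken alphabetically for determinism.
        return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


index = PhraseIndex()
index.add("doc1", ["waste management", "environmental policy"])
index.add("doc2", ["waste disposal", "environmental policies"])

# A misspelled query still reaches the indexed phrases via the fuzzy lookup.
matches = index.fuzzy("enviromental policy")
documents = index.exact(matches[0][0]) if matches else set()
```

In this sketch the fuzzy lookup maps a query phrase onto indexed phrases, whose postings then give the matching documents. A cross-language variant would first translate the query phrase and then perform the same lookup; that step is omitted here.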