Evaluating different information retrieval algorithms on real-world data

More and more data is produced in the form of videos, which are opaque to textual queries. To allow searching in video data collections, two problems have to be solved: the automatic generation of a searchable index, and effective search in the automatically produced and therefore imperfect index. The ISL View4You system is a prototype video indexing and retrieval system which both generates the index and provides a search engine to access it. An end-to-end evaluation was carried out using real-world data and queries from naive subjects. From the results it can be concluded that the errors of the overall system are not due to the index generation, but are introduced by the information retrieval engine (the search). Therefore, the focus of this paper is a comparison of two different search algorithms, LSI (latent semantic indexing) and Okapi (a flavor of the classic vector model approach). The evaluation is carried out on the automatically produced index on a relatively small database, which allows for full manual relevance judgement.