Probabilistic Approaches to Video Retrieval

Our experiments for TRECVID 2004 further investigate the applicability of the so-called “Generative Probabilistic Models to video retrieval”. TRECVID 2003 results demonstrated that mixture models computed from video shot sequences improve the precision of “query by examples” results when compared to models computed from keyframes. This year, we extended these video models to capture more complex temporal events, by building generative probabilistic models from the shots using the full covariance matrix instead of a diagonal covariance matrix. Also, we improved upon the models of the associated textual data (from ASR and OCR) by introducing a multi-layered hierarchical language model. Finally, we tried to take advantage of the information in the audio channel. In the interactive experiments, we experimented with the automatic selection of the media representation (visual, textual or combined) that is most informative for answering the user’s information need.