Video indexing using multimodal information