Text, Speech, and Vision for Video Segmentation: The InformediaTM Project

We describe three technologies involved in creating a digital video library suitable for fullcontent search and retrieval. Image processing analyzes scenes, speech processing transcribes the audio signal, and natural language processing determines word relevance. The integration of these technologies enables us to include vast amounts of video data in the library.