Audio indexing technology for the exploration of audiovisual heritage collections

A number of techniques from the AI realm have proven to have added value for spoken document retrieval. Browsing tools for audio and/or video archives benefit not only from speech recognition, but also from techniques such as clustering, topic detection, speaker classification and segmentation. This paper discusses audio indexing tools that have been implemented for the disclosure of Dutch audiovisual cultural heritage collections, and analyzes, from a technological point of view, the specific requirements imposed by the nature and formats of the collections. Moreover, the paper argues that research is needed to cope with the varying information needs of different types of users.

The number of digital audio collections in the cultural heritage domain is growing rapidly. Whereas the growth of storage capacity is in accordance with widely acknowledged predictions, the possibilities to index and access these archives are lagging behind. As a result, particular information may only be accessible via manual browsing of a collection of files, which is extremely time-consuming. Recent years have shown that automatic speech recognition can successfully be deployed for equipping spoken-word collections with search functionality. This is especially the case in the broadcast news domain, where speech transcripts approximate the quality of manual transcripts for several languages. In other domains, similar recognition performance is usually harder to obtain due to (i) a lack of domain-specific training data and (ii) a large variability in audio quality, speech characteristics and the topics that are addressed. This applies to historical audio(visual) data in particular. The application of audio indexing to Dutch historical audio collections, however, may greatly improve their accessibility.