Automatic processing of broadcast audio in multiple languages

This paper addresses recent progress in large vocabulary continuous speech recognition (LVCSR) across multiple languages, which has made it possible to process broadcast audio for information access. At LIMSI, broadcast news transcription systems have been developed for seven languages. Automatic processing to access the content must take into account the characteristics of audio data, such as the need to handle a continuous data stream and an imperfect word transcription, as well as the characteristics of each language. Near-term applications include audio data mining, structuring of audiovisual archives, selective dissemination of information, and media monitoring.