Automatic Language Recognition on Spontaneous Speech: The ATVS-UAM System

Over the last decades, the need to search speech recordings quickly has grown rapidly. One of the crucial technologies for searching and further processing this material is automatic language recognition on spontaneous speech. This paper gives a brief introduction to the technology and presents the system submitted by ATVS-UAM to the 2007 International Competitive Language Recognition Evaluation (LRE), organized by the National Institute of Standards and Technology (NIST) of the United States. Like most state-of-the-art language recognition systems, the ATVS-UAM system fuses subsystems that operate at different levels of the speech signal, namely phonotactic and acoustic subsystems. The fusion is performed hierarchically, which avoids the need for a trained scheme for system combination. Moreover, the scores from the language detectors are delivered as calibrated log-likelihood ratios, which humans can interpret without further transformation. The system achieved an equal error rate (EER) of about 5% on 30-second conversational telephone speech segments in a closed-set condition with 14 languages, a reasonable performance level for real applications.
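As an illustration of the two quantities mentioned above, the following minimal Python sketch shows how a calibrated log-likelihood ratio can be read as a posterior probability once a prior is chosen, and how an equal error rate can be approximated from detector scores. It is not the ATVS-UAM implementation: the function names `llr_to_posterior` and `equal_error_rate` are hypothetical, the log-likelihood ratios are assumed to use the natural logarithm, and the EER is approximated by a simple threshold sweep over the observed scores.

```python
import numpy as np


def llr_to_posterior(llr, prior=0.5):
    """Posterior probability of the target language, given a natural-log
    likelihood ratio and an assumed prior probability for that language."""
    log_odds = llr + np.log(prior / (1.0 - prior))
    return 1.0 / (1.0 + np.exp(-log_odds))


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores
    and picking the point where miss and false-alarm rates are closest."""
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target, nontarget]))
    # Miss: a target trial scored below the threshold.
    miss = np.array([(target < t).mean() for t in thresholds])
    # False alarm: a non-target trial scored at or above the threshold.
    false_alarm = np.array([(nontarget >= t).mean() for t in thresholds])
    i = np.argmin(np.abs(miss - false_alarm))
    return (miss[i] + false_alarm[i]) / 2.0


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    print(llr_to_posterior(2.0, prior=1.0 / 14.0))
    print(equal_error_rate([1.0, 3.0], [0.0, 2.0]))
```

With a flat prior over the 14 languages of a closed-set condition, the prior for each detector would be 1/14; any other prior can be substituted without recalibrating the scores, which is the practical benefit of reporting calibrated log-likelihood ratios.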