Language-identification based on cross-language acoustic models and optimised information combination

This work is concerned with language-identification (LID). Two central issues are addressed. The first is to analyse the trade-off between detailed acoustic modelling and robust estimation of acoustic and language models. The second is to find the optimal combination of acoustic and language scores for language-identification.

Experiments are carried out using the three languages American-English, German and Spanish from the OGI-TS database. It is shown that, on average, the acoustic modelling recognises 46.3% of the phones correctly across the three languages. The insertion and deletion rates are 35.7% and 6.6%, respectively. Language-identification performance is 82.6% with the full set of acoustic models. The performance increases to 83.7% after 80 iterations of a hierarchical clustering in which phones are merged across the languages.

... decoding, the second transforms the parameters from the decoding module and classifies the language.

The common acoustic signal preprocessor calculates 12 RASTA-filtered MFCCs, their first derivatives and the delta-log-energy. The phone and language decoding module consists of three parallel branches. In each of these, the phone recogniser matches the acoustic parameters to the acoustic models used by that recogniser. The output from each recogniser is further matched against three language models.

The combined output X from all language models and from all recognisers is used as input to the ‘information combination and language-classification’ module (ICLC). This module applies a transformation to the parameters X and estimates the most probable language given the acoustic input.
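
The data flow from the three parallel branches to the ICLC module can be illustrated with a minimal sketch. It is not the system described here, and several parts are assumptions made only for illustration: the phone recognisers are replaced by ready-made phone strings, the language models are add-alpha smoothed bigrams, the acoustic scores are left out of X, and the ICLC transformation is taken to be linear with hand-set equal weights rather than optimised ones. All names (`BRANCHES`, `train_bigram_lm`, `score_vector`, `classify`) and the toy phone strings are hypothetical.

```python
import math
from collections import Counter

LANGUAGES = ["english", "german", "spanish"]   # the three OGI-TS languages used
BRANCHES = ["rec_1", "rec_2", "rec_3"]         # hypothetical names for the parallel recognisers


def train_bigram_lm(sequences, vocab, alpha=1.0):
    """Return a function giving the average add-alpha smoothed bigram log-probability."""
    pair_counts, context_counts = Counter(), Counter()
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            pair_counts[(prev, cur)] += 1
            context_counts[prev] += 1
    vocab_size = len(vocab)

    def avg_logprob(seq):
        total = 0.0
        for prev, cur in zip(seq, seq[1:]):
            num = pair_counts[(prev, cur)] + alpha
            den = context_counts[prev] + alpha * vocab_size
            total += math.log(num / den)
        return total / max(len(seq) - 1, 1)

    return avg_logprob


def score_vector(decoded, lms):
    """Stack every (recogniser, language model) score into one flat vector X."""
    return [lms[b][lang](decoded[b]) for b in BRANCHES for lang in LANGUAGES]


def classify(x, weights, bias):
    """Apply an (assumed) linear transformation to X and pick the best language."""
    scores = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, bias)]
    best = max(range(len(LANGUAGES)), key=lambda i: scores[i])
    return LANGUAGES[best]


# Toy data: made-up phone strings, not taken from the paper.
vocab = ["a", "b", "c"]
toy_training = {"english": ["ababab"], "german": ["bcbcbc"], "spanish": ["acacac"]}
lms = {b: {lang: train_bigram_lm(toy_training[lang], vocab) for lang in LANGUAGES}
       for b in BRANCHES}

# Pretend each recogniser decoded the same phone string for the test utterance.
decoded = {b: "abab" for b in BRANCHES}
x = score_vector(decoded, lms)   # 3 recognisers x 3 language models = 9 scores

# Equal-weight combination: each language sums its own LM score over the branches.
weights = [[1.0 if j % len(LANGUAGES) == i else 0.0 for j in range(len(x))]
           for i in range(len(LANGUAGES))]
bias = [0.0] * len(LANGUAGES)
predicted = classify(x, weights, bias)
print(predicted)
```

With three recognisers and three language models per recogniser, X holds nine scores in this sketch. In a trained system the weights and bias would be estimated from data instead of being fixed by hand, which is where the choice of information combination enters.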