Text-independent language recognition system using DHMM with new features

Spoken language identification (LID) is the task of recognizing the language of an unknown speech utterance. This paper describes a text-independent language recognition system that uses new features derived from the MFCC features of the speech signal, together with a common codebook and discrete hidden Markov models (DHMMs), to achieve good LID performance with less computation time than the state-of-the-art phone-based systems available in the literature. In this work, the MFCC feature vectors of the speech signal are transformed into new feature vectors. The LID approach comprises generating a common codebook from the new features and training one DHMM per language. The experiments are carried out on the OGI database and on an Indian language database consisting of six languages: Telugu, Tamil, Hindi, Marathi, Malayalam and Kannada.
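The recognition stage of the approach outlined above can be illustrated with a minimal sketch. It assumes the feature vectors have already been transformed (the transformation itself is not specified here), and it omits codebook generation and DHMM training. Each vector is mapped to its nearest entry in a shared codebook, the resulting symbol sequence is scored against each language's DHMM with the scaled forward algorithm, and the highest-scoring language is selected. The function names, the two-entry codebook, the labels `lang_a` and `lang_b`, and all probabilities are hypothetical values chosen for illustration, not taken from the paper.

```python
import math

def quantize(frames, codebook):
    """Map each feature vector to the index of its nearest codeword (squared Euclidean distance)."""
    symbols = []
    for frame in frames:
        dists = [sum((a - b) ** 2 for a, b in zip(frame, code)) for code in codebook]
        symbols.append(dists.index(min(dists)))
    return symbols

def log_likelihood(symbols, start, trans, emit):
    """Scaled forward algorithm: returns log P(symbols | discrete HMM)."""
    n_states = len(start)
    alpha = []
    log_prob = 0.0
    for t, sym in enumerate(symbols):
        if t == 0:
            alpha = [start[s] * emit[s][sym] for s in range(n_states)]
        else:
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n_states)) * emit[s][sym]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid numerical underflow
    return log_prob

def identify(frames, codebook, models):
    """Return the best-scoring language label and the per-language log-likelihoods."""
    symbols = quantize(frames, codebook)
    scores = {lang: log_likelihood(symbols, *params) for lang, params in models.items()}
    return max(scores, key=scores.get), scores

# Hypothetical toy setup: a 2-entry shared codebook and two 2-state DHMMs.
codebook = [[0.0, 0.0], [1.0, 1.0]]
start = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
models = {
    "lang_a": (start, trans, [[0.9, 0.1], [0.8, 0.2]]),  # mostly emits symbol 0
    "lang_b": (start, trans, [[0.1, 0.9], [0.2, 0.8]]),  # mostly emits symbol 1
}
frames = [[0.1, 0.0], [0.2, -0.1], [0.0, 0.1], [0.9, 1.0]]
best, scores = identify(frames, codebook, models)
print(best, scores)
```

Because all languages share one codebook, an utterance is quantized only once and each additional language costs a single forward pass over the symbol sequence. This is one plausible reading of where the reduced computation time comes from.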