Development of an automatic identification system of spoken languages: Phase I

This pilot study examines the feasibility of a new approach to automatic language identification. The procedure applies pattern analysis techniques to features extracted from the speech signal. The database of extracted features for five speakers from each of eight languages was divided into a learning subset and an evaluation subset. A potential function was then generated for all features in the learning subset. The complexity of the decision function was systematically increased until all members of the learning subset could be separated into their correct languages. Although the constraints on this pilot study necessarily precluded feature ordering and selection, applying the decision function to the evaluation subset resulted in an overall classification accuracy of 84%.
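The training loop described above can be illustrated with a minimal sketch. The abstract does not state the form of the potential function or how complexity was increased, so everything here is an assumption made for illustration: a polynomial potential `(1 + x·y)^degree`, a perceptron-style correction rule, the polynomial degree as the measure of complexity, and a toy two-class data set standing in for the speech features. All function names are hypothetical.

```python
# Illustrative sketch only: the polynomial potential, the correction rule,
# and the toy data are assumptions, not the method reported in the study.

def poly_potential(x, y, degree):
    """Assumed potential function: polynomial of the inner product."""
    return (1.0 + sum(a * b for a, b in zip(x, y))) ** degree


def train(samples, labels, degree, max_epochs=100):
    """Accumulate per-class potentials; return a predictor, or None if the
    learning subset is not fully separated within max_epochs passes."""
    classes = sorted(set(labels))
    # weights[c][i] is the net number of corrections placed at sample i for class c
    weights = {c: [0.0] * len(samples) for c in classes}

    def score(c, x):
        return sum(w * poly_potential(x, s, degree)
                   for w, s in zip(weights[c], samples) if w)

    def predict(x):
        # Ties go to the first class in sorted order.
        return max(classes, key=lambda c: score(c, x))

    for _ in range(max_epochs):
        errors = 0
        for i, (x, y) in enumerate(zip(samples, labels)):
            guess = predict(x)
            if guess != y:
                weights[y][i] += 1.0      # raise the correct class potential
                weights[guess][i] -= 1.0  # lower the wrongly chosen one
                errors += 1
        if errors == 0:
            return predict
    return None


def fit_with_increasing_complexity(samples, labels, max_degree=5):
    """Raise the degree until every learning sample is classified correctly."""
    for degree in range(1, max_degree + 1):
        predict = train(samples, labels, degree)
        if predict is not None:
            return degree, predict
    raise ValueError("learning subset not separated up to max_degree")


# Toy learning subset (hypothetical): not separable by a degree-1 function.
samples = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
labels = ["A", "A", "B", "B"]
degree, predict = fit_with_increasing_complexity(samples, labels)
```

In an evaluation like the one reported, the returned `predict` function would then be applied to the held-out evaluation subset and the fraction of correct labels counted.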