Automatic language recognition using acoustic features

Two language recognition algorithms are proposed and experimental results are described. While speech recognition has been studied extensively, few studies have addressed the language recognition task. The speech data cover 20 languages, with 16 sentences uttered twice by 4 male and 4 female speakers. Each sentence lasts about 8 seconds. The first algorithm is based on the standard vector quantization (VQ) technique: every language is characterized by its own VQ codebook. The second algorithm is based on a single universal (common) VQ codebook shared by all languages and on histograms of the occurrence probabilities of its codewords: every language is characterized by a histogram. The experimental results show that the recognition rates for the first and second algorithms were 65% and 80%, respectively, each using just 8 sentences of unknown speech (about 64 seconds).
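The two decision rules can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the acoustic features, the distance measure, the codebook sizes, or the histogram scoring rule, so the sketch assumes plain k-means codebooks, squared Euclidean distance, a log-likelihood score over codeword indices, and synthetic Gaussian "frames" standing in for real speech features. All function names (`train_codebook`, `recognize_by_codebook`, `recognize_by_histogram`, and so on) are hypothetical.

```python
import numpy as np

def sq_dists(frames, codebook):
    # Squared Euclidean distance from every frame to every codeword (an assumed measure).
    return ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)

def nearest(frames, codebook):
    # Index of the closest codeword for each frame.
    return sq_dists(frames, codebook).argmin(axis=1)

def train_codebook(frames, size, iters=20, seed=0):
    # Plain k-means (Lloyd iterations); the paper's VQ training procedure is not given.
    rng = np.random.default_rng(seed)
    codebook = frames[rng.choice(len(frames), size, replace=False)].copy()
    for _ in range(iters):
        labels = nearest(frames, codebook)
        for k in range(size):
            members = frames[labels == k]
            if len(members):  # keep the old codeword if its cell is empty
                codebook[k] = members.mean(axis=0)
    return codebook

def recognize_by_codebook(frames, codebooks):
    # Algorithm 1 (sketch): pick the language whose own codebook
    # quantizes the unknown speech with the lowest average distortion.
    def distortion(lang):
        return sq_dists(frames, codebooks[lang]).min(axis=1).mean()
    return min(codebooks, key=distortion)

def histogram(frames, universal, floor=1e-3):
    # Occurrence probabilities of the universal codewords for one language.
    # The small floor (an assumption) avoids log(0) for unseen codewords.
    counts = np.bincount(nearest(frames, universal), minlength=len(universal)) + floor
    return counts / counts.sum()

def recognize_by_histogram(frames, universal, histograms):
    # Algorithm 2 (sketch): quantize with the universal codebook and score each
    # language by the log-likelihood of the observed codeword indices (assumed rule).
    labels = nearest(frames, universal)
    return max(histograms, key=lambda lang: np.log(histograms[lang][labels]).sum())

# Synthetic stand-in data: two hypothetical "languages" with shifted feature means.
rng = np.random.default_rng(1)
def fake_frames(center, n=400, dim=4):
    return rng.normal(loc=center, scale=1.0, size=(n, dim))

train = {"A": fake_frames(0.0), "B": fake_frames(3.0)}
test_a, test_b = fake_frames(0.0, n=100), fake_frames(3.0, n=100)

# Algorithm 1: one codebook per language.
codebooks = {lang: train_codebook(x, size=8) for lang, x in train.items()}

# Algorithm 2: one universal codebook plus one histogram per language.
universal = train_codebook(np.vstack(list(train.values())), size=16)
hists = {lang: histogram(x, universal) for lang, x in train.items()}

print(recognize_by_codebook(test_a, codebooks))
print(recognize_by_histogram(test_b, universal, hists))
```

In this sketch the first rule stores one codebook per language, while the second stores only a histogram per language over a shared codebook, so the unknown speech is quantized once regardless of the number of languages.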
