Automatic dialect identification of extemporaneous conversational, Latin American Spanish speech

We describe a dialect identification technique that takes extemporaneous, conversational Latin American Spanish speech as input and outputs a hypothesis of the speaker's dialect. The system has been trained to recognize the Cuban and Peruvian dialects of Spanish, but it could be extended easily to other dialects (and languages). Building on our experience in automatic language identification, the dialect-ID system uses an English phone recognizer trained on the TIMIT corpus to tokenize training speech spoken in each Spanish dialect. Phonotactic language models generated from this tokenized training speech are then used during testing to compute dialect likelihoods for each unknown message. The system has an error rate of 16% on the Cuban/Peruvian two-alternative forced-choice test. We also introduce the new "Miami" Latin American Spanish speech corpus, which can support our future research efforts.
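The scoring step described above (phone tokenization, one phonotactic language model per dialect, then a likelihood comparison) can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the phone recognizer is omitted and its output is replaced by invented toy phone sequences, the language models are plain bigram models, and add-one smoothing is used for unseen bigrams. The function names `train_bigram_model`, `log_likelihood`, and `identify_dialect` are hypothetical.

```python
import math
from collections import Counter


def train_bigram_model(sequences):
    """Count phone bigrams and their left contexts over tokenized utterances."""
    contexts = Counter()
    bigrams = Counter()
    vocab = set()
    for seq in sequences:
        # Pad with start/end markers so utterance boundaries are modeled too.
        padded = ["<s>"] + list(seq) + ["</s>"]
        vocab.update(padded)
        for prev, cur in zip(padded, padded[1:]):
            contexts[prev] += 1
            bigrams[(prev, cur)] += 1
    return {"contexts": contexts, "bigrams": bigrams, "vocab": vocab}


def log_likelihood(model, seq, vocab_size):
    """Log-likelihood of a phone sequence under an add-one-smoothed bigram model.

    Add-one smoothing is an assumption of this sketch, not the paper's method.
    """
    padded = ["<s>"] + list(seq) + ["</s>"]
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        numerator = model["bigrams"][(prev, cur)] + 1
        denominator = model["contexts"][prev] + vocab_size
        total += math.log(numerator / denominator)
    return total


def identify_dialect(models, seq):
    """Return the dialect whose model gives the highest log-likelihood, plus all scores."""
    vocab_size = len(set().union(*(m["vocab"] for m in models.values())))
    scores = {name: log_likelihood(m, seq, vocab_size) for name, m in models.items()}
    return max(scores, key=scores.get), scores


# Invented phone strings standing in for the recognizer's output; these are
# NOT drawn from the Miami corpus or from any real recognizer.
training = {
    "cuban": [["eh", "hh", "t", "aa"], ["l", "ow", "hh", "d", "ow", "hh"]],
    "peruvian": [["eh", "s", "t", "aa"], ["l", "ow", "s", "d", "ow", "s"]],
}
models = {name: train_bigram_model(seqs) for name, seqs in training.items()}

unknown = ["eh", "s", "t", "ow", "s"]
label, scores = identify_dialect(models, unknown)
print(label, scores)
```

With only two dialect models, picking the higher-scoring model corresponds to the two-alternative forced-choice decision mentioned in the abstract. A real system would train on many hours of tokenized speech and would likely use a more careful smoothing or interpolation scheme.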
