Recent Advances in Cantonese Speech Recognition

This paper describes our recent work on automatic recognition of Cantonese. Cantonese is one of the major Chinese dialects, spoken by tens of millions of people in Southern China and Hong Kong. For isolated Cantonese syllables, a neural-network-based recognition algorithm has been successfully developed, and the most up-to-date recognition results are presented. For continuous Cantonese speech, the problem of tone recognition is tackled first, using a multi-layer perceptron (MLP) network. A baseline continuous speech recognition system is then built by integrating phone-based hidden Markov models (HMMs) with the MLP tone recognizer. A number of important design issues, including speech corpus design, language modeling and Chinese character access, are also discussed.