A systematic strategy for robust automatic dialect identification

Automatic dialect classification is important for speech-based human-computer interfaces and consumer electronic products. Although many studies have been performed under ideal conditions, little work has addressed noisy or small data corpora, both of which are critical to the viability of a dialect identification system. This paper investigates a series of strategies for dialect classification on small and noisy datasets. A novel hierarchical universal background model is proposed to address the problem of limited training data. To address noise, we introduce the use of perceptual minimum variance distortionless response (PMVDR) in combination with the shifted delta cepstral (SDC) algorithm. Rotation forest is also explored to further improve system performance. Compared with the baseline system, the best proposed system shows relative gains of 31.8% and 28.7% in the worst noisy condition and the clean condition, respectively, on a small dataset.
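
For readers unfamiliar with SDC, the following is a minimal sketch of the standard shifted delta cepstra computation, in which each frame is extended with k delta blocks spaced P frames apart, each delta taken over a spread of d frames. The function name, the default parameters (d=1, P=3, k=7) and the edge-clamping of out-of-range frames are illustrative assumptions, not the configuration used in this paper, and the input may be any per-frame cepstral vectors (PMVDR-based or otherwise).

```python
def shifted_delta_cepstra(frames, d=1, p=3, k=7):
    """Return one stacked SDC vector per input frame.

    frames: list of equal-length cepstral vectors (lists of floats).
    d: delta spread, p: shift between blocks, k: number of stacked blocks.
    These defaults are illustrative assumptions, not the paper's settings.
    """
    n_frames = len(frames)
    last = n_frames - 1

    def frame_at(t):
        # Clamp indices that fall outside the utterance to the nearest edge
        # (an assumed padding choice; other schemes drop such frames).
        return frames[min(max(t, 0), last)]

    sdc = []
    for t in range(n_frames):
        stacked = []
        for i in range(k):
            ahead = frame_at(t + i * p + d)
            behind = frame_at(t + i * p - d)
            # Block i: c(t + i*p + d) - c(t + i*p - d), coefficient-wise.
            stacked.extend(a - b for a, b in zip(ahead, behind))
        sdc.append(stacked)
    return sdc


# Toy input: 30 frames of 2 coefficients that grow linearly with time.
toy_frames = [[float(t), 2.0 * t] for t in range(30)]
toy_sdc = shifted_delta_cepstra(toy_frames)
```

With N cepstral coefficients per frame, each output vector has N·k entries, so the stacked blocks capture temporal context spanning roughly k·P frames.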