Dialect identification: Impact of differences between read versus spontaneous speech

Automatic Dialect Classification (ADC) has recently gained substantial interest in the field of speech processing. Dialects of a language are normally reflected in their phoneme space, word pronunciation and selection, and prosodic traits. These traits are clearly visible in natural, spontaneous speaker-to-speaker conversations. However, dialect cues in prompted/read speech are often neglected by the community. In this study, we systematically assess the differences between the acoustic characteristics of spontaneous and read speech and their effects on dialect identification performance. By examining both the model space and the phoneme space of read and spontaneous dialect speech, we observe that the two speaking styles span different dialect spaces, each with distinct characteristics that need to be addressed separately. From this comparison, we find useful clues for designing more efficient identification systems. Finally, we propose a novel feature extraction technique, PMVDR-SDC, and obtain a 26.4% relative improvement in dialect recognition rate.