Analyzing and Interpreting Automatically Learned Rules Across Dialects

In this paper, we demonstrate how informative dialect recognition systems such as the acoustic pronunciation model (APM) help speech scientists locate and analyze phonetic rules efficiently. In particular, we analyze dialect-specific characteristics that the APM learns automatically across two American English dialects. We show that unsupervised rule retrieval performs similarly to supervised retrieval, indicating that the APM is useful in practical applications, where word transcripts are often unavailable. We also demonstrate that the top-ranking rules learned by the APM generally agree with the linguistic literature and can even pinpoint potential research directions for refining existing knowledge. Thus, the APM system can help phoneticians analyze rules efficiently: by characterizing large amounts of data to postulate rule candidates, it frees their time for more targeted investigations. Potential applications of informative dialect recognition systems include forensic phonetics and the diagnosis of spoken language disorders.
