Native vs. non-native accent identification using Japanese spoken telephone numbers

In forensic investigations, it would be helpful to identify a speaker's native language from the sound of their speech. Previous research on foreign accent identification suggests that identification accuracy can be improved by using linguistic forms in which non-native characteristics are reflected. This study investigates how native and non-native speakers of Japanese differ in reading Japanese telephone numbers, which have a specific prosodic structure called a bipodic template. Spoken Japanese telephone numbers were recorded from native speakers and from Chinese and Korean learners of Japanese. Twelve utterances were obtained from each speaker, and their F0 contours were compared between native and non-native speakers. All native speakers realised the prosodic pattern of the bipodic template while reading the telephone numbers, whereas non-native speakers did not. The metric rhythm and segmental properties of the speech samples were also analysed, and a foreign accent identification experiment was carried out using six acoustic features. Using logistic regression, this method yielded an 81.8% correct identification rate, which is slightly better than that achieved in other studies. Discrimination accuracy between native and non-native accents was better than 90%, although discrimination between the two non-native accents was less successful. A perceptual accent identification experiment was also conducted to compare automatic and human identification. The results revealed that human listeners were better than the automatic method at discriminating between native and non-native speakers, but worse at identifying the specific foreign accent.
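The abstract reports two figures: a three-way identification rate (native, Chinese-accented, Korean-accented) and a native vs. non-native discrimination rate. These can differ considerably when most errors are confusions between the two non-native groups. The following minimal sketch illustrates the arithmetic under stated assumptions: the confusion counts are hypothetical and are not the study's data, and the function names are invented for illustration.

```python
# Hypothetical confusion counts: outer key = true group, inner key = predicted group.
# Illustrative only; these are NOT the counts reported in the study.
confusion = {
    "native":  {"native": 9, "chinese": 1, "korean": 0},
    "chinese": {"native": 1, "chinese": 6, "korean": 3},
    "korean":  {"native": 0, "chinese": 4, "korean": 6},
}


def total_count(cm):
    """Total number of classified utterances."""
    return sum(sum(row.values()) for row in cm.values())


def identification_accuracy(cm):
    """Three-way rate: the predicted group must equal the true group."""
    return sum(cm[g][g] for g in cm) / total_count(cm)


def native_vs_nonnative_accuracy(cm, native="native"):
    """Binary rate: only the native / non-native distinction must be right."""
    correct = 0
    for true_group, row in cm.items():
        for predicted_group, n in row.items():
            if (true_group == native) == (predicted_group == native):
                correct += n
    return correct / total_count(cm)


def nonnative_pair_accuracy(cm, native="native"):
    """Rate among non-native utterances that were also predicted as non-native."""
    cells = [
        (true_group, predicted_group, n)
        for true_group, row in cm.items() if true_group != native
        for predicted_group, n in row.items() if predicted_group != native
    ]
    return sum(n for t, p, n in cells if t == p) / sum(n for _, _, n in cells)


if __name__ == "__main__":
    print(f"three-way identification: {identification_accuracy(confusion):.3f}")
    print(f"native vs. non-native:    {native_vs_nonnative_accuracy(confusion):.3f}")
    print(f"Chinese vs. Korean:       {nonnative_pair_accuracy(confusion):.3f}")
```

With these made-up counts the three-way rate is 0.70 while the native vs. non-native rate is about 0.93, which mirrors the qualitative pattern described above: collapsing the two non-native groups removes the errors that arise from confusing them with each other.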
