Swiss French Regional Accent Identification

This paper attempts to automatically identify a speaker's regional Swiss French accent among those of four regions of Switzerland: Geneva (GE), Martigny (MA), Neuchâtel (NE) and Nyon (NY). To this end, we rely on a generative probabilistic classification framework based on Gaussian mixture modelling (GMM). Two GMM-based algorithms are investigated: (1) the baseline technique of universal background modelling (UBM) followed by maximum a posteriori (MAP) adaptation, and (2) total variability (i-vector) modelling. Both systems perform well; the i-vector-based system outperforms the baseline, achieving a relative improvement of 17.1% in overall regional accent identification accuracy.
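To make the baseline concrete, the following is a minimal sketch of UBM-plus-MAP accent identification, not the system evaluated in the paper. It assumes diagonal covariances, mean-only adaptation and a relevance factor of 16 (a common choice, not stated in the abstract). The two-component UBM, the toy 2-D frames and all function names are hypothetical and chosen only for illustration.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def component_logs(x, weights, means, variances):
    """Log of (weight * density) for every mixture component."""
    return [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]


def logsumexp(values):
    """Numerically stable log(sum(exp(values)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation of a UBM towards the given frames."""
    k, d = len(weights), len(means[0])
    counts = [0.0] * k                     # soft frame counts per component
    sums = [[0.0] * d for _ in range(k)]   # posterior-weighted frame sums
    for x in frames:
        logs = component_logs(x, weights, means, variances)
        norm = logsumexp(logs)
        for c in range(k):
            post = math.exp(logs[c] - norm)
            counts[c] += post
            for j in range(d):
                sums[c][j] += post * x[j]
    # Equivalent to alpha * data_mean + (1 - alpha) * ubm_mean,
    # with alpha = count / (count + relevance).
    return [
        [(sums[c][j] + relevance * means[c][j]) / (counts[c] + relevance)
         for j in range(d)]
        for c in range(k)
    ]


def avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a GMM."""
    total = sum(logsumexp(component_logs(x, weights, means, variances))
                for x in frames)
    return total / len(frames)


def identify(frames, accent_means, weights, variances):
    """Return the accent whose adapted GMM scores the frames highest."""
    return max(
        accent_means,
        key=lambda a: avg_loglik(frames, weights, accent_means[a], variances),
    )


# Hypothetical UBM (in practice trained on pooled data from all accents).
ubm_weights = [0.5, 0.5]
ubm_means = [[0.0, 0.0], [4.0, 4.0]]
ubm_vars = [[1.0, 1.0], [1.0, 1.0]]

# Toy training frames for two of the four accent labels.
train = {
    "GE": [[0.5, 0.6], [0.4, 0.5], [4.5, 4.4], [4.6, 4.5]],
    "NE": [[-0.5, -0.4], [-0.6, -0.5], [3.5, 3.6], [3.4, 3.5]],
}
accent_means = {
    accent: map_adapt_means(frames, ubm_weights, ubm_means, ubm_vars)
    for accent, frames in train.items()
}

print(identify([[0.5, 0.5], [4.5, 4.5]], accent_means, ubm_weights, ubm_vars))
```

Only the means are adapted here, so every accent model shares the UBM weights and variances; this keeps the models comparable when little data per accent is available. The i-vector system in the abstract replaces this per-accent scoring with a low-dimensional representation of each utterance and is not sketched here.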
