Paper Information - Speaker Discrimination Based on a Fusion Between Neural and Statistical Classifiers


Speaker discrimination consists of checking whether two (or more) speech segments belong to the same speaker. In this framework, we propose a new approach to speaker discrimination based on the fusion of a neural network (NN) classifier and a statistical classifier. The fusion is performed in two ways: first, by combining the scores of the individual classifiers, weighted by confidence coefficients; second, by using the scores of the statistical classifier as an additional input to the Multi-Layer Perceptron (MLP), in order to optimize the NN training (hybrid model).
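
As a rough illustration of the two fusion schemes mentioned in the abstract, the sketch below shows one possible reading of each. It is not the authors' implementation: the function names (`fuse_scores`, `same_speaker`, `hybrid_input`), the normalized weighted average, the assumption that both scores lie in [0, 1], and the 0.5 decision threshold are all hypothetical choices made for the example.

```python
from typing import List, Sequence


def fuse_scores(nn_score: float, stat_score: float,
                nn_conf: float, stat_conf: float) -> float:
    """Score-level fusion: confidence-weighted average of two scores.

    Assumption: both scores are already on a common [0, 1] scale and the
    confidence coefficients are non-negative weights chosen beforehand.
    """
    total = nn_conf + stat_conf
    if total <= 0:
        raise ValueError("confidence coefficients must sum to a positive value")
    return (nn_conf * nn_score + stat_conf * stat_score) / total


def same_speaker(fused_score: float, threshold: float = 0.5) -> bool:
    """Decide 'same speaker' when the fused score reaches the threshold.

    The threshold value is an assumption, not taken from the paper.
    """
    return fused_score >= threshold


def hybrid_input(features: Sequence[float], stat_score: float) -> List[float]:
    """Hybrid scheme: append the statistical score to the MLP input vector."""
    return list(features) + [stat_score]


if __name__ == "__main__":
    fused = fuse_scores(nn_score=0.8, stat_score=0.4, nn_conf=3.0, stat_conf=1.0)
    print(same_speaker(fused))
    print(hybrid_input([0.1, 0.2], stat_score=0.4))
```

In the first scheme the two classifiers are trained independently and only their outputs are combined; in the second, the statistical score becomes one more feature that the MLP sees during training.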

Siham Ouamour | Halim Sayoud
