Paper Information - Fusion of Face and Speech Features with Artificial Neural Network for Speaker Authentication


Abstract Biometric person identity authentication is attracting growing attention. The authentication task performed by an expert is a binary classification problem: accept or reject an identity claim. Combining experts, each based on a different modality (speech, face, fingerprint, etc.), increases the performance and robustness of identity authentication systems. In this context, a key issue is how to fuse the different experts to reach a final decision (i.e., accept or reject the identity claim). An automatic speaker authentication system based solely on speech or on faces often cannot meet system performance requirements. We have developed a prototype biometric system that integrates faces and speech utterances. The system overcomes the limitations of both face recognition systems and speech-based verification systems. The work is divided into three parts: first, extraction of speech parameters; second, extraction of image parameters; and finally, an artificial neural network (ANN) model, simulated in MATLAB version 7.0.1, that fuses the speech and image parameters to authenticate the speaker. The ANN model is trained with the well-known back-propagation algorithm.

Anupam Shukla | Ritu Tiwari
