Numerous biometric modalities have emerged recently as researchers strive for higher security standards in applications that use biometrics to grant users access in critical environments. Gait, thermograms, near-infrared images, smile recognition, lip-movement recognition, thermal palm, finger knuckle, finger veins, nail ID, skin spectroscopy, electrocardiogram, dental biometrics and DNA are some of the biometric techniques in recent use. Speech is the simplest non-intrusive, unimodal biometric trait. However, acoustic degradation, background noise, channel noise, and the health and ageing of the individual all affect voice quality, which reduces the efficiency of a speaker recognition system. Hence, pre-processing and voice-enhancement techniques need to be considered. The vulnerability arises when the testing and enrolment conditions are not perfectly matched, which is often the case in real-time environments. Degraded speech signals therefore limit the effectiveness of speaker identification and verification. In the proposed work, speaker identification is carried out using gammatone frequency cepstral coefficients (GFCC), as these have been found to be more robust to noise than other popular speech features such as Mel-frequency cepstral coefficients (MFCC). To partition the feature space into smaller regions, the features are clustered using Gaussian mixtures, which capture the combined feature-set properties of a speech sample. These features are then fed to a pattern-recognition neural network, which assigns each sample to a particular speaker. Experimental results show 92% accuracy on the collected dataset.
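As a rough illustration of the GFCC front end mentioned in the abstract, the sketch below computes gammatone-style cepstral coefficients with NumPy only. It is not the authors' implementation: it approximates the gammatone filterbank in the frequency domain (weights applied to each frame's power spectrum) rather than filtering in the time domain, and every function name and parameter value (32 filters, 13 coefficients, 25 ms frames with a 10 ms hop at 16 kHz, cubic-root compression, a 50 Hz lowest centre frequency) is an assumption chosen for illustration. The Gaussian-mixture clustering and the neural-network classifier are not shown.

```python
import numpy as np


def erb_center_freqs(low_hz, high_hz, n_filters):
    """Centre frequencies equally spaced on the ERB-rate scale (Glasberg & Moore)."""
    def to_erb(f):
        return 21.4 * np.log10(1.0 + 0.00437 * f)

    def from_erb(e):
        return (10.0 ** (e / 21.4) - 1.0) / 0.00437

    return from_erb(np.linspace(to_erb(low_hz), to_erb(high_hz), n_filters))


def gammatone_power_weights(center_hz, n_fft, sample_rate, order=4):
    """Approximate power response of each gammatone filter on the rfft bins.

    Uses |H(f)|^2 ~ (1 + ((f - fc) / b)^2)^(-order) with b = 1.019 * ERB(fc).
    """
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    bandwidth = 1.019 * 24.7 * (4.37 * center_hz / 1000.0 + 1.0)
    offset = (freqs[None, :] - center_hz[:, None]) / bandwidth[:, None]
    return (1.0 + offset ** 2) ** (-order)


def gfcc(signal, sample_rate, n_filters=32, n_coeffs=13, frame_len=400, hop=160):
    """Return an (n_frames, n_coeffs) array of simplified GFCC features.

    Assumes len(signal) >= frame_len; all defaults are illustrative.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // hop

    # Slice the signal into overlapping frames and apply a Hamming window.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)

    # Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n=frame_len, axis=1)) ** 2

    # Gammatone-weighted band energies, one column per filter.
    centers = erb_center_freqs(50.0, sample_rate / 2.0, n_filters)
    weights = gammatone_power_weights(centers, frame_len, sample_rate)
    energies = power @ weights.T

    # Cubic-root compression, then an unnormalised DCT-II to decorrelate bands.
    compressed = np.cbrt(energies)
    k = np.arange(n_coeffs)[:, None]
    m = np.arange(n_filters)[None, :]
    dct_basis = np.cos(np.pi * k * (2 * m + 1) / (2.0 * n_filters))
    return compressed @ dct_basis.T
```

Calling `gfcc(samples, 16000)` on a mono waveform yields one feature vector per frame; in a pipeline like the one described, these per-frame vectors would be the input to the Gaussian-mixture stage.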