Real Time Implementation of Speaker Recognition System with MFCC and Neural Networks on FPGA

Background: Speaker recognition systems play a pivotal role in forensics, security, and biometric authentication, where a speaker must be verified or identified from a group of speakers. Methods: This paper briefly describes the development of a hardware-based speaker recognition system. Mel Frequency Cepstral Coefficients (MFCC) are extracted from the input speech signal, mapping its spectrum onto the mel scale, which is approximately linear at low frequencies and logarithmic at higher frequencies. A perceptron neural network provides the layer weights used to verify the speaker's identity by comparing the output against a database of stored speaker identities. Findings: The input speech is split into frames by blocking and windowed to reduce noise, and the resulting audio samples are stored in RAM. The sampled data is converted to the frequency domain using the FFT to obtain the cepstral coefficients, which are normalised and fed to the Neural Network Toolbox in MATLAB to obtain the layer weights for the given data set; the output is then compared with the saved speaker identities to find a match. The decision-making logic runs on the NIOS II processor of the FPGA, where the extracted input features are compared with the existing database of speaker identities using the perceptron layer weights, yielding the closest match among the group of speakers. The system was tested with two reference speakers, whose spoken vowels were compared with the speaker database already stored in the FPGA. Conclusion/Improvements: The probability of detecting the speaker is 80%, and speaker verification is more accurate in the hardware-based system than in software-based systems, whose performance is lower. The performance of the designed system can be improved by retraining the neural network, which can raise the detection rate to nearly 90%.
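The processing chain summarised above (frame blocking, windowing, frequency-domain conversion, cepstral coefficients, normalisation, perceptron decision) can be illustrated with a minimal software sketch. It is not the paper's MATLAB or NIOS II code. The 8 kHz sampling rate, 256-sample frames with 50% overlap, Hamming window, 12 mel filters, 8 coefficients, and all function names are assumptions chosen for illustration. A naive DFT stands in for the FFT, and the perceptron weights are placeholders for values that would come from training.

```python
import cmath
import math


def frame_signal(samples, frame_len, hop):
    """Frame blocking: split the signal into overlapping frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def hamming(frame):
    """Apply a Hamming window (assumed window type) to one frame."""
    n = len(frame)
    return [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
            for i, x in enumerate(frame)]


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2; an FFT yields the same values faster."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spec.append(abs(acc) ** 2 / n)
    return spec


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters with centres spaced evenly on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    mel_points = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):   # rising edge of the triangle
            filt[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):   # falling edge of the triangle
            filt[k] = (hi - k) / (hi - mid)
        bank.append(filt)
    return bank


def dct2(values, n_coeffs):
    """Unnormalised DCT-II, keeping the first n_coeffs coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * c * (i + 0.5) / n)
                for i, v in enumerate(values))
            for c in range(n_coeffs)]


def mfcc(samples, sample_rate=8000, frame_len=256, hop=128,
         n_filters=12, n_coeffs=8):
    """Return one list of cepstral coefficients per frame."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    feats = []
    for frame in frame_signal(samples, frame_len, hop):
        spec = power_spectrum(hamming(frame))
        # Small offset avoids log(0) for empty filter outputs.
        log_e = [math.log(sum(w * p for w, p in zip(f, spec)) + 1e-10)
                 for f in bank]
        feats.append(dct2(log_e, n_coeffs))
    return feats


def normalise(vec):
    """Scale a feature vector into [-1, 1] by its largest magnitude."""
    peak = max(abs(v) for v in vec) or 1.0
    return [v / peak for v in vec]


def perceptron_decide(features, weights, bias):
    """Hard-limit perceptron: 1 means the features match the stored speaker."""
    s = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if s >= 0 else 0


if __name__ == "__main__":
    # Synthetic 440 Hz tone, 0.1 s at 8 kHz, standing in for a spoken vowel.
    tone = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(800)]
    feats = [normalise(f) for f in mfcc(tone)]
    print(len(feats), len(feats[0]))
```

In a hardware setting such as the one described, only the final weighted-sum comparison would need to run on the embedded processor, with the weights trained offline; the sketch keeps every stage in one place purely for readability.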