Speaker verification/recognition and the importance of selective feature extraction: a review

Speaker recognition (SR) is the process of automatically recognizing who is speaking on the basis of information obtained from speech features. SR comprises speaker verification (SV) and speaker identification (SI). Automatic speaker verification (ASV) is the process of authenticating a speaker's claimed identity. ASV is generally accomplished in four steps. The first step is digital speech data acquisition. In the second step, feature extraction and feature selection are performed. The third step involves clustering the feature vectors and storing them in a database. The last step is decision-making through pattern matching. In this paper, the main techniques followed in each of these steps are reviewed. The importance of feature vector extraction, selection, and normalization is also discussed.
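As a rough illustration of the four steps, the sketch below strings them together on synthetic data. It is a toy under stated assumptions, not a method taken from the reviewed literature: a two-level sine tone stands in for acquired speech, log-energy and zero-crossing rate stand in for real speech features, a small k-means codebook stands in for the clustering step, and a fixed distortion threshold stands in for the decision rule. All function names, the sample rate, and the threshold value are hypothetical.

```python
import math

SAMPLE_RATE = 8000  # Hz; assumed value for this toy example


def acquire(freq_hz, phase=0.0, seconds=1.0):
    """Step 1 (stand-in): synthesize a tone, loud then quiet, instead of recording speech."""
    n_total = int(SAMPLE_RATE * seconds)
    signal = []
    for n in range(n_total):
        amp = 1.0 if n < n_total // 2 else 0.3
        signal.append(amp * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE + phase))
    return signal


def extract_features(signal, frame_len=80):
    """Step 2: one (log-energy, zero-crossing rate) vector per non-overlapping frame."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0)
        feats.append((math.log(energy + 1e-10), crossings / (frame_len - 1)))
    return feats


def sq_dist(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def train_codebook(feats, k=2, iters=10):
    """Step 3: cluster the feature vectors with k-means; the centroids form the stored model."""
    # Deterministic initialization: k frames spread evenly across the utterance.
    centroids = [feats[i * (len(feats) - 1) // max(k - 1, 1)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for f in feats:
            nearest = min(range(k), key=lambda i: sq_dist(f, centroids[i]))
            clusters[nearest].append(f)
        # Recompute each centroid as its cluster mean; keep the old one if the cluster is empty.
        centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids


def distortion(feats, codebook):
    """Average squared distance from each vector to its nearest codebook entry."""
    return sum(min(sq_dist(f, c) for c in codebook) for f in feats) / len(feats)


def verify(feats, codebook, threshold=0.01):
    """Step 4: accept the identity claim if the distortion is below a threshold (assumed value)."""
    return distortion(feats, codebook) <= threshold


# Enroll a 200 Hz "speaker", then test a genuine trial and a 1500 Hz impostor trial.
codebook = train_codebook(extract_features(acquire(200.0)))
genuine = verify(extract_features(acquire(200.0, phase=0.5)), codebook)
impostor = verify(extract_features(acquire(1500.0)), codebook)
print(genuine, impostor)
```

In a practical system each stand-in would be replaced by one of the techniques reviewed for that step, and the threshold would be tuned on held-out trials rather than fixed by hand.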