Paper Information - Speaker Recognition Using Occurrence Pattern of Speech Signal

Speaker Recognition Using Occurrence Pattern of Speech Signal

Speaker recognition is a widely studied area of speech processing, with applications ranging from forensic science to telephone banking and intelligent voice-driven systems such as answering machines. This paper addresses speaker identification, a sub-field of speaker recognition. We propose a new approach to this problem built on mel-frequency cepstral coefficients (MFCC), one of the most powerful features of audio signals. Our work also uses co-occurrence matrices and derives statistical measures from them, which are incorporated into the proposed feature vector. Finally, we apply a classifier that identifies the speaker from a speech sample. The work proposed here is perhaps one of the first to use such an arrangement, and the results show that it is a highly promising strategy.
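The abstract does not say how the co-occurrence matrix is built or which statistics are taken from it, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a one-dimensional sequence of feature values (for example, one MFCC coefficient tracked across frames), uniform quantization into a small number of levels, and a lag of one frame. The function names `cooccurrence_matrix` and `cooccurrence_stats`, the parameters `levels` and `offset`, and the four texture-style measures (contrast, energy, homogeneity, entropy) are all illustrative assumptions.

```python
import numpy as np


def cooccurrence_matrix(values, levels=8, offset=1):
    """Normalized co-occurrence matrix of a quantized 1-D sequence.

    Entry (a, b) is the fraction of positions t where the quantized value
    is a at t and b at t + offset. `levels` and `offset` are assumed
    parameters; the paper's abstract does not specify them.
    """
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    if hi == lo:
        # A constant sequence falls entirely into level 0.
        q = np.zeros(values.shape, dtype=int)
    else:
        # Uniform quantization into `levels` bins over the observed range.
        q = np.floor((values - lo) / (hi - lo) * levels).astype(int)
        q = np.clip(q, 0, levels - 1)
    mat = np.zeros((levels, levels), dtype=float)
    # Count each (current, lagged) pair, accumulating repeated pairs.
    np.add.at(mat, (q[:-offset], q[offset:]), 1.0)
    total = mat.sum()
    return mat / total if total > 0 else mat


def cooccurrence_stats(p):
    """Texture-style statistics of a normalized co-occurrence matrix."""
    i, j = np.indices(p.shape)
    nonzero = p[p > 0]
    return {
        "contrast": float(np.sum(p * (i - j) ** 2)),
        "energy": float(np.sum(p ** 2)),
        "homogeneity": float(np.sum(p / (1.0 + np.abs(i - j)))),
        "entropy": float(-np.sum(nonzero * np.log2(nonzero))),
    }


# Synthetic stand-in for one coefficient trajectory (not real speech data).
rng = np.random.default_rng(0)
track = rng.normal(size=200)
features = cooccurrence_stats(cooccurrence_matrix(track))
```

In a sketch like this, the resulting dictionary would be flattened and appended to whatever other features are in use before being passed to a classifier; the choice of classifier is left open here, as it is in the abstract.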

Ghazaala Yasmin | Arijit Ghosal | Saptarshi Sengupta

[1] Douglas A. Reynolds, et al. Speaker Verification Using Adapted Gaussian Mixture Models, 2000, Digit. Signal Process.

[2] Paula López-Otero, et al. Improved strategies for speaker segmentation and emotional state detection, 2015.

[3] Kinshuk Dudeja, et al. Applications of Digital Signal Processing to Speech Recognition, 2015.

[4] Y. Venkataramani, et al. Text Independent Speaker Recognition and Speaker Independent Speech Recognition Using Iterative Clustering Approach, 2009.

[5] J.P. Campbell Jr., et al. Speaker recognition: a tutorial, 1997, Proc. IEEE.

[6] Eliathamby Ambikairajah, et al. Investigation of Spectral Centroid Magnitude and Frequency for Speaker Recognition, 2010, Odyssey.

[7] Petri Toiviainen, et al. A Matlab Toolbox for Music Information Retrieval, 2007, GfKl.

[8] Hui-hong Xu. Text Dependent Speaker Recognition Study, 2015.

[9] Douglas A. Reynolds, et al. Robust text-independent speaker identification using Gaussian mixture speaker models, 1995, IEEE Trans. Speech Audio Process.

[10] R.M. Haralick, et al. Statistical and structural approaches to texture, 1979, Proceedings of the IEEE.

[11] Tyler K. Perrachione. Speaker recognition across languages, 2017.

[12] P. Mermelstein, et al. Distance measures for speech recognition, psychological and instrumental, 1976.

[13] George R. Doddington, et al. Speaker recognition based on idiolectal differences between speakers, 2001, INTERSPEECH.

[14] Ranjan Parekh, et al. Automated Speech Recognition of Isolated Words Using Neural Networks, 2011.