Performance Analysis of Feature Sets in Speaker Diarization Techniques

Speech is the most important means of communication among humans. Speech signal processing covers many tasks, including speech coding, speaker recognition, and speaker verification. Speaker diarization is a pre-processing stage for many applications of speaker recognition systems. It is the task of determining “who spoke when” in an audio recording that contains an unknown amount of speech and an unknown number of speakers. Speaker diarization has become a key technology for many tasks, such as navigation, retrieval, and higher-level inference on audio data. It mainly performs three operations: feature extraction, voice activity detection, and classification. In this paper, we review a few speaker diarization techniques. State-of-the-art speaker diarization systems achieve good results. The performance of a few speaker diarization systems is evaluated in terms of diarization error, tracking time, and false alarm.
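As a rough illustration of the error measures named above, the following sketch computes frame-level false alarm, missed speech, and speaker confusion rates, and sums them into a diarization error rate. It assumes the commonly used definition (errors divided by the number of reference speech frames), which may differ from the exact scoring used in the paper. The function name `frame_error_rates`, the use of `None` for non-speech frames, and the assumption that hypothesis speaker labels have already been mapped to reference labels are all illustrative choices, not taken from the source. Tracking time is not covered.

```python
from typing import Dict, Optional, Sequence


def frame_error_rates(
    reference: Sequence[Optional[str]],
    hypothesis: Sequence[Optional[str]],
) -> Dict[str, float]:
    """Frame-level diarization error components.

    Each sequence holds one speaker label per frame; None marks non-speech.
    Assumption: hypothesis labels are already mapped to reference labels.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")

    # All rates are normalized by the amount of reference speech.
    speech_frames = sum(1 for r in reference if r is not None)
    if speech_frames == 0:
        raise ValueError("reference contains no speech frames")

    false_alarm = 0  # hypothesis says speech, reference says non-speech
    missed = 0       # reference says speech, hypothesis says non-speech
    confusion = 0    # both say speech, but the speaker labels differ
    for r, h in zip(reference, hypothesis):
        if r is None and h is not None:
            false_alarm += 1
        elif r is not None and h is None:
            missed += 1
        elif r is not None and h is not None and r != h:
            confusion += 1

    return {
        "false_alarm": false_alarm / speech_frames,
        "missed_speech": missed / speech_frames,
        "speaker_confusion": confusion / speech_frames,
        "diarization_error": (false_alarm + missed + confusion) / speech_frames,
    }


# Toy example: 8 frames, two speakers "A" and "B".
reference = [None, "A", "A", "A", "B", "B", None, "B"]
hypothesis = ["A", "A", "A", "B", "B", "B", None, None]
rates = frame_error_rates(reference, hypothesis)
print(rates)
```

In the toy example, one frame is a false alarm, one is missed, and one is assigned to the wrong speaker, out of six reference speech frames. A full scorer would also search for the best mapping between hypothesis and reference speakers and would usually work on time segments instead of fixed frames.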
