Paper information - FastRec : A fast and robust text independent speaker recognition system for radio networks

FastRec : A fast and robust text independent speaker recognition system for radio networks

This paper proposes a fast and robust text-independent speaker identification system for all types of radio networks. Radio conversations contain speech from various speakers along with radio noise. A novel approach, Receiver Noise Segmentation (RxNSeg), is proposed to segment radio conversations into speaker-homogeneous speech segments: it first identifies the receiver radio noise and then finds the boundaries of the speaker-homogeneous speech segments in the conversation. Various techniques for clustering speech segments into speaker-homogeneous clusters, which are then used to train speaker models, are evaluated. A novel top-down approach, Find One Long Speech Segment (FOLSS), is proposed in lieu of traditional clustering techniques; it finds at least one long speaker-homogeneous segment for each speaker present in a radio conversation. Speaker modeling using the Gaussian Mixture Model (GMM) and the adapted GMM is considered. With the proposed RxNSeg and FOLSS, the two speaker modeling methods show an average 86.32% reduction in testing time without significant loss of speaker identification accuracy, as compared to traditional segmentation and clustering techniques.
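The segmentation idea in the abstract (identify receiver noise first, then treat the stretches between noise as candidate speech segments and keep the long ones) can be illustrated with a minimal sketch. This is not the paper's RxNSeg or FOLSS algorithm, whose details are not given here. It assumes a hypothetical upstream detector has already labeled each audio frame as noise or not, and the names `split_on_noise`, `long_segments`, and `min_frames` are illustrative only.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # half-open frame range [start, end)


def split_on_noise(is_noise: List[bool]) -> List[Segment]:
    """Return the runs of non-noise frames that lie between noise frames.

    Assumption: is_noise[i] is True when frame i was flagged as receiver
    noise by some upstream detector (not specified in the abstract).
    """
    segments: List[Segment] = []
    start = None
    for i, noisy in enumerate(is_noise):
        if not noisy and start is None:
            start = i                      # a speech run begins
        elif noisy and start is not None:
            segments.append((start, i))    # noise closes the current run
            start = None
    if start is not None:                  # run extends to the last frame
        segments.append((start, len(is_noise)))
    return segments


def long_segments(segments: List[Segment], min_frames: int) -> List[Segment]:
    """Keep only segments at least min_frames long (hypothetical threshold)."""
    return [(s, e) for s, e in segments if e - s >= min_frames]


if __name__ == "__main__":
    flags = [True, False, False, False, True, True, False, True, False, False]
    segs = split_on_noise(flags)
    print(segs)
    print(long_segments(segs, min_frames=2))
```

In a full system, each retained segment would then be scored against per-speaker models such as GMMs; that step is omitted here because the abstract does not specify the features or model settings.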

Milan Patnaik | Debasish Pradhan | Ajay Mathew | M. S. Gill

[1] Douglas A. Reynolds, et al. Blind clustering of speech utterances based on speaker and language characteristics, 1998, ICSLP.

[2] Douglas A. Reynolds, et al. Speaker Verification Using Adapted Gaussian Mixture Models, 2000, Digit. Signal Process.

[3] John H. L. Hansen, et al. Efficient audio stream segmentation via the combined T² statistic and Bayesian information criterion, 2005, IEEE Transactions on Speech and Audio Processing.

[4] Dominique Fohr, et al. Speaker diarization using normalized cross likelihood ratio, 2007, INTERSPEECH.

[5] Douglas A. Reynolds, et al. An overview of automatic speaker diarization systems, 2006, IEEE Transactions on Audio, Speech, and Language Processing.