FastRec: A fast and robust text-independent speaker recognition system for radio networks

This paper proposes a fast and robust text-independent speaker identification system for all types of radio networks. Radio conversations contain speech from various speakers along with radio noise. A novel approach, named Receiver Noise Segmentation (RxNSeg), is proposed to segment radio conversations into speaker-homogeneous speech segments: it first identifies the receiver radio noise and then finds the boundaries of the speaker-homogeneous speech segments in the conversation. Various techniques for clustering speech segments into speaker-homogeneous clusters, which are then used to train speaker models, are evaluated. A novel top-down approach, named Find One Long Speech Segment (FOLSS), is proposed in lieu of traditional clustering techniques; it finds at least one long speaker-homogeneous segment for each speaker present in a radio conversation. Speaker modeling using the Gaussian Mixture Model (GMM) and the adapted GMM is considered. With the proposed RxNSeg and FOLSS, the two speaker modeling methods show an average 86.32% reduction in testing time, without significant loss of speaker identification accuracy, compared to traditional segmentation and clustering techniques.
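As a rough illustration of the segmentation idea only, the sketch below assumes that each audio frame has already been flagged as receiver noise or not (how the noise is detected is not described here), and treats every run of non-noise frames as one candidate speech segment. The function names `speech_segments` and `longest_segment` and the example flags are hypothetical; the second function simply picks the longest run and is not a description of the actual FOLSS procedure.

```python
from itertools import groupby


def speech_segments(is_noise):
    """Return (start, end) frame ranges, end exclusive, for runs of non-noise frames.

    `is_noise` is an assumed per-frame list of booleans: True marks a frame
    flagged as receiver noise, False marks a frame treated as speech.
    """
    segments = []
    index = 0
    for noisy, run in groupby(is_noise):
        length = len(list(run))
        if not noisy:
            segments.append((index, index + length))
        index += length
    return segments


def longest_segment(segments):
    """Return the longest (start, end) range, or None if there are no segments."""
    return max(segments, key=lambda s: s[1] - s[0], default=None)


# Hypothetical frame flags: noise bursts separate two stretches of speech.
flags = [True, False, False, False, True, True, False, False, True]
segments = speech_segments(flags)
print(segments)                   # [(1, 4), (6, 8)]
print(longest_segment(segments))  # (1, 4)
```

Using noise bursts as boundary markers means the segments fall out of a single pass over the frame flags, which is consistent with the emphasis on reduced processing time, though the actual system may differ in every detail.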