A fast search method of speaker identification for large population using pre-selection and hierarchical matching

This paper investigates search performance during the matching phase of a speaker identification system based on vector quantization (VQ). Each speaker's voice was recorded in an office room containing personal computers, and the LPC-cepstrum was chosen as the feature vector. Achieving a higher identification success rate requires a larger codebook for each speaker. As a result, full matching of the input directly against every registered large codebook becomes extremely time-consuming for a large population, and a trade-off between success rate and search time is usually unavoidable if identification is to run at a tolerable speed. This paper proposes a fast search method based on pre-selection and hierarchical matching, which eliminates a large number of unlikely candidates before the final fine matching with the larger codebooks while keeping the success rate from degrading. Pre-selection uses the divergences between the unknown input speech and all registered codebooks. Hierarchical matching uses two codebooks per speaker, a smaller one and a larger one, with the larger one reserved for the final fine matching. With this method, for a population of 290 speakers, identification time is reduced to about 13% of the baseline while memory usage increases by only 6%, where the baseline is conventional full matching with the larger codebooks. The success rate remains 94%, the same as that obtained with the conventional method.
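
The three-stage search described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the abstract does not define the divergence used for pre-selection, so the squared distance between the mean input frame and the mean codeword of each speaker's smaller codebook stands in for it; the function names (`identify`, `vq_distortion`, `build_synthetic_population`), the `keep_pre` and `keep_coarse` cut-offs, and the two-dimensional synthetic "features" are all hypothetical. In the sketch the smaller codebook is simply a subset of the larger one, whereas a real system would presumably train each codebook separately on LPC-cepstrum vectors.

```python
import random
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]
Codebook = List[Vector]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def vq_distortion(frames: List[Vector], codebook: Codebook) -> float:
    """Average distance from each frame to its nearest codeword."""
    return sum(min(sq_dist(f, c) for c in codebook) for f in frames) / len(frames)


def mean_vector(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def identify(frames: List[Vector],
             small_books: Dict[str, Codebook],
             large_books: Dict[str, Codebook],
             keep_pre: int,
             keep_coarse: int) -> str:
    """Return the speaker ID chosen by pre-selection, coarse and fine matching."""
    # Stage 1: pre-selection with a cheap score per speaker.
    # ASSUMPTION: this mean-to-mean distance is a stand-in for the paper's
    # divergence, which the abstract does not specify. The codebook means
    # would normally be precomputed at registration time.
    input_mean = mean_vector(frames)
    pre = sorted(
        small_books,
        key=lambda spk: sq_dist(input_mean, mean_vector(small_books[spk])),
    )[:keep_pre]

    # Stage 2: coarse matching with the smaller codebooks of the survivors.
    coarse = sorted(
        pre, key=lambda spk: vq_distortion(frames, small_books[spk])
    )[:keep_coarse]

    # Stage 3: fine matching with the larger codebooks of the few remaining
    # candidates; the lowest average distortion wins.
    return min(coarse, key=lambda spk: vq_distortion(frames, large_books[spk]))


def build_synthetic_population(
    n_speakers: int, seed: int = 0
) -> Tuple[Dict[str, Codebook], Dict[str, Codebook]]:
    """Hypothetical toy data: well-separated 2-D clusters, one per speaker."""
    rng = random.Random(seed)
    small_books: Dict[str, Codebook] = {}
    large_books: Dict[str, Codebook] = {}
    for i in range(n_speakers):
        cx, cy = 10.0 * i, -5.0 * i
        large = [[cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1)]
                 for _ in range(8)]
        large_books[f"spk{i}"] = large
        small_books[f"spk{i}"] = large[:2]  # toy stand-in for a small codebook
    return small_books, large_books


small_books, large_books = build_synthetic_population(20)
rng = random.Random(1)
# Synthetic input frames drawn near the cluster of speaker 7.
frames = [[70.0 + rng.uniform(-0.5, 0.5), -35.0 + rng.uniform(-0.5, 0.5)]
          for _ in range(30)]
print(identify(frames, small_books, large_books, keep_pre=5, keep_coarse=2))
```

The saving comes from the fact that only `keep_coarse` speakers ever reach the expensive comparison against the larger codebooks. The cut-offs control the trade-off: setting both to the population size reduces the procedure to conventional full matching, while very small values risk discarding the true speaker before the fine stage.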