Speaker recognition based on a weighted acoustic discrimination

We combine multiple-mixture, single-state hidden Markov models (HMMs) with phonetic classification to improve the performance of a speaker recognition system. We define three broad phonetic classes: voiced frames, unvoiced frames, and transitions. Each speaker template is built as the parallel connection of the weighted outputs of three single-state HMMs. Each model corresponds to a distinct sound class, and the output weights account for the perceptual influences across phonetic classes. Preliminary results show that this novel architecture outperforms its counterpart without phonetic classification.
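
The weighted parallel combination described above can be illustrated with a minimal sketch. It rests on several assumptions not stated in the abstract: each single-state, multiple-mixture HMM is treated as a diagonal-covariance Gaussian mixture, frames arrive already labelled with a phonetic class, and the speaker score is the weighted sum of the per-class average log-likelihoods. The function names, the class indices (0 = voiced, 1 = unvoiced, 2 = transition), and the weight values in the usage note are hypothetical.

```python
import numpy as np

def gmm_log_likelihood(frames, weights, means, variances):
    """Per-frame log-likelihood under a diagonal-covariance Gaussian mixture.

    frames: (T, D) feature vectors; weights: (M,) mixture weights;
    means, variances: (M, D) per-component parameters.
    """
    diff = frames[:, None, :] - means[None, :, :]                       # (T, M, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)   # (M,)
    log_comp = log_norm - 0.5 * np.sum(diff ** 2 / variances, axis=2)   # (T, M)
    log_weighted = log_comp + np.log(weights)                           # (T, M)
    # Log-sum-exp over mixture components, stabilised by the per-frame maximum.
    peak = log_weighted.max(axis=1, keepdims=True)
    return (peak + np.log(np.exp(log_weighted - peak).sum(axis=1, keepdims=True)))[:, 0]

def speaker_score(frames, labels, class_models, class_weights):
    """Weighted sum of per-class average log-likelihoods (assumed combination rule).

    labels: (T,) phonetic class index of each frame.
    class_models: dict mapping class index -> (weights, means, variances).
    class_weights: dict mapping class index -> scalar output weight.
    """
    score = 0.0
    for cls, model in class_models.items():
        mask = labels == cls
        if not mask.any():
            continue  # no frames of this class in the utterance
        score += class_weights[cls] * gmm_log_likelihood(frames[mask], *model).mean()
    return score
```

Under these assumptions, identification would amount to computing `speaker_score` for every enrolled speaker's three class models and choosing the speaker with the highest score, for example with hypothetical output weights such as `{0: 0.5, 1: 0.3, 2: 0.2}`. How the weights are actually derived from perceptual considerations is not specified in the abstract.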