Speaker Recognition Using Neural Tree Networks

This paper presents a new classifier for text-independent speaker recognition, the modified neural tree network (MNTN). The neural tree network (NTN) is a hierarchical classifier that combines the properties of decision trees and feed-forward neural networks. The MNTN differs from the standard NTN in two ways: it uses a new learning rule based on discriminant learning, which minimizes the classification error rather than a norm of the approximation error, and it uses leaf probability measures in addition to the class labels. The MNTN is evaluated in several speaker identification experiments and compared with multilayer perceptron (MLP), decision tree, and vector quantization (VQ) classifiers. The VQ classifier and the MNTN demonstrate comparable performance, and both perform significantly better than the other classifiers on this task. Additionally, the MNTN provides a logarithmic saving in retrieval time over the VQ classifier. The MNTN and VQ classifiers are also compared in several speaker verification experiments, where the MNTN is found to outperform the VQ classifier.
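To make the structure described above concrete, the following is a minimal sketch of a binary neural tree, not the authors' implementation. It assumes a two-class problem (target speaker versus all others), uses a single logistic unit at each internal node, and trains each unit by ordinary gradient descent on cross-entropy; the discriminant learning rule of the MNTN is not reproduced here. The leaf confidence is taken to be the fraction of training vectors at the leaf that belong to the majority class, which is one plausible reading of "leaf probability measures". All names (`Node`, `grow`, `classify`) and all parameter values are hypothetical.

```python
import math
import random


class Node:
    """One node of a binary neural tree: a logistic unit or a leaf."""

    def __init__(self):
        self.weights = None   # [bias, w1, ..., wd] for internal nodes
        self.children = None  # (left, right) for internal nodes
        self.p_one = None     # leaf only: fraction of class-1 training vectors


def _activation(weights, x):
    z = weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))
    z = max(-60.0, min(60.0, z))  # clamp so exp() stays in range
    return 1.0 / (1.0 + math.exp(-z))


def _train_unit(X, y, rng, epochs=200, lr=0.5):
    """Fit one logistic unit by batch gradient descent on cross-entropy.

    Assumption: this stands in for the node-level learning rule; it is
    not the discriminant learning rule used by the MNTN.
    """
    weights = [rng.uniform(-0.1, 0.1) for _ in range(len(X[0]) + 1)]
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, t in zip(X, y):
            err = _activation(weights, x) - t
            grad[0] += err
            for j, xi in enumerate(x):
                grad[j + 1] += err * xi
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad)]
    return weights


def grow(X, y, depth=0, max_depth=4, rng=None):
    """Recursively grow the tree; labels in y must be 0 or 1."""
    rng = rng or random.Random(0)
    node = Node()
    ones = sum(y)
    # Stop at the depth limit or when the node is pure.
    if depth == max_depth or ones == 0 or ones == len(y):
        node.p_one = ones / len(y)
        return node
    weights = _train_unit(X, y, rng)
    sides = [_activation(weights, x) > 0.5 for x in X]
    # If the unit sends every vector the same way, no split is possible.
    if all(sides) or not any(sides):
        node.p_one = ones / len(y)
        return node
    node.weights = weights
    parts = ([], [])
    for x, t, s in zip(X, y, sides):
        parts[int(s)].append((x, t))
    node.children = tuple(
        grow([x for x, _ in p], [t for _, t in p], depth + 1, max_depth, rng)
        for p in parts
    )
    return node


def classify(node, x):
    """Follow one root-to-leaf path; return (label, confidence)."""
    while node.children is not None:
        node = node.children[int(_activation(node.weights, x) > 0.5)]
    label = 1 if node.p_one >= 0.5 else 0
    return label, max(node.p_one, 1.0 - node.p_one)


# Toy usage with made-up 2-D feature vectors (real systems use speech features).
X = [[-2, -2], [-2, -1], [-1, -2], [2, 2], [2, 1], [1, 2]]
y = [0, 0, 0, 1, 1, 1]
tree = grow(X, y)
print(classify(tree, [2, 2]))
```

Because `classify` evaluates only the units along a single root-to-leaf path, its cost grows with the depth of the tree rather than with the number of leaves. For a reasonably balanced tree this is the kind of logarithmic retrieval saving the abstract contrasts with a VQ classifier, which compares a test vector against every codebook entry.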
