This paper presents three schemes for identifying speakers from their voice using artificial neural networks. The first scheme recognizes speakers with a network trained by the classical back-propagation algorithm on known voice samples of the persons. The second scheme provides a framework for classifying the known training samples of the voice features using a hierarchical architecture realized with a self-organizing feature map neural net. The first scheme is highly robust, as it can identify persons from noisy voice samples, but its excessive training time limits its use with a large voice database. The second scheme, though less robust than the first, can classify an unknown voice sample to its nearest class. The classification time of the first scheme is constant irrespective of the voice sample and is proportional to the number of feed-forward layers in the network. The classification time of the second scheme, however, depends on the voice features and is proportional to the number of 2-D arrays traversed by the algorithm on the hierarchical structure. The third scheme combines the benefits of a radial basis function neural net and a back-propagation-trained neural net; it is highly robust, with a misclassification rate as low as 0.2 per cent.
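The dependence of the second scheme's classification time on the input can be illustrated with a minimal sketch. It is not the authors' implementation: the `MapNode` class, the `classify` function, the hand-set prototype vectors, and the speaker labels are all hypothetical, and the maps are not trained. It assumes only that each level of the hierarchy is a 2-D array of prototype vectors and that a sample descends through the best-matching cell until it reaches a cell with no child map.

```python
import math


def best_matching_unit(grid, x):
    """Return (row, col) of the prototype in a 2-D grid nearest to x."""
    best, best_dist = None, math.inf
    for r, row in enumerate(grid):
        for c, proto in enumerate(row):
            d = math.dist(proto, x)  # Euclidean distance (Python 3.8+)
            if d < best_dist:
                best, best_dist = (r, c), d
    return best


class MapNode:
    """One 2-D array of prototypes in a hypothetical map hierarchy."""

    def __init__(self, grid, children=None, labels=None):
        self.grid = grid                # 2-D list of prototype vectors
        self.children = children or {}  # (row, col) -> finer MapNode
        self.labels = labels or {}      # (row, col) -> class label


def classify(root, x):
    """Descend the hierarchy; return (label, number of 2-D arrays visited)."""
    node, visited = root, 0
    while True:
        visited += 1
        cell = best_matching_unit(node.grid, x)
        if cell in node.children:
            node = node.children[cell]  # refine within a finer map
        else:
            return node.labels.get(cell), visited


# Hand-set (untrained) prototypes, for illustration only.
leaf = MapNode(
    grid=[[(0.0, 0.0), (0.0, 1.0)]],
    labels={(0, 0): "speaker_A", (0, 1): "speaker_B"},
)
root = MapNode(
    grid=[[(0.0, 0.5), (5.0, 5.0)]],
    children={(0, 0): leaf},
    labels={(0, 1): "speaker_C"},
)

deep = classify(root, (0.1, 0.9))     # passes through two maps
shallow = classify(root, (4.8, 5.1))  # resolved in the root map
```

In this toy hierarchy the first sample traverses two arrays and the second only one, so the cost varies with the voice features, whereas a single forward pass through a fixed feed-forward network costs the same for every input.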