A speaker identification experiment was conducted using an artificial neural network. Speech data were collected from nine different speakers saying the same word, "hello", and were then preprocessed for signal conditioning. Fourteen feature parameters were obtained: 12 were the coefficients of a 12th-order linear predictor (LPC), and the other two were the peak and bandwidth of the spectral envelope. These 14 feature parameters served as the inputs to the neural network for speaker classification. A standard two-layer feedforward neural network was trained to associate each feature set with the corresponding speaker. The network size was 14-8-4 (14 input, 8 hidden, and 4 output units). Nine utterances from each speaker were used as training data, and the remaining one served as testing data. The trained network identified the speakers with an accuracy of 90%. The success rate could be increased by increasing the number of utterances per speaker.
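
The feature pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions the abstract does not specify: the Hamming window, the autocorrelation method with the Levinson-Durbin recursion for the LPC coefficients, the definition of "peak" as the frequency of the envelope maximum and "bandwidth" as the 3 dB width around it, the sigmoid activations, and all function names. The network is only initialized with random weights to show the 14-8-4 shape; training is omitted, and the input is a synthetic signal rather than speech.

```python
import numpy as np

def lpc_coefficients(frame, order=12):
    """Autocorrelation-method LPC via the Levinson-Durbin recursion.

    Returns a[1..order] of A(z) = 1 + sum_k a[k] z^-k (assumed convention).
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]  # prediction error energy of the order-0 model
    for i in range(1, order + 1):
        # Reflection coefficient for model order i
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a[1:]

def envelope_peak_and_bandwidth(a, n_freq=512):
    """Peak frequency and 3 dB width (rad/sample) of the all-pole envelope.

    These two definitions are assumptions; the abstract does not give them.
    """
    A = np.fft.rfft(np.concatenate(([1.0], a)), 2 * n_freq)
    env = 1.0 / np.abs(A)  # envelope 1/|A(e^{jw})| on [0, pi]
    peak = int(np.argmax(env))
    above = env >= env[peak] / np.sqrt(2.0)
    lo = hi = peak
    while lo > 0 and above[lo - 1]:
        lo -= 1
    while hi < len(env) - 1 and above[hi + 1]:
        hi += 1
    scale = np.pi / n_freq  # bin index -> rad/sample
    return peak * scale, (hi - lo) * scale

def extract_features(frame, order=12):
    """Build the 14-element vector: 12 LPC coefficients + peak + bandwidth."""
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hamming(len(frame))  # assumed preprocessing step
    a = lpc_coefficients(windowed, order)
    peak, bandwidth = envelope_peak_and_bandwidth(a)
    return np.concatenate((a, [peak, bandwidth]))

def init_network(rng, sizes=(14, 8, 4)):
    """Random (untrained) weights for a 14-8-4 feedforward network."""
    return [(rng.normal(0.0, 0.5, (n_out, n_in)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(x, layers):
    """Forward pass with sigmoid units in both layers (assumed activation)."""
    for W, b in layers:
        x = 1.0 / (1.0 + np.exp(-(W @ x + b)))
    return x

# Synthetic stand-in for one preprocessed utterance (not real speech data)
rng = np.random.default_rng(0)
t = np.arange(400)
frame = np.sin(0.3 * t) + 0.5 * np.sin(1.1 * t) + 0.1 * rng.standard_normal(400)
features = extract_features(frame)
outputs = forward(features, init_network(rng))
print(features.shape, outputs.shape)
```

How the four output units map to the nine speakers is not stated in the abstract, so the sketch stops at the raw 4-element output vector.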