Parametric feature-based voice recognition system using artificial neural network

A speaker identification experiment was conducted using an artificial neural network. The speech data were collected from nine different speakers saying the same word, "hello", and were then preprocessed for signal conditioning. Fourteen feature parameters were obtained: 12 were the coefficients of a 12th-order linear predictor (LPC), and the other two were the peak and bandwidth of the spectral envelope. These 14 feature parameters served as the inputs to the neural network for speaker classification. A standard two-layer feedforward neural network was trained to identify the feature sets associated with the corresponding speakers. The network size was 14-8-4 (14 input, 8 hidden, and 4 output units). Nine utterances from each speaker were used as training data, and the remaining utterance served as test data. The results showed that the trained network correctly identified the speakers with 90% accuracy. The success rate could be increased by increasing the number of utterances per speaker.
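
The feature pipeline and network shape described above can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how the LPC coefficients were computed, how the envelope peak and bandwidth were measured, which activation was used, or how nine speakers map onto four output units. The sketch therefore assumes the autocorrelation method with the Levinson-Durbin recursion, takes the two envelope features as caller-supplied values, uses sigmoid units, and reads the four outputs as a binary speaker code. All function names are hypothetical.

```python
import numpy as np


def lpc_coefficients(frame, order=12):
    """Autocorrelation-method LPC solved with the Levinson-Durbin recursion.

    Returns a_1..a_order of A(z) = 1 + a_1 z^-1 + ... + a_order z^-order.
    (Assumed method; the abstract only states a 12th-order predictor.)
    """
    x = np.asarray(frame, dtype=float)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]  # prediction error energy; the frame must not be silent
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient of stage i
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a[1:]


def feature_vector(frame, envelope_peak, envelope_bandwidth):
    """Stack 12 LPC coefficients with the two envelope features (14 values).

    The envelope features are passed in because the abstract does not
    define how they were measured.
    """
    return np.concatenate(
        [lpc_coefficients(frame, order=12), [envelope_peak, envelope_bandwidth]]
    )


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def forward(x, W1, b1, W2, b2):
    """Forward pass of a 14-8-4 feedforward network (sigmoid units assumed)."""
    hidden = sigmoid(W1 @ x + b1)  # 8 hidden units
    return sigmoid(W2 @ hidden + b2)  # 4 output units


def decode_speaker(outputs):
    """Read the 4 outputs as a binary code, most significant bit first.

    This encoding is an assumption: 4 bits are enough to label 9 speakers.
    """
    value = 0
    for bit in (np.asarray(outputs) >= 0.5):
        value = value * 2 + int(bit)
    return value


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.standard_normal(400)  # stand-in for a preprocessed utterance
    x = feature_vector(frame, envelope_peak=1.0, envelope_bandwidth=0.5)
    # Untrained random weights, only to show the layer shapes.
    W1, b1 = rng.standard_normal((8, 14)), np.zeros(8)
    W2, b2 = rng.standard_normal((4, 8)), np.zeros(4)
    print(decode_speaker(forward(x, W1, b1, W2, b2)))
```

With only nine training utterances per speaker, a small hidden layer such as the eight units reported keeps the parameter count low, which is consistent with the abstract's remark that more utterances per speaker could raise the success rate.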