Standard and target driven AR-vector models for speech analysis and speaker recognition

This paper reports theoretical aspects and practical applications of two variants of the AR-vector modeling technique: the standard AR-vector model and the target driven (or multistep excited) AR-vector model. The standard version assumes a white excitation, while the target driven model assumes a piecewise constant input. The standard AR-vector model proves highly effective for speaker recognition: on a set of 420 different speakers, the recognition score ranges from 93% to 100%, depending on the duration of the test speech sample. The target driven AR-vector model has properties that are useful for speech analysis and segmentation. The steps in the input function correspond strongly to the underlying phonetic content of the speech. Moreover, under some normalization, the values of the steps can be interpreted as acoustic targets.
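
As a rough sketch of the two excitation assumptions, the models could be written as below. The notation is assumed for illustration and is not taken from the paper: $x_t$ is the feature vector of frame $t$, $p$ the model order, $A_k$ the matrix prediction coefficients, $e_t$ a white excitation, and $u_t$ a piecewise constant input that takes the value $c_j$ on the $j$-th segment (any residual term in the second model is omitted). The last line shows one plausible normalization, the steady-state response to a constant step, under which a step value maps to a point in feature space. The abstract does not say which normalization the authors use, so this is an assumption.

```latex
\begin{align}
  % Standard AR-vector model: driven by a white excitation e_t
  x_t &= \sum_{k=1}^{p} A_k \, x_{t-k} + e_t \\
  % Target driven model: driven by a piecewise constant input u_t
  x_t &= \sum_{k=1}^{p} A_k \, x_{t-k} + u_t,
        \qquad u_t = c_j \quad \text{for } t_j \le t < t_{j+1} \\
  % Assumed normalization: fixed point reached if the step c_j were held forever
  \tau_j &= \Bigl( I - \sum_{k=1}^{p} A_k \Bigr)^{-1} c_j
\end{align}
```

Under this reading, $\tau_j$ is the value $x_t$ would settle at if the input stayed at $c_j$, provided the model is stable and $I - \sum_k A_k$ is invertible. This is one way a step value could be read as a target that the speech trajectory moves toward within a segment.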