Unsupervised speaker adaptation by probabilistic spectrum fitting

A general approach to speaker adaptation in speech recognition is described, in which speaker differences are treated as arising from a parameterized transformation. Given some unlabeled data from a particular speaker, a process is described which maximizes the likelihood of this data by estimating the transformation parameters while simultaneously refining estimates of the labels. The technique is illustrated using isolated vowel spectra and phonetically motivated linear spectrum transformations, and is shown to give significantly better performance than non-adaptive classification.
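The joint estimation the abstract describes can be sketched as an EM-style loop: soft label posteriors are computed under current transform parameters, then the transform is re-estimated by maximum likelihood under those posteriors. The sketch below is illustrative only and not the paper's exact formulation: it assumes diagonal-Gaussian reference models per vowel class and approximates the phonetically motivated linear spectrum transformation with a simple per-channel scale-and-offset map; the function name `adapt_linear_transform` and all parameter choices are hypothetical.

```python
# Minimal sketch (assumed model, not the paper's algorithm): unsupervised
# adaptation of a per-channel affine spectral transform y = a*x + b by
# maximum likelihood, with known diagonal-Gaussian vowel-class models.
import numpy as np

def adapt_linear_transform(X, means, variances, priors, n_iters=20):
    """Jointly estimate transform parameters and soft class labels.

    X         : (N, D) unlabeled spectra from the new speaker
    means     : (K, D) reference class means (e.g. one per vowel)
    variances : (K, D) reference class variances (diagonal)
    priors    : (K,)   class prior probabilities
    Returns (a, b, post): per-channel scale and offset mapping the new
    speaker's spectra toward the reference space, and final posteriors.
    """
    N, D = X.shape
    a = np.ones(D)   # initial scale: identity transform
    b = np.zeros(D)  # initial offset

    for _ in range(n_iters):
        # E-step: posterior label probabilities given transformed data.
        Xt = a * X + b                                            # (N, D)
        log_lik = -0.5 * (((Xt[:, None, :] - means) ** 2) / variances
                          + np.log(2 * np.pi * variances)).sum(axis=2)
        log_post = np.log(priors) + log_lik                       # (N, K)
        log_post -= log_post.max(axis=1, keepdims=True)
        post = np.exp(log_post)
        post /= post.sum(axis=1, keepdims=True)

        # M-step: per-channel weighted least squares for (a, b),
        # maximizing the expected log-likelihood under the posteriors.
        w = post[:, :, None] / variances[None, :, :]              # (N, K, D)
        Sw   = w.sum(axis=(0, 1))
        Swx  = (w * X[:, None, :]).sum(axis=(0, 1))
        Swx2 = (w * X[:, None, :] ** 2).sum(axis=(0, 1))
        Swy  = (w * means[None, :, :]).sum(axis=(0, 1))
        Swxy = (w * X[:, None, :] * means[None, :, :]).sum(axis=(0, 1))
        det = Swx2 * Sw - Swx ** 2
        a = (Swxy * Sw - Swx * Swy) / det
        b = (Swx2 * Swy - Swx * Swxy) / det

    return a, b, post
```

After adaptation, classification of each spectrum is simply `post.argmax(axis=1)`; comparing that against classification with the identity transform gives the adaptive-versus-non-adaptive contrast the abstract refers to.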