Maximum Likelihood Linear Regression

Maximum likelihood linear regression (MLLR) is an adaptation technique suitable for both speaker and environmental model-based adaptation. The models are adapted using a set of linear transformations, estimated in a maximum likelihood fashion from the available adaptation data. As these transformations can capture general relationships between the original model set and the current speaker, or new acoustic environment, they can be effective in adapting all the HMM distributions with limited adaptation data. Two important decisions must be made: (i) how to cluster components together, such that they all have a similar transformation matrix, and (ii) how many transformation matrices to generate for a given block of adaptation data. This paper addresses both problems. Firstly, it describes two clustering techniques that are optimal in the sense of maximising the likelihood of the adaptation data. The first assigns each component to one of the regression classes, and may be used to generate standard regression class trees. The second performs a fuzzy assignment of base class to regression class, so the transformation associated with each component is a linear combination of a set of transformations. Secondly, two schemes are examined which address the problem of determining the number of regression classes, and hence transforms, for a given amount of adaptation data. The first is a cross-validation scheme based on the auxiliary function of the adaptation data; the second is based on the use of iterative MLLR. Neither scheme requires a priori thresholding information. An initial evaluation of the techniques was performed using the ARPA 1994 test data. On this task, though "good" trees, in terms of the likelihood of the adaptation training data, were generated, neither of the optimal clustering schemes yielded gains in recognition performance. The performance of the cross-validation scheme was found to be comparable to an empirically determined threshold scheme. The best performance was achieved using iterative MLLR, which outperformed both fixed classes and threshold-based schemes.
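To make the estimation step concrete, the following is a minimal sketch of how a single mean transformation might be estimated for one regression class. It assumes diagonal-covariance Gaussian components and uses the standard row-by-row closed-form maximum likelihood solution for an MLLR mean transform. The function names, argument layout, and the assumption that component posteriors are already available are illustrative choices, not details taken from this paper.

```python
import numpy as np


def estimate_mllr_mean_transform(means, variances, gammas, obs):
    """Estimate one MLLR mean transform W, shape (d, d+1), for a regression class.

    Hypothetical interface (an assumption for illustration):
      means, variances: (M, d) arrays for M diagonal-covariance Gaussians
      gammas:           (T, M) component posteriors for each adaptation frame
      obs:              (T, d) adaptation observations
    The adapted mean of component m is W @ [1, mu_m].
    """
    M, d = means.shape
    xi = np.hstack([np.ones((M, 1)), means])  # extended means [1, mu_m]
    occ = gammas.sum(axis=0)                  # occupancy per component, (M,)
    obs_sum = gammas.T @ obs                  # posterior-weighted observation sums, (M, d)

    W = np.zeros((d, d + 1))
    for i in range(d):
        inv_var = 1.0 / variances[:, i]
        # G_i = sum_m occ_m / var_{m,i} * xi_m xi_m^T
        G = (xi * (occ * inv_var)[:, None]).T @ xi
        # k_i = sum_m (sum_t gamma_m(t) o_i(t)) / var_{m,i} * xi_m
        k = xi.T @ (obs_sum[:, i] * inv_var)
        # Row i of W solves G_i w_i = k_i
        W[i] = np.linalg.solve(G, k)
    return W


def adapt_means(W, means):
    """Apply the transform to every component mean: mu_hat = W @ [1, mu]."""
    xi = np.hstack([np.ones((means.shape[0], 1)), means])
    return xi @ W.T
```

Each row of the transform requires solving a (d+1)-dimensional linear system, which is only well conditioned when the class has accumulated enough adaptation data. This is one way to see why the number of regression classes has to be matched to the amount of data available.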