Estimation of dimensionality in canonical correlation analysis

SUMMARY Two methods are given for estimating the dimensionality in canonical correlation analysis. One is based on Mallows's $C_p$ statistic in regression. The other uses an information criterion for choice of models. A Monte Carlo comparison of the sequential Bartlett-Lawley test procedure and the two methods is presented. The two methods are also applied to the estimation of the dimensionality in canonical variate analysis.

In the interpretation of the relationships between two sets of variates, the number of nonzero population canonical correlations may be called the dimensionality. One method for estimating the dimensionality is the sequential test procedure (Bartlett, 1941) based on the Bartlett-Lawley statistic $L_k(\alpha)$ for testing, at significance level $\alpha$, the hypothesis that the dimension is $k$. We obtain other criteria, $C_k$ and $A_k$, for determining the dimensionality, based respectively on Mallows's idea and on an information criterion for choice of models. We also obtain a modified criterion $\tilde{C}_k$. The methods $L_k(0.1)$, $L_k(0.05)$, $L_k(0.01)$, $C_k$, $\tilde{C}_k$ and $A_k$ are compared by simulation. A similar estimation problem arises in canonical variate analysis. It is shown that the two methods presented in this paper can be applied to the estimation of the dimensionality in the canonical variates procedure due to Fisher (1938).
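The sequential test procedure mentioned above can be illustrated with a minimal sketch. The text here does not state the formula for the Bartlett-Lawley statistic, so the form used below is an assumption: the commonly quoted version, with multiplier $n - k - (p+q+1)/2 + \sum_{j \le k} r_j^{-2}$ and a chi-squared reference distribution on $(p-k)(q-k)$ degrees of freedom. The function names, the convention $p \le q$, and the use of $n$ for the degrees of freedom of the sample covariance matrix are likewise illustrative choices, not notation taken from the paper.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable with integer df.

    Uses the closed-form finite sums for integer degrees of freedom.
    """
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2-1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) + 2*phi(z) * sum_r z^(2r-1) / (1*3*...*(2r-1))
    z = math.sqrt(x)
    term, total = z, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    phi = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    return math.erfc(z / math.sqrt(2)) + 2 * phi * total


def bartlett_lawley(r, k, n, p, q):
    """Assumed form of the statistic for the hypothesis 'dimension is k'.

    r holds the sample canonical correlations in decreasing order.
    """
    multiplier = n - k - (p + q + 1) / 2 + sum(1 / ri**2 for ri in r[:k])
    return -multiplier * sum(math.log(1 - ri**2) for ri in r[k:])


def estimate_dimension(r, n, p, q, alpha=0.05):
    """Return the smallest k whose hypothesis is not rejected at level alpha."""
    for k in range(len(r)):
        stat = bartlett_lawley(r, k, n, p, q)
        df = (p - k) * (q - k)
        if chi2_sf(stat, df) > alpha:
            return k
    return len(r)


if __name__ == "__main__":
    # Hypothetical sample canonical correlations, for illustration only.
    print(estimate_dimension([0.9, 0.05], n=100, p=2, q=3, alpha=0.05))
```

The procedure tests $k = 0, 1, \ldots$ in turn and stops at the first hypothesis that is not rejected, so the estimate depends on the chosen level $\alpha$. That dependence is why the comparison in the text treats $L_k(0.1)$, $L_k(0.05)$ and $L_k(0.01)$ as separate methods.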