Correcting for Bias in the Canonical Redundancy Statistic

The canonical redundancy statistic, an estimate of the amount of shared variance between two sets of variables, has been proposed as an alternative to the squared canonical correlation coefficient for interpreting the results of canonical correlation analysis. The present study was undertaken to investigate the bias of the canonical redundancy statistic and to evaluate methods of correcting for that bias. Using Monte Carlo methods, the redundancy statistic was found to exhibit an amount of bias similar to that of the first squared canonical correlation coefficient. Bias is most affected by sample size: it was observed to decrease by half each time the sample size was doubled. Bias also decreases as the intercorrelations between the two sets of variables increase. The number of predictor and criterion variables, as well as the size of the correlations between variables within each set, has relatively little effect on bias. Two formulae originally developed to estimate the population value of the squared multiple correlation coefficient, Wherry and Olkin-Pratt, were found to adequately correct the bias of the redundancy statistic. As an example of the use and interpretation of canonical redundancy analysis, a set of data relating consumer health values and health behaviors was analyzed. Although the results indicate that only a small amount of redundant information is shared between these two sets of measures, the redundancy statistic provides a more conservative interpretation of this overlap than do the squared canonical correlation coefficients.
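The two corrections named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's procedure: the abstract does not say which variant of each formula was used or how the sample size and number of predictors were entered. The sketch assumes the commonly cited form of the Wherry formula, the first-order approximation to the Olkin-Pratt estimator, and the Stewart-Love definition of redundancy as the mean squared multiple correlation of the criterion variables on the predictor set. The function names (`wherry`, `olkin_pratt_approx`, `redundancy`) are hypothetical.

```python
def wherry(r2, n, p):
    """Wherry-type shrinkage of a squared correlation-type statistic.

    Assumed form: 1 - (1 - r2) * (n - 1) / (n - p - 1),
    with n = sample size and p = number of predictor variables.
    """
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def olkin_pratt_approx(r2, n, p):
    """First-order approximation to the Olkin-Pratt estimator.

    Assumed form:
    1 - ((n - 3) / (n - p - 1)) * (1 - r2) * (1 + 2 * (1 - r2) / (n - p + 1)).
    The exact estimator involves a hypergeometric function, omitted here.
    """
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1")
    u = 1.0 - r2
    return 1.0 - ((n - 3) / (n - p - 1)) * u * (1.0 + 2.0 * u / (n - p + 1))


def redundancy(r2_per_criterion):
    """Redundancy as the mean squared multiple correlation of each
    criterion variable regressed on the full predictor set (assumed
    Stewart-Love definition)."""
    if not r2_per_criterion:
        raise ValueError("need at least one criterion variable")
    return sum(r2_per_criterion) / len(r2_per_criterion)


# Hypothetical inputs, chosen only to exercise the functions.
sample_r2 = redundancy([0.2, 0.4])   # two criterion variables
print(wherry(sample_r2, n=50, p=4))
print(olkin_pratt_approx(sample_r2, n=50, p=4))
```

Both corrections shrink the sample value toward zero, and the shrinkage falls as n grows relative to p, which is consistent with the observation that bias depends mainly on sample size.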