Site scores and conditional biplots in canonical correspondence analysis

Canonical correspondence analysis (CCA) is a widely used multivariate technique in community ecology. It produces a biplot that summarizes the data matrices involved in the analysis, and it yields two sets of site scores that can be used in that biplot. One set consists of site scores that are weighted averages of the species scores (WA scores); the other consists of site scores that are linear combinations of the environmental variables (LC scores). We show that the use of both sets of scores in a CCA biplot can be justified. The WA scores give the best possible representation of the species data, conditional on the representation of the weighted averages. Likewise, the LC scores give the best possible representation of the environmental variables, also conditional on the representation of the weighted averages and on the use of a Mahalanobis metric. The eigenvalues obtained in CCA indicate how well the species data are represented when LC scores are used. The quality of representation of the species data when WA scores are used can be computed from the CCA eigenvalues and the variances of the WA scores. Scalar products between WA scores and environmental variable vectors do not form a biplot of the environmental data. The theoretical results are illustrated with Australian data from freshwater ecology. Copyright © 2003 John Wiley & Sons, Ltd.
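As a rough illustration of how the two sets of site scores arise, the NumPy sketch below follows a common textbook formulation of CCA: chi-square standardized residuals, a weighted regression on the environmental variables, and a singular value decomposition of the fitted values. The function name `cca_site_scores`, the particular scaling of the scores, and the synthetic data are assumptions made here for illustration; they are not taken from the paper and do not reproduce its Australian data set.

```python
import numpy as np

def cca_site_scores(Y, X):
    """Sketch of CCA: return eigenvalues, species scores, WA and LC site scores.

    Y: sites x species abundance matrix (all row and column sums positive).
    X: sites x environmental-variables matrix.
    The scaling used here is one of several conventions (an assumption).
    """
    Y = np.asarray(Y, dtype=float)
    X = np.asarray(X, dtype=float)
    P = Y / Y.sum()
    r = P.sum(axis=1)                      # site weights (row masses)
    c = P.sum(axis=0)                      # species weights (column masses)
    # Chi-square standardized residuals
    Qbar = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    # Centre X by its weighted means, then apply square-root site weights
    Xc = X - r @ X
    Xw = np.sqrt(r)[:, None] * Xc
    # Weighted regression of the residuals on the environmental variables
    B, *_ = np.linalg.lstsq(Xw, Qbar, rcond=None)
    Qhat = Xw @ B
    # SVD of the fitted values gives the canonical axes
    U, s, Vt = np.linalg.svd(Qhat, full_matrices=False)
    k = int(np.sum(s > 1e-10))             # keep axes with non-zero singular values
    V = Vt[:k].T
    species = V / np.sqrt(c)[:, None]      # species scores
    wa = (Qbar @ V) / np.sqrt(r)[:, None]  # WA scores: weighted averages of species scores
    lc = (Qhat @ V) / np.sqrt(r)[:, None]  # LC scores: linear combinations of centred X
    return s[:k] ** 2, species, wa, lc

# Synthetic toy data (made up for this sketch)
rng = np.random.default_rng(0)
Y = rng.poisson(3.0, size=(8, 5)) + 1      # counts, shifted so every sum is positive
X = rng.normal(size=(8, 2))                # two hypothetical environmental variables
eig, species, wa, lc = cca_site_scores(Y, X)

r = Y.sum(axis=1) / Y.sum()
lc_var = (r[:, None] * lc**2).sum(axis=0)  # weighted variance of LC scores per axis
wa_var = (r[:, None] * wa**2).sum(axis=0)  # weighted variance of WA scores per axis
```

In this scaling, the weighted variance of the LC scores on each axis equals the corresponding eigenvalue, and each row of `wa` is the abundance-weighted average of the species scores for that site. The quantities `eig` and `wa_var` are the ingredients the abstract mentions for assessing the WA representation; the exact formula is given in the paper and is not reproduced here.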