Multivariate analysis of aquatic toxicity data with PLS

A common task in data analysis is to model the relationships between two sets of variables, the descriptor matrix X and the response matrix Y. A typical example in aquatic science concerns the relationships between the chemical composition of a number of samples (X) and their toxicity to a number of different aquatic species (Y). This modelling is done in order to understand the variation of Y in terms of the variation of X, but also to lay the ground for predicting Y of unknown observations based on their known X-data. Correlations of this type are usually expressed as regression models, and are rather common in aquatic science. Often, however, the multivariate X and Y matrices invalidate the use of multiple linear regression (MLR) and call for methods which are better suited for collinear data. In this context, multivariate projection methods represent a highly useful alternative, in particular partial least squares projections to latent structures (PLS). This paper introduces PLS, highlights its strengths and presents applications of PLS to modelling aquatic toxicity data. A general discussion of regression, comparing MLR and PLS, is provided.
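
To make the collinearity argument concrete, the following is a minimal sketch in plain Python, not the procedure used in the paper. It assumes a single response and a single latent component (a one-component PLS1 model), and the toy data and the helper names `center`, `dot` and `pls1_one_component` are invented for illustration. The two descriptor columns are exactly collinear, so the determinant of X'X is zero and the MLR normal equations have no unique solution, whereas the PLS projection still yields a usable model.

```python
import math


def center(cols):
    """Subtract each column's mean (data are stored as a list of columns)."""
    return [[v - sum(c) / len(c) for v in c] for c in cols]


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def pls1_one_component(x_cols, y):
    """One-component PLS1 on centered data.

    Returns coefficients b such that y_hat[i] = sum_j b[j] * x_cols[j][i].
    """
    # Weight vector: the direction in X-space with maximal covariance with y.
    w = [dot(c, y) for c in x_cols]
    norm = math.sqrt(dot(w, w))
    w = [v / norm for v in w]
    # Scores: projection of every observation onto the weight vector.
    t = [sum(w[j] * x_cols[j][i] for j in range(len(x_cols)))
         for i in range(len(y))]
    # Regress y on the single score vector, then map back to the X variables.
    q = dot(t, y) / dot(t, t)
    return [wj * q for wj in w]


# Hypothetical toy data: the second descriptor is exactly twice the first,
# and the response depends linearly on the first descriptor.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0 * v for v in x1]
y = [3.0 * v for v in x1]

x_cols = center([x1, x2])
y_c = center([y])[0]

# Determinant of the 2x2 matrix X'X; zero means MLR cannot be solved uniquely.
det_xtx = (dot(x_cols[0], x_cols[0]) * dot(x_cols[1], x_cols[1])
           - dot(x_cols[0], x_cols[1]) ** 2)

b = pls1_one_component(x_cols, y_c)
y_hat = [b[0] * u + b[1] * v for u, v in zip(x_cols[0], x_cols[1])]
max_residual = max(abs(p - o) for p, o in zip(y_hat, y_c))

print("det(X'X) =", det_xtx)
print("PLS coefficients =", b)
print("largest residual =", max_residual)
```

In this sketch PLS spreads the effect over both collinear descriptors in proportion to their covariance with the response, rather than attempting to separate contributions that the data cannot distinguish. A realistic analysis would use several components, several responses and cross-validation to choose the model dimensionality.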
