OPLS discriminant analysis: combining the strengths of PLS‐DA and SIMCA classification

The orthogonal projections to latent structures (OPLS) method is investigated for the purpose of discriminant analysis (OPLS‐DA). We demonstrate how class‐orthogonal variation can be exploited to improve classification performance when the classes differ in their within‐class variation, in analogy with soft independent modelling of class analogy (SIMCA) classification. If no such variation is present in the classes, the predictions are largely equivalent to those of traditional supervised classification using partial least squares discriminant analysis (PLS‐DA). A discriminatory strategy is thus outlined that combines the strengths of PLS‐DA and SIMCA classification within the framework of the OPLS‐DA method. Furthermore, resampling methods are used to generate distributions of predicted classification results, from which classification belief is assessed, so that the class‐orthogonal variation can be used in a proper statistical context. The proposed decision rule is compared with common decision rules and is shown to produce comparable or less class‐biased classification results. Copyright © 2007 John Wiley & Sons, Ltd.
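
The two ideas in the abstract, removing class‐orthogonal variation before prediction and using resampling to attach a belief to each classification, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a two‐class problem with y coded 0/1, a single predictive and a single orthogonal component, a stratified bootstrap as the resampling scheme, and a fixed 0.5 threshold as the decision rule; the abstract does not specify the paper's own decision rule or resampling details. All function names (`fit_opls_da`, `predict`, `class_belief`, `make_data`) and the synthetic data are hypothetical.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fit_opls_da(X, y):
    """Fit one predictive and one class-orthogonal component (y coded 0/1)."""
    n = len(X)
    mx = [sum(c) / n for c in zip(*X)]
    my = sum(y) / n
    Xc = [[v - m for v, m in zip(r, mx)] for r in X]   # mean-centred X
    yc = [v - my for v in y]                           # mean-centred y
    cols = list(zip(*Xc))
    w = [dot(c, yc) for c in cols]                     # predictive weights ~ X^T y
    nw = dot(w, w) ** 0.5
    w = [v / nw for v in w]
    t = [dot(r, w) for r in Xc]                        # predictive scores
    tt = dot(t, t)
    p = [dot(c, t) / tt for c in cols]                 # loadings
    wp = dot(w, p)
    w_o = [pi - wp * wi for pi, wi in zip(p, w)]       # part of p orthogonal to w
    no = dot(w_o, w_o) ** 0.5
    if no > 1e-12:
        w_o = [v / no for v in w_o]
        t_o = [dot(r, w_o) for r in Xc]                # orthogonal scores
        p_o = [dot(c, t_o) / dot(t_o, t_o) for c in cols]
    else:
        # No class-orthogonal variation: reduces to one-component PLS-DA.
        w_o, p_o, t_o = [0.0] * len(w), [0.0] * len(w), [0.0] * n
    # Remove the orthogonal component, then regress y on the filtered scores.
    Xf = [[v - s * q for v, q in zip(r, p_o)] for r, s in zip(Xc, t_o)]
    tf = [dot(r, w) for r in Xf]
    b = dot(tf, yc) / dot(tf, tf)
    return {"mx": mx, "my": my, "w": w, "w_o": w_o, "p_o": p_o, "b": b}

def predict(model, x):
    """Predicted y for one sample after filtering its orthogonal part."""
    xc = [v - m for v, m in zip(x, model["mx"])]
    s = dot(xc, model["w_o"])
    xf = [v - s * q for v, q in zip(xc, model["p_o"])]
    return model["my"] + model["b"] * dot(xf, model["w"])

def class_belief(X, y, x_new, n_boot=200, threshold=0.5, seed=0):
    """Fraction of stratified bootstrap models that assign x_new to class 1."""
    rng = random.Random(seed)
    idx0 = [i for i, v in enumerate(y) if v == 0]
    idx1 = [i for i, v in enumerate(y) if v == 1]
    votes = 0
    for _ in range(n_boot):
        pick = [rng.choice(idx0) for _ in idx0] + [rng.choice(idx1) for _ in idx1]
        model = fit_opls_da([X[i] for i in pick], [y[i] for i in pick])
        votes += predict(model, x_new) > threshold
    return votes / n_boot

def make_data(n=20, seed=1):
    """Synthetic data: classes differ in mean (feature 0) and in spread (feature 1)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, spread in ((0, 0.2), (1, 2.0)):
        for _ in range(n):
            X.append([4.0 * label + rng.gauss(0, 0.3),
                      rng.gauss(0, spread),
                      rng.gauss(0, 0.3)])
            y.append(label)
    return X, y

X, y = make_data()
print(class_belief(X, y, [4.0, 0.0, 0.0]))  # belief that the sample is class 1
```

In this sketch the belief is simply the share of bootstrap models that vote for class 1, so a value near 0.5 flags a sample whose classification is unstable under resampling. When the orthogonal weight vector vanishes, the fit falls back to a one‐component PLS‐DA model, which mirrors the abstract's remark that the two methods give largely equivalent predictions when no class‐orthogonal variation is present.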
