Global, local and unique decompositions in OnPLS for multiblock data analysis.

OnPLS is an extension of O2PLS that decomposes a set of matrices, in either multiblock or path model analysis, such that each matrix consists of two parts: a globally joint part containing variation shared with all other connected matrices, and a part that contains locally joint and unique variation, i.e. variation that is shared with some, but not all, other connected matrices or that is unique in a single matrix. A further extension of OnPLS suggested here decomposes the part that is not globally joint into locally joint and unique parts. To achieve this it uses the OnPLS method to first find and extract a globally joint model, and then applies OnPLS recursively to subsets of matrices that contain the locally joint and unique variation remaining after the globally joint variation has been extracted. This results in a set of locally joint models. The variation that is left after the globally joint and locally joint variation has been extracted is (by construction) not related to the other matrices and thus represents the strictly unique variation in each matrix. The method's utility is demonstrated by its application to both a simulated data set and a real data set acquired from metabolomic, proteomic and transcriptomic profiling of three genotypes of hybrid aspen. The results show that OnPLS can successfully decompose each matrix into global, local and unique models, resulting in lower numbers of globally joint components and higher intercorrelations of scores. OnPLS also increases the interpretability of models of connected matrices, because of the locally joint and unique models it generates.
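The global, local and unique hierarchy described above can be made concrete by listing which models such a scheme would fit for a given set of matrices. The following is only a minimal sketch of that bookkeeping, not the OnPLS algorithm itself: the function name `decomposition_plan` is hypothetical, every matrix is assumed to be connected to every other, and visiting larger subsets before smaller ones is an assumption rather than something the abstract specifies. No scores or loadings are computed.

```python
from itertools import combinations


def decomposition_plan(blocks):
    """Return the (kind, members) models of a global -> local -> unique scheme.

    Hypothetical helper for illustration only; it enumerates models and
    does not perform any OnPLS computation.
    """
    n = len(blocks)
    # Globally joint model: variation shared by all connected matrices.
    plan = [("global", tuple(blocks))]
    # Locally joint models: subsets of at least two, but not all, matrices.
    # Visiting larger subsets first is an assumption of this sketch.
    for size in range(n - 1, 1, -1):
        for subset in combinations(blocks, size):
            plan.append(("local", subset))
    # Unique models: what remains in each single matrix afterwards.
    for block in blocks:
        plan.append(("unique", (block,)))
    return plan


if __name__ == "__main__":
    blocks = ["metabolomic", "proteomic", "transcriptomic"]
    for kind, members in decomposition_plan(blocks):
        print(f"{kind:6s} {', '.join(members)}")
```

For the three omics blocks mentioned in the abstract this enumerates one globally joint model, three pairwise locally joint models and three unique models. In a path model, where not all matrices are connected, only connected subsets would be candidates.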
