Unsupervised multiblock data analysis: A unified approach and extensions

Abstract: We outline a unified approach to the analysis of multiblock data that encompasses several strategies, such as Generalized Canonical Correlation Analysis (GCCA), Multiblock Principal Components Analysis (MB-PCA), Hierarchical Principal Components Analysis (H-PCA) and ComDim. These methods are based on the determination of global and block components. The unified approach postulates, on the one hand, two link functions that relate the block components to their associated global components and, on the other hand, two summing-up expressions that compute the global components from their associated block components. Not only are several well-known methods retrieved, but we also introduce a variant of GCCA. More generally, we point to other possible extensions, emphasizing that the unified approach, besides being simple, is versatile. We also show how this approach, although basically unsupervised, can be adapted to yield a supervised method for prediction purposes. Illustrations based on simulated and real case studies are discussed.
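To make the interplay between global and block components more concrete, the following Python sketch alternates between one link function and one summing-up expression. Both choices are assumptions made for illustration: the abstract does not spell out the expressions, so the link function `t_k = X_k X_k^T t` and the normalized sum `t = sum_k t_k / ||sum_k t_k||` are not claimed to be those of the paper. The function name `global_and_block_components` and its parameters are likewise hypothetical. With these particular choices the iteration is a power iteration on `sum_k X_k X_k^T`, so the global component converges to the first left singular vector of the concatenated blocks.

```python
import numpy as np


def global_and_block_components(blocks, n_iter=1000, tol=1e-12, seed=0):
    """Alternate between block components and a global component.

    Illustrative assumptions (not taken from the paper):
      link function:     t_k = X_k X_k^T t
      summing-up step:   t   = sum_k t_k, rescaled to unit norm
    `blocks` is a list of column-centered arrays sharing the same rows.
    """
    rng = np.random.default_rng(seed)
    n_rows = blocks[0].shape[0]

    # Random unit-norm starting vector for the global component.
    t = rng.standard_normal(n_rows)
    t /= np.linalg.norm(t)

    for _ in range(n_iter):
        # Link function: each block component is derived from the global one.
        block_components = [X @ (X.T @ t) for X in blocks]

        # Summing-up expression: the global component is rebuilt from the blocks.
        t_new = np.sum(block_components, axis=0)
        t_new /= np.linalg.norm(t_new)

        converged = np.linalg.norm(t_new - t) < tol
        t = t_new
        if converged:
            break

    # Recompute block components so they match the final global component.
    block_components = [X @ (X.T @ t) for X in blocks]
    return t, block_components


if __name__ == "__main__":
    # Two simulated blocks sharing one latent direction z (made-up data).
    rng = np.random.default_rng(1)
    z = rng.standard_normal(30)
    X1 = 5 * np.outer(z, rng.standard_normal(4)) + rng.standard_normal((30, 4))
    X2 = 5 * np.outer(z, rng.standard_normal(6)) + rng.standard_normal((30, 6))
    blocks = [X - X.mean(axis=0) for X in (X1, X2)]

    t, block_components = global_and_block_components(blocks)
    print("global component norm:", np.linalg.norm(t))
    print("number of block components:", len(block_components))
```

Swapping in a different link function or a different summing-up expression inside the loop is how other methods would be explored under this scheme; which combinations recover GCCA, H-PCA or ComDim is established in the paper itself and is not reproduced here.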
