An ExPosition of multivariate analysis with the singular value decomposition in R

ExPosition is a new comprehensive R package providing crisp graphics and implementing multivariate analysis methods based on the singular value decomposition (SVD). The core techniques implemented in ExPosition are principal components analysis, (metric) multidimensional scaling, correspondence analysis, and several of their recent extensions, such as barycentric discriminant analyses (e.g., discriminant correspondence analysis), multi-table analyses (e.g., multiple factor analysis, STATIS, and DISTATIS), and non-parametric resampling techniques (e.g., permutation and bootstrap). Several examples highlight the major differences between ExPosition and similar packages. Finally, the future directions of ExPosition are discussed.
