Parsimonious parameterization of correlation matrices using truncated vines and factor analysis

Both in classical multivariate analysis and in modern copula modeling, correlation matrices are central to dependence modeling with multivariate normal distributions and copulas. Since the number of correlation parameters increases quadratically with the number of variables d, parsimonious parameterizations of large correlation matrices in terms of O(d) parameters are important. While factor analysis is commonly used for this purpose, vines are an attractive alternative: they are graphical models built on a sequence of trees, and they decompose a correlation matrix into algebraically independent correlations and partial correlations. Limiting the number of trees, known as truncation, yields parsimonious parameterizations of correlation matrices. Moreover, truncated vines and factor models may be joined to define a combined model that retains the individual benefits of each approach. The different parameterizations and their estimation from data are discussed. In particular, spanning tree algorithms for truncated vines and a modified EM algorithm for the combined factor–vine model are proposed and evaluated in a simulation study. Three applications to psychometric and finance data sets illustrate the different parsimonious models.
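To make the O(d) parameter count concrete, the following is a minimal sketch, not the paper's implementation, of the two simplest special cases: a one-factor model with d loadings, and a vine truncated after the first tree, here assumed to be a path 1-2-...-d (a D-vine) with d-1 adjacent correlations. With all higher-order partial correlations set to zero, the correlation between two variables on the path is the product of the adjacent correlations between them. The function names `one_factor_corr` and `truncated_dvine_corr` are hypothetical and chosen only for illustration.

```python
import numpy as np


def one_factor_corr(loadings):
    """Correlation matrix of a one-factor model (illustrative helper).

    With loadings a_1..a_d in (-1, 1), the off-diagonal entries are
    rho_ij = a_i * a_j and the diagonal is 1, so d parameters
    determine all d(d-1)/2 correlations.
    """
    a = np.asarray(loadings, dtype=float)
    if np.any(np.abs(a) >= 1.0):
        raise ValueError("loadings must lie strictly inside (-1, 1)")
    R = np.outer(a, a)
    np.fill_diagonal(R, 1.0)
    return R


def truncated_dvine_corr(adjacent):
    """Correlation matrix of a 1-truncated D-vine (illustrative helper).

    `adjacent` holds the d-1 correlations rho_{k,k+1} along the path.
    All partial correlations in trees 2 and higher are assumed zero,
    which gives rho_ij = product of rho_{k,k+1} for k = i..j-1.
    """
    r = np.asarray(adjacent, dtype=float)
    if np.any(np.abs(r) >= 1.0):
        raise ValueError("correlations must lie strictly inside (-1, 1)")
    d = r.size + 1
    R = np.eye(d)
    for i in range(d):
        for j in range(i + 1, d):
            R[i, j] = R[j, i] = np.prod(r[i:j])
    return R


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    print(one_factor_corr([0.5, 0.8, 0.6]))
    print(truncated_dvine_corr([0.5, 0.4, 0.7]))
```

Both constructions return a valid correlation matrix for any parameters strictly inside (-1, 1), which is one practical appeal of such parameterizations: the parameters can vary freely without a separate positive-definiteness check.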
