A calibration method for non-positive definite covariance matrix in multivariate data analysis

Covariance matrices that fail to be positive definite arise often in covariance estimation. Approaches to this problem exist, but they lack solid theoretical support. In this paper, we propose a unified statistical and numerical matrix calibration that finds the optimal positive definite surrogate in the sense of the Frobenius norm. The proposed algorithm can be applied directly to any estimated covariance matrix. Numerical results show that the calibrated matrix is typically closer to the true covariance, while making only limited changes to the original covariance structure.
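
To make the idea of a Frobenius-norm surrogate concrete, here is a minimal sketch of the classical eigenvalue-clipping projection. It is not the calibration algorithm proposed in the paper, whose details are not given in this abstract. The function name `nearest_pd_frobenius` and the eigenvalue floor `eps` are hypothetical choices made for illustration. For a symmetric input, flooring the eigenvalues at `eps` yields the closest matrix in Frobenius norm among symmetric matrices whose eigenvalues are all at least `eps`.

```python
import numpy as np


def nearest_pd_frobenius(S, eps=1e-8):
    """Project S onto symmetric matrices with eigenvalues >= eps.

    Assumption: eps is an illustrative floor, not a value taken from the paper.
    """
    S = np.asarray(S, dtype=float)
    # Symmetrize first, since an estimate may be slightly asymmetric.
    B = (S + S.T) / 2.0
    # Spectral decomposition of the symmetric part.
    w, V = np.linalg.eigh(B)
    # Floor the eigenvalues so the result is strictly positive definite.
    w_clipped = np.maximum(w, eps)
    # Rebuild the matrix and symmetrize again to remove rounding noise.
    C = (V * w_clipped) @ V.T
    return (C + C.T) / 2.0


# An indefinite "covariance" estimate with mutually inconsistent correlations.
S = np.array([[1.0, 0.9, -0.9],
              [0.9, 1.0, 0.9],
              [-0.9, 0.9, 1.0]])
S_cal = nearest_pd_frobenius(S)

print("smallest eigenvalue before:", np.linalg.eigvalsh(S).min())
print("smallest eigenvalue after: ", np.linalg.eigvalsh(S_cal).min())
print("Frobenius distance moved:  ", np.linalg.norm(S_cal - S, "fro"))
```

The Frobenius distance printed at the end gives a rough measure of how much the original covariance structure was changed, which is the trade-off the abstract refers to.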
