Regularized robust estimation of mean and covariance matrix for incomplete data

Abstract This paper considers robust estimation of the mean and covariance matrix from incomplete multivariate observations with a monotone missing-data pattern. First, we develop two efficient numerical algorithms for the existing robust estimator for monotone incomplete data, namely the maximum likelihood (ML) estimator that assumes the samples are drawn from a Student’s t-distribution. The proposed algorithms can be more than one order of magnitude faster than the existing algorithms. Then, because the Student’s t ML estimator becomes unreliable or even inapplicable when the number of samples is small relative to the dimension of the parameters, we propose a regularized robust estimator, defined as the maximizer of a penalized log-likelihood. The penalty term is constructed so that a prior target is its global maximizer, and the estimator shrinks the mean and covariance matrix towards this target. In addition, two numerical algorithms are derived for the regularized estimator. Numerical simulations show the fast convergence of the proposed algorithms and the good estimation accuracy of the proposed regularized estimator.
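To make the term "monotone missing-data pattern" concrete, the following minimal sketch checks whether a data set is monotone in its given variable order: once a variable is missing in an observation, all later variables in that observation are missing too. The function name `is_monotone` and the convention of marking missing entries with `None` or NaN are assumptions made for illustration; they are not taken from the paper.

```python
import math


def _is_missing(value):
    # Illustrative convention (an assumption): None or NaN marks a missing entry.
    return value is None or (isinstance(value, float) and math.isnan(value))


def is_monotone(rows):
    """Return True if every row, read left to right, has all of its observed
    entries before its first missing entry (monotone in the given order)."""
    for row in rows:
        seen_missing = False
        for value in row:
            missing = _is_missing(value)
            if seen_missing and not missing:
                # An observed value appears after a missing one in this row.
                return False
            seen_missing = seen_missing or missing
    return True


# Hypothetical data: three variables whose histories differ in length.
monotone_data = [
    [1.0, 2.0, 3.0],
    [1.5, 2.5, None],
    [0.7, None, None],
]
non_monotone_data = [
    [1.0, None, 3.0],
]
```

Here `is_monotone(monotone_data)` holds while `is_monotone(non_monotone_data)` does not. The check only tests the column order as given; a data set may still become monotone after its variables are reordered.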
