Array Normal Model and Incomplete Array Variate Observations

Missing data present an important challenge when dealing with high-dimensional data arranged in the form of an array. The main purpose of this article is to introduce methods for estimating the parameters of the array variate normal probability model from partially observed multiway data. The methods developed here are useful for missing data imputation and for estimation of the mean and covariance parameters of multiway data. A review of array variate distributions is included. A multiway semi-parametric mixed-effects model that allows separation of multiway mean and covariance effects is also defined, and an efficient estimation algorithm based on the spectral decompositions of the covariance parameters is recommended. We demonstrate our methods with simulations and with real-life data involving the estimation of genotype and environment interaction effects on possibly correlated traits.
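
As a rough illustration of the kind of model the abstract refers to, the block below writes out a common formulation of an array variate normal distribution with a separable (Kronecker product) covariance, together with the standard Gaussian conditional moments that underlie imputation of missing entries. The notation is an assumption made for this sketch and is not taken from the article: $X$ is a $k$-way array with dimensions $d_1,\dots,d_k$, $M$ is its mean array, $\Sigma_j$ is the $d_j \times d_j$ covariance matrix along mode $j$, $\operatorname{vec}$ stacks the array with the first index varying fastest, and the subscripts $o$ and $u$ mark the observed and unobserved entries.

```latex
% Sketch under assumed notation; not necessarily the article's parameterization.
% Array variate normal model with separable covariance:
\[
  \operatorname{vec}(X) \sim N\!\left(\operatorname{vec}(M),\; \Sigma\right),
  \qquad
  \Sigma = \Sigma_k \otimes \Sigma_{k-1} \otimes \cdots \otimes \Sigma_1 .
\]
% Partition x = vec(X) and mu = vec(M) into observed (o) and unobserved (u) parts:
\[
  x = \begin{pmatrix} x_o \\ x_u \end{pmatrix},
  \qquad
  \mu = \begin{pmatrix} \mu_o \\ \mu_u \end{pmatrix},
  \qquad
  \Sigma = \begin{pmatrix} \Sigma_{oo} & \Sigma_{ou} \\ \Sigma_{uo} & \Sigma_{uu} \end{pmatrix}.
\]
% Conditional moments of the unobserved entries given the observed ones:
\[
  \mathrm{E}\!\left[x_u \mid x_o\right]
    = \mu_u + \Sigma_{uo}\,\Sigma_{oo}^{-1}\,(x_o - \mu_o),
  \qquad
  \operatorname{Cov}\!\left[x_u \mid x_o\right]
    = \Sigma_{uu} - \Sigma_{uo}\,\Sigma_{oo}^{-1}\,\Sigma_{ou}.
\]
```

With the parameters held fixed, the conditional mean gives a natural imputed value for the missing entries. How the article alternates between such imputation and parameter updates is not stated in the abstract, so it is not sketched here.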
