Modeling Multivariate Spatial-Temporal Data with Latent Low-Dimensional Dynamics

High-dimensional multivariate spatial-temporal data arise in a wide range of applications; however, relatively few statistical methods can simultaneously handle spatial, temporal, and variable-wise dependencies in large data sets. In this paper, we propose a new approach that exploits the correlations across variables, space, and time to achieve dimension reduction and to facilitate spatial and temporal prediction in high-dimensional settings. The multivariate spatial-temporal process is represented as a linear transformation of a lower-dimensional latent factor process. The spatial dependence structure of the factor process is further represented non-parametrically in terms of latent empirical orthogonal functions. The low-dimensional structure is completely unknown in our setting and is learned entirely from data collected irregularly over space but regularly over time. We propose estimation and prediction methods based on the latent low-rank structures and establish the asymptotic properties of the resulting estimators and predictors. Extensive experiments on synthetic and real data sets show that, while the dimensions are reduced significantly, the spatial, temporal, and variable-wise covariance structures are largely preserved. The efficacy of our method is further confirmed by its prediction performance on both synthetic and real data sets.
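
To make the first modeling step concrete, the sketch below simulates a p-variate process observed at several sites and times as a linear transformation of an r-dimensional latent factor process (r < p), then recovers the loading space from the data alone. It is an illustration under stated assumptions, not the estimator proposed in the paper: the function names, the AR(1) factor dynamics, the independence of factors across sites, and the PCA-style eigen-decomposition of the pooled sample covariance are all hypothetical simplifications, and the empirical-orthogonal-function representation of the spatial dependence is omitted.

```python
import numpy as np


def simulate_latent_factor_data(p=8, r=2, n_sites=30, n_times=200,
                                noise_sd=0.1, seed=0):
    """Simulate Y[t, s, :] = A @ F[t, s, :] + noise (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Hypothetical p x r loading matrix with orthonormal columns.
    A, _ = np.linalg.qr(rng.standard_normal((p, r)))
    # Assumed factor dynamics: AR(1) in time, independent across sites.
    F = np.zeros((n_times, n_sites, r))
    F[0] = rng.standard_normal((n_sites, r))
    for t in range(1, n_times):
        F[t] = 0.7 * F[t - 1] + rng.standard_normal((n_sites, r))
    # Observed process: linear transformation of the factors plus white noise.
    Y = F @ A.T + noise_sd * rng.standard_normal((n_times, n_sites, p))
    return Y, A


def estimate_loading_space(Y, r):
    """Return an orthonormal basis for the top-r eigenspace of the
    pooled sample covariance across variables (a PCA-style stand-in)."""
    p = Y.shape[-1]
    X = Y.reshape(-1, p)              # pool all (time, site) observations
    X = X - X.mean(axis=0)            # center each variable
    S = X.T @ X / X.shape[0]          # p x p sample covariance
    _, eigvecs = np.linalg.eigh(S)    # eigenvalues in ascending order
    return eigvecs[:, ::-1][:, :r]    # eigenvectors of the r largest


Y, A = simulate_latent_factor_data()
A_hat = estimate_loading_space(Y, r=2)
# A is identifiable only up to rotation, so compare projection matrices.
subspace_error = np.linalg.norm(A @ A.T - A_hat @ A_hat.T)
print(f"Frobenius distance between true and estimated loading spaces: "
      f"{subspace_error:.4f}")
```

Because the loading matrix is identifiable only up to an invertible transformation, the comparison is made between the projection matrices onto the true and estimated column spaces rather than between the matrices themselves.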
