Sparse Gaussian Conditional Random Fields: Algorithms, Theory, and Application to Energy Forecasting

This paper considers the sparse Gaussian conditional random field, a discriminative extension of sparse inverse covariance estimation, in which we use convex methods to learn a high-dimensional conditional distribution of outputs given inputs. The model has been proposed by multiple researchers within the past year, yet previous papers have been substantially limited in their analysis of the method and in their ability to solve large-scale problems. In this paper, we make three contributions: 1) we develop a second-order active-set method that is several orders of magnitude faster than previously proposed optimization approaches for this problem, 2) we analyze the model from a theoretical standpoint, improving upon past bounds with convergence rates that depend logarithmically on the data dimension, and 3) we apply the method to large-scale energy forecasting problems, demonstrating state-of-the-art performance on two real-world tasks.
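
For concreteness, the model referred to above is commonly parameterized as sketched below. This is the standard sparse Gaussian CRF formulation from the related literature; the exact scaling constants, the penalty convention (e.g., whether the diagonal of \(\Lambda\) is penalized), and the definitions of the empirical moment matrices are assumptions here rather than details taken from the abstract.

\[
p(y \mid x;\, \Lambda, \Theta) \;=\; \frac{1}{Z(x)} \exp\!\Big( -\tfrac{1}{2}\, y^\top \Lambda\, y \;-\; x^\top \Theta\, y \Big),
\qquad
Z(x) \;=\; c\, |\Lambda|^{-1/2} \exp\!\Big( \tfrac{1}{2}\, x^\top \Theta\, \Lambda^{-1} \Theta^\top x \Big),
\]
with outputs \(y \in \mathbb{R}^p\), inputs \(x \in \mathbb{R}^n\), output precision matrix \(\Lambda \succ 0\), and input-output coupling \(\Theta \in \mathbb{R}^{n \times p}\). Estimation then minimizes the \(\ell_1\)-regularized negative log-likelihood, which is jointly convex in \((\Lambda, \Theta)\):
\[
\min_{\Lambda \succ 0,\; \Theta}\;\; -\log |\Lambda| \;+\; \operatorname{tr}(S_{yy} \Lambda) \;+\; 2\, \operatorname{tr}(S_{yx} \Theta) \;+\; \operatorname{tr}\!\big(\Lambda^{-1} \Theta^\top S_{xx} \Theta\big) \;+\; \lambda \big( \|\Lambda\|_1 + \|\Theta\|_1 \big),
\]
where \(S_{yy} = \tfrac{1}{m} \sum_i y_i y_i^\top\), \(S_{yx} = \tfrac{1}{m} \sum_i y_i x_i^\top\), and \(S_{xx} = \tfrac{1}{m} \sum_i x_i x_i^\top\) are empirical second-moment matrices over \(m\) training pairs.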
