Discriminative Mixtures of Sparse Latent Fields for Risk Management

We describe a simple and efficient approach to learning the structure of sparse high-dimensional latent variable models. Standard algorithms either learn structures of specific predefined forms or estimate sparse graphs in the data space, ignoring the possibility of latent variables. In contrast, our method learns rich dependencies and allows for latent variables that may confound the relations between the observations. We extend the model to conditional mixtures with side information and non-Gaussian marginal distributions of the observations. We then show that our model may be used for learning sparse latent variable structures corresponding to multiple unknown states, and for uncovering features useful for explaining and predicting structural changes. We apply the model to real-world financial data with heavy-tailed marginals covering the low and high market volatility periods of 2005-2011. We show that our method tends to yield significantly higher test-data likelihoods than standard network learning methods that exploit the sparsity assumption. We also demonstrate that our approach may be practical for financial stress-testing and for visualizing dependencies between financial instruments.
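The claim that latent variables may confound the relations between observations can be illustrated with a small numerical sketch. It assumes a jointly Gaussian model, which is a simplification and not the method described above; the function name `marginal_precision` and the toy matrix `K` are hypothetical. The sketch builds a joint precision matrix in which four observed variables are conditionally independent given one latent variable, then integrates out the latent variable using the Schur complement. The resulting precision over the observations alone is dense, so a sparse graph estimated purely in the data space would miss the simpler latent structure.

```python
import numpy as np

def marginal_precision(K, n_obs):
    """Precision of the first n_obs variables after integrating out the rest.

    Uses the Schur complement: K_oo - K_oh K_hh^{-1} K_ho.
    """
    K_oo = K[:n_obs, :n_obs]
    K_oh = K[:n_obs, n_obs:]
    K_hh = K[n_obs:, n_obs:]
    return K_oo - K_oh @ np.linalg.solve(K_hh, K_oh.T)

# Toy joint precision (an assumption for illustration): 4 observed variables
# followed by 1 latent variable. Observed variables interact only through
# the latent one, so the observed block of K is diagonal (sparse).
n_obs = 4
K = 2.0 * np.eye(n_obs + 1)
K[:n_obs, n_obs] = 0.5
K[n_obs, :n_obs] = 0.5

M = marginal_precision(K, n_obs)

# Count off-diagonal nonzeros before and after marginalizing the latent variable.
off_diag = ~np.eye(n_obs, dtype=bool)
print("off-diagonal nonzeros given the latent variable:",
      np.count_nonzero(K[:n_obs, :n_obs][off_diag]))
print("off-diagonal nonzeros after marginalization:",
      np.count_nonzero(np.abs(M[off_diag]) > 1e-12))
```

In this toy case the marginal precision equals a sparse (diagonal) term minus a rank-one term, which is why accounting for latent variables can recover a much sparser description of the same data.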
