Scaling up the Automatic Statistician: Scalable Structure Discovery using Gaussian Processes

Automating statistical modelling is a challenging problem in artificial intelligence. The Automatic Statistician takes a first step in this direction by employing a kernel search algorithm with Gaussian Processes (GPs) to provide interpretable statistical models for regression problems. However, this does not scale due to its $O(N^3)$ running time for model selection. We propose Scalable Kernel Composition (SKC), a scalable kernel search algorithm that extends the Automatic Statistician to bigger data sets. In doing so, we derive a cheap upper bound on the GP marginal likelihood that, together with the variational lower bound, sandwiches the marginal likelihood. We show that the upper bound is significantly tighter than the lower bound and thus useful for model selection.

Yee Whye Teh | Hyunjik Kim
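The abstract contrasts the $O(N^3)$ exact GP marginal likelihood with a cheaper variational lower bound. The sketch below illustrates that contrast in NumPy, assuming the lower bound meant is the inducing-point bound of Titsias (2009), with a fixed unit-variance squared-exponential kernel and hand-picked inducing inputs. The function names, the toy data, and the hyperparameter values are illustrative assumptions, not taken from the paper, and the paper's upper bound is not reproduced here.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0):
    """Unit-variance squared-exponential kernel between 1-D arrays a and b."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * sq_dist / lengthscale**2)

def exact_log_marginal(x, y, noise_var):
    """Exact GP log marginal likelihood; the N x N Cholesky costs O(N^3)."""
    n = len(x)
    K = rbf_kernel(x, x) + noise_var * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L, y)  # L^{-1} y, so alpha @ alpha = y^T K^{-1} y
    return -0.5 * alpha @ alpha - np.log(np.diag(L)).sum() - 0.5 * n * np.log(2 * np.pi)

def variational_lower_bound(x, y, z, noise_var, jitter=1e-8):
    """Inducing-point lower bound with M inducing inputs z; costs O(N M^2)."""
    n, m = len(x), len(z)
    Kmm = rbf_kernel(z, z) + jitter * np.eye(m)
    Kmn = rbf_kernel(z, x)
    Lm = np.linalg.cholesky(Kmm)
    # A is M x N and satisfies Q = Kmn^T Kmm^{-1} Kmn = noise_var * A^T A.
    A = np.linalg.solve(Lm, Kmn) / np.sqrt(noise_var)
    B = np.eye(m) + A @ A.T
    Lb = np.linalg.cholesky(B)
    c = np.linalg.solve(Lb, A @ y) / np.sqrt(noise_var)
    # log N(y | 0, Q + noise_var * I) via the determinant and inversion lemmas.
    log_det = 2.0 * np.log(np.diag(Lb)).sum() + n * np.log(noise_var)
    quad = y @ y / noise_var - c @ c
    log_gauss = -0.5 * (n * np.log(2 * np.pi) + log_det + quad)
    # Trace penalty tr(K - Q) / (2 noise_var); tr(K) = n for a unit-variance kernel.
    trace_term = (n - noise_var * np.sum(A * A)) / (2.0 * noise_var)
    return log_gauss - trace_term

# Toy data (an assumption for illustration only).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 200)
y = np.sin(x) + 0.3 * rng.standard_normal(x.shape)
noise_var = 0.09

exact = exact_log_marginal(x, y, noise_var)
lower_small = variational_lower_bound(x, y, np.linspace(0.0, 10.0, 5), noise_var)
lower_large = variational_lower_bound(x, y, np.linspace(0.0, 10.0, 17), noise_var)
print(f"exact: {exact:.2f}  lower (M=5): {lower_small:.2f}  lower (M=17): {lower_large:.2f}")
```

The lower bound never exceeds the exact value, and here the 5 inducing inputs are a subset of the 17, so the larger set gives a bound at least as tight. A matching cheap upper bound, as proposed in the abstract, would bracket the exact value from the other side without ever forming the $N \times N$ Cholesky factor.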
