A Generalized Stochastic Variational Bayesian Hyperparameter Learning Framework for Sparse Spectrum Gaussian Process Regression

While much research effort has been dedicated to scaling up sparse Gaussian process (GP) models based on inducing variables for big data, little attention has been paid to the less explored class of low-rank GP approximations that exploit the sparse spectral representation of a GP kernel. This paper advances the state of the art of sparse spectrum GP models so that they achieve competitive predictive performance on massive datasets. Our generalized framework of stochastic variational Bayesian sparse spectrum GP (sVBSSGP) models addresses their shortcomings by adopting a Bayesian treatment of the spectral frequencies to avoid overfitting, modeling these frequencies jointly in its variational distribution to enable their interaction a posteriori, and exploiting local data to boost predictive performance. However, these structural improvements yield a variational lower bound that is intractable to optimize. To resolve this, we exploit a variational parameterization trick that makes the bound amenable to stochastic optimization. Interestingly, the resulting stochastic gradient has a linearly decomposable structure, which we exploit to refine our stochastic optimization method so that it incurs constant time per iteration while remaining an unbiased estimator of the exact gradient of the variational lower bound. Empirical evaluation on real-world datasets shows that sVBSSGP outperforms state-of-the-art stochastic implementations of sparse GP models.
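To make "the sparse spectral representation of a GP kernel" concrete, the following is a minimal sketch of the general idea only, not the sVBSSGP method. It assumes a squared-exponential kernel with unit signal variance, whose spectral density is Gaussian, and approximates it with a finite set of sampled frequencies. The function names, the number of frequencies `m`, and the toy data are illustrative assumptions that do not come from the paper.

```python
import numpy as np

def ss_features(X, freqs):
    """Map inputs (n, d) to 2m trigonometric features from m frequencies (m, d)."""
    proj = X @ freqs.T                      # (n, m) inner products w_i^T x
    m = freqs.shape[0]
    return np.hstack([np.cos(proj), np.sin(proj)]) / np.sqrt(m)

def se_kernel(X, Z, lengthscale=1.0):
    """Exact squared-exponential kernel with unit signal variance."""
    sq = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Z**2, axis=1)[None, :]
          - 2.0 * X @ Z.T)
    return np.exp(-0.5 * sq / lengthscale**2)

rng = np.random.default_rng(0)
lengthscale = 1.0                           # assumed hyperparameter
X = rng.normal(size=(50, 2))                # toy inputs (assumption)

# Frequencies drawn from the kernel's spectral density, N(0, I / lengthscale^2).
m = 2000
freqs = rng.normal(scale=1.0 / lengthscale, size=(m, 2))

Phi = ss_features(X, freqs)                 # (50, 2m) feature matrix
K_approx = Phi @ Phi.T                      # low-rank kernel approximation
K_exact = se_kernel(X, X, lengthscale)
max_err = np.max(np.abs(K_approx - K_exact))
print(f"max abs error with m={m}: {max_err:.3f}")
```

In this sketch the frequencies are fixed random draws. The abstract describes a different treatment, in which the spectral frequencies receive a Bayesian treatment and are modeled jointly in the variational distribution.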
