PAC-Bayes Objectives for Meta-Learning using Deep Probabilistic Programs

Recent approaches to meta-learning have used hierarchical PAC-Bayes bounds to transfer information between tasks via a common hyper-posterior. Single-level PAC-Bayes bounds can likewise be used as transfer-learning objectives, by using a prior learned on one task to constrain a posterior on another. However, existing methods adopting these approaches place restrictions on the form of the (hyper-)priors and/or posteriors used. We show how general and tractable PAC-Bayes bounds can be derived in a deep probabilistic programming (DPP) framework and used as objectives for transfer and meta-learning tasks. This allows both the prior and the posterior to be arbitrary DPPs, hyper-priors to be introduced easily, and variational techniques to be used during optimization. We test our framework on learning tasks defined over synthetic and biological data.
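To make the single-level case concrete, the sketch below minimizes a classical McAllester-style PAC-Bayes bound as a training objective: empirical risk plus sqrt((KL(Q || P) + log(2·sqrt(n)/delta)) / (2n)), with a diagonal-Gaussian posterior Q over linear-model weights and a fixed isotropic Gaussian prior P. This is not the paper's hierarchical objective or its DPP machinery; the PyTorch implementation, the confidence parameter delta, the prior scale, the cross-entropy surrogate for the bounded loss, and the synthetic data are all illustrative assumptions.

```python
# Minimal sketch (assumed setup, not the paper's exact objective): optimize a
# McAllester-style PAC-Bayes bound over a diagonal-Gaussian posterior Q on the
# weights of a linear classifier, with a fixed isotropic Gaussian prior P.
import math
import torch

n, d, delta = 200, 5, 0.05                        # sample size, features, confidence (assumed)
X = torch.randn(n, d)                             # toy synthetic inputs
y = (X @ torch.randn(d) > 0).float()              # toy binary labels

mu = torch.zeros(d, requires_grad=True)           # posterior mean
rho = torch.full((d,), -2.0, requires_grad=True)  # posterior log-scale parameter
prior_std = 1.0                                   # prior scale (assumed)
opt = torch.optim.Adam([mu, rho], lr=1e-2)

for step in range(2000):
    sigma = torch.nn.functional.softplus(rho)
    w = mu + sigma * torch.randn(d)               # one reparameterized sample from Q
    logits = X @ w
    # Cross-entropy is used here as a differentiable surrogate for the bounded loss.
    emp_risk = torch.nn.functional.binary_cross_entropy_with_logits(logits, y)
    # Closed-form KL between Q = N(mu, diag(sigma^2)) and P = N(0, prior_std^2 I).
    kl = 0.5 * torch.sum(
        (sigma**2 + mu**2) / prior_std**2 - 1.0 - 2.0 * torch.log(sigma / prior_std)
    )
    bound = emp_risk + torch.sqrt((kl + math.log(2.0 * math.sqrt(n) / delta)) / (2.0 * n))
    opt.zero_grad()
    bound.backward()
    opt.step()
```

In a transfer-learning variant of this sketch, the prior mean and scale would themselves be learned on a source task and then held fixed while the bound is minimized on the target task; the paper's framework generalizes this by letting both prior and posterior be arbitrary deep probabilistic programs.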
