Automatic model construction with Gaussian processes

This work was supported by the Natural Sciences and Engineering Research Council of Canada, the Cambridge Commonwealth Trust, Pembroke College, a grant from the Engineering and Physical Sciences Research Council, and a grant from Google.
