Learning explanations that are hard to vary
Giambattista Parascandolo | Alexander Neitz | Antonio Orvieto | Luigi Gresele | B. Schölkopf
[1] Aaron C. Courville,et al. Out-of-Distribution Generalization via Risk Extrapolation (REx) , 2020, ICML.
[2] Andrew Kyle Lampinen,et al. What shapes feature representations? Exploring datasets, architectures, and training , 2020, NeurIPS.
[3] Dan Alistarh,et al. WoodFisher: Efficient second-order approximations for model compression , 2020, ArXiv.
[4] Kush R. Varshney,et al. Invariant Risk Minimization Games , 2020, ICML.
[5] Michael W. Mahoney,et al. PyHessian: Neural Networks Through the Lens of the Hessian , 2019, 2020 IEEE International Conference on Big Data (Big Data).
[6] J. Schulman,et al. Leveraging Procedural Generation to Benchmark Reinforcement Learning , 2019, ICML.
[7] Christina Heinze-Deml,et al. Conditional variance penalties and domain shift robustness , 2017, Machine Learning.
[8] Bernhard Schölkopf,et al. Causality for Machine Learning , 2019, ArXiv.
[9] Anas Barakat,et al. Convergence Analysis of a Momentum Algorithm with Adaptive Step Size for Non Convex Optimization , 2019, ArXiv.
[10] Guodong Zhang,et al. Which Algorithmic Choices Matter at Which Batch Sizes? Insights From a Noisy Quadratic Model , 2019, NeurIPS.
[11] David Lopez-Paz,et al. Invariant Risk Minimization , 2019, ArXiv.
[12] Dominik Janzing,et al. Causal Regularization , 2019, NeurIPS.
[13] Xiaoxia Wu,et al. AdaGrad stepsizes: Sharp convergence over nonconvex landscapes, from any initialization , 2018, ICML.
[14] Aurélien Lucchi,et al. Ellipsoidal Trust Region Methods and the Marginal Value of Hessian Information for Neural Network Training , 2019, ArXiv.
[15] Greg Turk,et al. Learning Novel Policies For Tasks , 2019, ICML.
[16] Srini Narayanan,et al. Stiffness: A New Perspective on Generalization in Neural Networks , 2019, ArXiv.
[17] Suchi Saria,et al. Preventing Failures Due to Dataset Shift: Learning Predictive Models That Transport , 2018, AISTATS.
[18] Taehoon Kim,et al. Quantifying Generalization in Reinforcement Learning , 2018, ICML.
[19] Marco Loog,et al. Semi-Generative Modelling: Covariate-Shift Adaptation with Cause and Effect Features , 2018, AISTATS.
[20] Swami Sankaranarayanan,et al. MetaReg: Towards Domain Generalization using Meta-Regularization , 2018, NeurIPS.
[21] Massoud Pedram,et al. Gradient Agreement as an Optimization Objective for Meta-Learning , 2018, ArXiv.
[22] Razvan Pascanu,et al. Adapting Auxiliary Losses Using Gradient Similarity , 2018, ArXiv.
[23] Thomas Hofmann,et al. Escaping Saddles with Stochastic Gradients , 2018, ICML.
[24] Bernhard Schölkopf,et al. Learning Independent Causal Mechanisms , 2017, ICML.
[25] Michael I. Jordan,et al. Accelerated Gradient Descent Escapes Saddle Points Faster than Gradient Descent , 2017, COLT.
[26] Christina Heinze-Deml,et al. Invariant Causal Prediction for Nonlinear Models , 2017, Journal of Causal Inference.
[27] Yann Dauphin,et al. Empirical Analysis of the Hessian of Over-Parametrized Neural Networks , 2017, ICLR.
[28] Jorge Nocedal,et al. Optimization Methods for Large-Scale Machine Learning , 2016, SIAM Rev..
[29] Bernhard Schölkopf,et al. Invariant Models for Causal Transfer Learning , 2015, J. Mach. Learn. Res..
[30] Bernhard Schölkopf,et al. Elements of Causal Inference: Foundations and Learning Algorithms , 2017 .
[31] Yoshua Bengio,et al. Three Factors Influencing Minima in SGD , 2017, ArXiv.
[32] Luca Antiga,et al. Automatic differentiation in PyTorch , 2017 .
[33] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[34] David M. Blei,et al. Stochastic Gradient Descent as Approximate Bayesian Inference , 2017, J. Mach. Learn. Res..
[35] Samy Bengio,et al. Understanding deep learning requires rethinking generalization , 2016, ICLR.
[36] Shiqian Ma,et al. Stochastic Quasi-Newton Methods for Nonconvex Stochastic Optimization , 2014, SIAM J. Optim..
[37] Michael I. Jordan,et al. Gradient Descent Converges to Minimizers , 2016, ArXiv.
[38] François Laviolette,et al. Domain-Adversarial Training of Neural Networks , 2015, J. Mach. Learn. Res..
[39] Alexandre M. Bayen,et al. Accelerated Mirror Descent in Continuous and Discrete Time , 2015, NIPS.
[40] Jonas Peters,et al. Causal inference by using invariant prediction: identification and confidence intervals , 2015, arXiv:1501.01332.
[41] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[42] Saeed Ghadimi,et al. Stochastic First- and Zeroth-Order Methods for Nonconvex Stochastic Programming , 2013, SIAM J. Optim..
[43] David D. Cox,et al. Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures , 2013, ICML.
[44] Bernhard Schölkopf,et al. Domain Generalization via Invariant Feature Representation , 2013, ICML.
[45] Howard Barnum,et al. The Beginning of Infinity: Explanations That Transform the World , 2012 .
[46] Bernhard Schölkopf,et al. On causal and anticausal learning , 2012, ICML.
[47] Motoaki Kawanabe,et al. Machine Learning in Non-Stationary Environments - Introduction to Covariate Shift Adaptation , 2012, Adaptive computation and machine learning.
[48] Bernhard Schölkopf,et al. Regression by dependence minimization and its application to causal inference in additive noise models , 2009, ICML '09.
[49] Neil D. Lawrence,et al. Dataset Shift in Machine Learning , 2009 .
[50] Carl E. Rasmussen,et al. Gaussian processes for machine learning , 2005, Adaptive computation and machine learning.
[51] Klaus-Robert Müller,et al. Covariate Shift Adaptation by Importance Weighted Cross Validation , 2007, J. Mach. Learn. Res..
[52] H. Kushner,et al. Stochastic Approximation and Recursive Algorithms and Applications , 2003 .
[53] Chi-Kwong Li. Geometric Means , 2003 .
[54] Vladimir N. Vapnik,et al. The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.
[55] J. Pearl. Causality: Models, Reasoning and Inference , 2000 .
[56] Stephen J. Wright,et al. Numerical Optimization , 2018, Fundamental Statistical Inference.
[57] K. Hoover. The Logic of Causal Inference: Econometrics and the Conditional Analysis of Causation , 1990, Economics and Philosophy.
[58] George Cybenko,et al. Approximation by superpositions of a sigmoidal function , 1989, Math. Control. Signals Syst..
[59] Yann LeCun,et al. Improving the convergence of back-propagation learning with second-order methods , 1989 .
[60] H. Simon,et al. Causal Ordering and Identifiability , 1977 .
[61] Leonid Hurwicz,et al. On the Structural Form of Interdependent Systems , 1966 .
[62] T. Haavelmo. The Statistical Implications of a System of Simultaneous Equations , 1943 .