[1] Dominik Janzing, et al. Causal Regularization, 2019, NeurIPS.
[2] K. Hoover. The Logic of Causal Inference: Econometrics and the Conditional Analysis of Causation, 1990, Economics and Philosophy.
[3] Luca Antiga, et al. Automatic differentiation in PyTorch, 2017.
[4] Yann LeCun, et al. Improving the convergence of back-propagation learning with second-order methods, 1989.
[5] Michael I. Jordan, et al. Gradient Descent Converges to Minimizers, 2016, ArXiv.
[6] Marco Loog, et al. Semi-Generative Modelling: Covariate-Shift Adaptation with Cause and Effect Features, 2018, AISTATS.
[7] Michael W. Mahoney, et al. PyHessian: Neural Networks Through the Lens of the Hessian, 2019, 2020 IEEE International Conference on Big Data (Big Data).
[8] Guodong Zhang, et al. Which Algorithmic Choices Matter at Which Batch Sizes? Insights From a Noisy Quadratic Model, 2019, NeurIPS.
[9] Bernhard Schölkopf, et al. Domain Generalization via Invariant Feature Representation, 2013, ICML.
[10] Suchi Saria, et al. Preventing Failures Due to Dataset Shift: Learning Predictive Models That Transport, 2018, AISTATS.
[11] Neil D. Lawrence, et al. Dataset Shift in Machine Learning, 2009.
[12] Aurélien Lucchi, et al. Ellipsoidal Trust Region Methods and the Marginal Value of Hessian Information for Neural Network Training, 2019, ArXiv.
[13] Klaus-Robert Müller, et al. Covariate Shift Adaptation by Importance Weighted Cross Validation, 2007, J. Mach. Learn. Res.
[14] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[15] H. Simon, et al. Causal Ordering and Identifiability, 1977.
[16] Chi-Kwong Li. Geometric Means, 2003.
[17] Bernhard Schölkopf, et al. Regression by dependence minimization and its application to causal inference in additive noise models, 2009, ICML '09.
[18] J. Pearl. Causality: Models, Reasoning and Inference, 2000.
[19] Michael I. Jordan, et al. Accelerated Gradient Descent Escapes Saddle Points Faster than Gradient Descent, 2017, COLT.
[20] George Cybenko, et al. Approximation by superpositions of a sigmoidal function, 1989, Math. Control. Signals Syst.
[21] David D. Cox, et al. Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures, 2013, ICML.
[22] Bernhard Schölkopf, et al. On causal and anticausal learning, 2012, ICML.
[23] Dan Alistarh, et al. WoodFisher: Efficient second-order approximations for model compression, 2020, ArXiv.
[24] H. Kushner, et al. Stochastic Approximation and Recursive Algorithms and Applications, 2003.
[25] Christina Heinze-Deml, et al. Conditional variance penalties and domain shift robustness, 2017, Machine Learning.
[26] Bernhard Schölkopf, et al. Invariant Models for Causal Transfer Learning, 2015, J. Mach. Learn. Res.
[27] Christina Heinze-Deml, et al. Invariant Causal Prediction for Nonlinear Models, 2017, Journal of Causal Inference.
[28] Aaron C. Courville, et al. Out-of-Distribution Generalization via Risk Extrapolation (REx), 2020, ICML.
[29] Jonas Peters, et al. Causal inference by using invariant prediction: identification and confidence intervals, 2015, arXiv:1501.01332.
[30] Massoud Pedram, et al. Gradient Agreement as an Optimization Objective for Meta-Learning, 2018, ArXiv.
[31] Howard Barnum, et al. The Beginning of Infinity: Explanations That Transform the World, 2012.
[32] Bernhard Schölkopf, et al. Learning Independent Causal Mechanisms, 2017, ICML.
[33] Yann Dauphin, et al. Empirical Analysis of the Hessian of Over-Parametrized Neural Networks, 2017, ICLR.
[34] Razvan Pascanu, et al. Adapting Auxiliary Losses Using Gradient Similarity, 2018, ArXiv.
[35] Taehoon Kim, et al. Quantifying Generalization in Reinforcement Learning, 2018, ICML.
[36] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[37] Stephen J. Wright, et al. Numerical Optimization, 2018, Fundamental Statistical Inference.
[38] Alexandre M. Bayen, et al. Accelerated Mirror Descent in Continuous and Discrete Time, 2015, NIPS.
[39] Srini Narayanan, et al. Stiffness: A New Perspective on Generalization in Neural Networks, 2019, ArXiv.
[40] Bernhard Schölkopf, et al. Elements of Causal Inference: Foundations and Learning Algorithms, 2017.
[41] Swami Sankaranarayanan, et al. MetaReg: Towards Domain Generalization using Meta-Regularization, 2018, NeurIPS.
[42] T. Haavelmo. The Statistical Implications of a System of Simultaneous Equations, 1943.
[43] Vladimir N. Vapnik, et al. The Nature of Statistical Learning Theory, 2000, Statistics for Engineering and Information Science.
[44] Shiqian Ma, et al. Stochastic Quasi-Newton Methods for Nonconvex Stochastic Optimization, 2014, SIAM J. Optim.
[45] J. Schulman, et al. Leveraging Procedural Generation to Benchmark Reinforcement Learning, 2019, ICML.
[46] Yoshua Bengio, et al. Three Factors Influencing Minima in SGD, 2017, ArXiv.
[47] David Lopez-Paz, et al. Invariant Risk Minimization, 2019, ArXiv.
[48] Motoaki Kawanabe, et al. Machine Learning in Non-Stationary Environments - Introduction to Covariate Shift Adaptation, 2012, Adaptive computation and machine learning.
[49] Xiaoxia Wu, et al. AdaGrad stepsizes: Sharp convergence over nonconvex landscapes, from any initialization, 2018, ICML.
[50] Thomas Hofmann, et al. Escaping Saddles with Stochastic Gradients, 2018, ICML.
[51] Leonid Hurwicz, et al. On the Structural Form of Interdependent Systems, 1966.
[52] Carl E. Rasmussen, et al. Gaussian processes for machine learning, 2005, Adaptive computation and machine learning.
[53] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[54] François Laviolette, et al. Domain-Adversarial Training of Neural Networks, 2015, J. Mach. Learn. Res.
[55] Jorge Nocedal, et al. Optimization Methods for Large-Scale Machine Learning, 2016, SIAM Rev.
[56] Saeed Ghadimi, et al. Stochastic First- and Zeroth-Order Methods for Nonconvex Stochastic Programming, 2013, SIAM J. Optim.
[57] Bernhard Schölkopf, et al. Causality for Machine Learning, 2019, ArXiv.
[58] Amit Dhurandhar, et al. Invariant Risk Minimization Games, 2020, ICML.
[59] David M. Blei, et al. Stochastic Gradient Descent as Approximate Bayesian Inference, 2017, J. Mach. Learn. Res.
[60] Greg Turk, et al. Learning Novel Policies For Tasks, 2019, ICML.
[61] Andrew K. Lampinen, et al. What shapes feature representations? Exploring datasets, architectures, and training, 2020, NeurIPS.
[62] Pascal Bianchi, et al. Convergence Analysis of a Momentum Algorithm with Adaptive Step Size for Non Convex Optimization, 2019, ArXiv.