Chenhan D. Yu | Hans-Joachim Bungartz | Severin Reiz | George Biros | Chao Chen
[1] William B. March, et al. ASKIT: An Efficient, Parallel Library for High-Dimensional Kernel Summations, 2016, SIAM J. Sci. Comput.
[2] David A. Cohn, et al. Neural Network Exploration Using Optimal Experiment Design, 1993, NIPS.
[3] Ilya Sutskever, et al. Learning Recurrent Neural Networks with Hessian-Free Optimization, 2011, ICML.
[4] James Martens, et al. Deep learning via Hessian-free optimization, 2010, ICML.
[5] V. Rokhlin, et al. A fast direct solver for boundary integral equations in two dimensions, 2003.
[6] Michael I. Jordan, et al. Stochastic Cubic Regularization for Fast Nonconvex Optimization, 2017, NeurIPS.
[7] D. Keyes, et al. Jacobian-free Newton-Krylov methods: a survey of approaches and applications, 2004.
[8] Jorge Nocedal, et al. Optimization Methods for Large-Scale Machine Learning, 2016, SIAM Rev.
[9] Nicolas Le Roux, et al. Tracking the gradients using the Hessian: A new look at variance reducing stochastic methods, 2017, AISTATS.
[10] Chenhan D. Yu, et al. Geometry-Oblivious FMM for Compressing Dense SPD Matrices, 2017, SC17: International Conference for High Performance Computing, Networking, Storage and Analysis.
[11] James Martens. Second-order Optimization for Neural Networks, 2016.
[12] Richard Socher, et al. Block-diagonal Hessian-free Optimization for Training Neural Networks, 2017, arXiv.
[13] Guigang Zhang, et al. Deep Learning, 2016, Int. J. Semantic Comput.
[14] Chenhan D. Yu, et al. Distributed-Memory Hierarchical Compression of Dense SPD Matrices, 2018, SC18: International Conference for High Performance Computing, Networking, Storage and Analysis.
[15] Michael Rabadi, et al. Kernel Methods for Machine Learning, 2015.
[16] Satoshi Matsuoka, et al. Large-Scale Distributed Second-Order Optimization Using Kronecker-Factored Approximate Curvature for Deep Convolutional Neural Networks, 2019, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Eric Darve, et al. Parallelization of the inverse fast multipole method with an application to boundary element method, 2020, Comput. Phys. Commun.
[18] Satoshi Matsuoka, et al. Second-order Optimization Method for Large Mini-batch: Training ResNet-50 on ImageNet in 35 Epochs, 2018, arXiv.
[19] Lexing Ying, et al. Hierarchical Interpolative Factorization for Elliptic Operators: Integral Equations, 2013, arXiv:1307.2666.
[20] Guillaume Hennequin, et al. Fast Sampling-Based Inference in Balanced Neuronal Networks, 2014, NIPS.
[21] James Martens, et al. New Insights and Perspectives on the Natural Gradient Method, 2014, J. Mach. Learn. Res.
[22] Jorge Nocedal, et al. On the Use of Stochastic Hessian Information in Optimization Methods for Machine Learning, 2011, SIAM J. Optim.
[23] Babak Hassibi, et al. Second Order Derivatives for Network Pruning: Optimal Brain Surgeon, 1992, NIPS.
[24] Klaus-Robert Müller, et al. Efficient BackProp, 2012, Neural Networks: Tricks of the Trade.
[25] Eric Darve, et al. A fast block low-rank dense solver with applications to finite-element matrices, 2014, J. Comput. Phys.
[26] Nicolas Le Roux, et al. Topmoumoute Online Natural Gradient Algorithm, 2007, NIPS.
[27] Nathan Halko, et al. Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions, 2009, SIAM Rev.
[28] Thomas O'Leary-Roseberry, et al. Inexact Newton Methods for Stochastic Non-Convex Optimization with Applications to Neural Network Training, 2019, arXiv:1905.06738.
[29] James Demmel, et al. ImageNet Training in Minutes, 2017, ICPP.
[30] W. Hackbusch, et al. Hierarchical Matrices: Algorithms and Analysis, 2015.
[31] Surya Ganguli, et al. Identifying and attacking the saddle point problem in high-dimensional non-convex optimization, 2014, NIPS.
[32] Peng Xu, et al. Sub-sampled Newton Methods with Non-uniform Sampling, 2016, NIPS.
[33] Léon Bottou, et al. Diagonal Rescaling For Neural Networks, 2017, arXiv.
[34] Samuel Williams, et al. An Efficient Multicore Implementation of a Novel HSS-Structured Multifrontal Solver Using Randomized Sampling, 2015, SIAM J. Sci. Comput.
[35] Piet Hut, et al. A hierarchical O(N log N) force-calculation algorithm, 1986, Nature.
[36] Philip E. Gill, et al. Practical Optimization, 1981.
[37] Alexander G. Gray, et al. Fast High-dimensional Kernel Summations Using the Monte Carlo Multipole Method, 2008, NIPS.
[38] Razvan Pascanu, et al. Sharp Minima Can Generalize For Deep Nets, 2017, ICML.
[39] Jianlin Xia, et al. Fast algorithms for hierarchically semiseparable matrices, 2010, Numer. Linear Algebra Appl.
[40] Eric Darve, et al. A distributed-memory hierarchical solver for general sparse linear systems, 2017, Parallel Comput.
[41] Barbara Kaltenbacher, et al. Iterative Solution Methods, 2015, Handbook of Mathematical Methods in Imaging.
[42] Kurt Keutzer, et al. Hessian-based Analysis of Large Batch Training and Robustness to Adversaries, 2018, NeurIPS.
[43] Haishan Ye, et al. Approximate Newton Methods and Their Local Convergence, 2017, ICML.
[44] Roger B. Grosse, et al. Optimizing Neural Networks with Kronecker-factored Approximate Curvature, 2015, ICML.