[1] Quynh Nguyen, et al. On Connected Sublevel Sets in Deep Learning, 2019, ICML.
[2] Chee Kheong Siew, et al. Extreme learning machine: Theory and applications, 2006, Neurocomputing.
[3] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[4] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, J. Mach. Learn. Res.
[5] Yann LeCun, et al. Optimal Brain Damage, 1989, NIPS.
[6] Harald Haas, et al. Harnessing Nonlinearity: Predicting Chaotic Systems and Saving Energy in Wireless Communication, 2004, Science.
[7] J. Ross Quinlan, et al. Combining Instance-Based and Model-Based Learning, 1993, ICML.
[8] Andrew Gordon Wilson, et al. Averaging Weights Leads to Wider Optima and Better Generalization, 2018, UAI.
[9] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[10] Stephen P. Boyd, et al. Convex Optimization, 2004, Algorithms and Theory of Computation Handbook.
[11] Chris Eliasmith, et al. Neural Engineering: Computation, Representation, and Dynamics in Neurobiological Systems, 2004, IEEE Transactions on Neural Networks.
[12] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[13] Zhiqiang Shen, et al. Learning Efficient Convolutional Networks through Network Slimming, 2017, ICCV.
[14] Andrew Gordon Wilson, et al. Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs, 2018, NeurIPS.
[15] E. Allgower, et al. Numerical path following, 1997.
[16] Hans J. Skaug, et al. Efficient Computation of Hessian Matrices in TensorFlow, 2019, ArXiv.
[17] H. Jónsson, et al. Nudged elastic band method for finding minimum energy paths of transitions, 1998.
[18] Joan Bruna, et al. Spurious Valleys in One-hidden-layer Neural Network Optimization Landscapes, 2019, J. Mach. Learn. Res.
[19] Fred A. Hamprecht, et al. Essentially No Barriers in Neural Network Energy Landscape, 2018, ICML.
[20] J. B. Rosen. The Gradient Projection Method for Nonlinear Programming. Part I. Linear Constraints, 1960.
[21] R. Fisher. The Use of Multiple Measurements in Taxonomic Problems, 1936.
[22] Yann Dauphin, et al. Empirical Analysis of the Hessian of Over-Parametrized Neural Networks, 2017, ICLR.
[23] Yoshua Bengio, et al. Understanding the difficulty of training deep feedforward neural networks, 2010, AISTATS.
[24] Pierre Baldi, et al. The capacity of feedforward neural networks, 2019, Neural Networks.
[25] Daniel Soudry, et al. No bad local minima: Data independent training error guarantees for multilayer neural networks, 2016, ArXiv.
[26] Liwei Wang, et al. Gradient Descent Finds Global Minima of Deep Neural Networks, 2018, ICML.
[27] Ashish Agarwal, et al. TensorFlow Eager: A Multi-Stage, Python-Embedded DSL for Machine Learning, 2019, SysML.
[28] Nikos Komodakis, et al. Wide Residual Networks, 2016, BMVC.
[29] Surya Ganguli, et al. Exponential expressivity in deep neural networks through transient chaos, 2016, NIPS.
[30] Karl Pearson. LIII. On Lines and Planes of Closest Fit to Systems of Points in Space, 1901.