Flat Minima
[1] C. E. Shannon. A mathematical theory of communication, 1948, Bell Syst. Tech. J.
[2] C. S. Wallace, et al. An Information Measure for Classification, 1968, Comput. J.
[3] H. Akaike. Statistical predictor identification, 1970.
[4] M. Stone. Cross‐Validatory Choice and Assessment of Statistical Predictions, 1976.
[5] Peter Craven, et al. Smoothing noisy data with spline functions, 1978.
[6] J. Rissanen. Modeling by Shortest Data Description, 1978, Autom.
[7] Michael C. Mozer, et al. Skeletonization: A Technique for Trimming the Fat from a Network via Relevance Assessment, 1988, NIPS.
[8] John E. Moody, et al. Fast Learning in Multi-Resolution Hierarchies, 1988, NIPS.
[9] Lorien Y. Pratt, et al. Comparing Biases for Minimal Network Construction with Back-Propagation, 1988, NIPS.
[10] Esther Levin, et al. A statistical approach to learning and generalization in layered neural networks, 1989, Proc. IEEE.
[11] Timur Ash. Dynamic node creation in backpropagation networks, 1989.
[12] Michael J. Carter, et al. Operational Fault Tolerance of CMAC Networks, 1989, NIPS.
[13] B. Yandell. Spline smoothing and nonparametric regression, 1989.
[14] M. C. Jones, et al. Spline Smoothing and Nonparametric Regression, 1989.
[15] Halbert White, et al. Learning in Artificial Neural Networks: A Statistical Perspective, 1989, Neural Computation.
[16] Esther Levin, et al. A statistical approach to learning and generalization in layered neural networks, 1989, COLT '89.
[17] Yann LeCun, et al. Optimal Brain Damage, 1989, NIPS.
[18] Barak A. Pearlmutter, et al. Chaitin-Kolmogorov Complexity and Generalization in Neural Networks, 1990, NIPS.
[19] David E. Rumelhart, et al. Generalization by Weight-Elimination with Application to Forecasting, 1990, NIPS.
[20] Isabelle Guyon, et al. Structural Risk Minimization for Character Recognition, 1991, NIPS.
[21] Anders Krogh, et al. A Simple Weight Decay Can Improve Generalization, 1991, NIPS.
[22] John E. Moody, et al. The Effective Number of Parameters: An Analysis of Generalization and Regularization in Nonlinear Learning Systems, 1991, NIPS.
[23] Vladimir Vapnik, et al. Principles of Risk Minimization for Learning Theory, 1991, NIPS.
[24] D. MacKay. A Practical Bayesian Framework for Backprop Networks, 1991.
[25] Wray L. Buntine, et al. Bayesian Back-Propagation, 1991, Complex Syst.
[26] David Haussler, et al. Calculation of the learning curve of Bayes optimal classification algorithm for learning a perceptron with noise, 1991, COLT '91.
[27] Kiyotoshi Matsuoka, et al. Noise injection into inputs in back-propagation learning, 1992, IEEE Trans. Syst. Man Cybern.
[28] David J. C. MacKay, et al. Bayesian Interpolation, 1992, Neural Computation.
[29] Alan F. Murray, et al. Synaptic Weight Noise During MLP Learning Enhances Fault-Tolerance, Generalization and Learning Trajectory, 1992, NIPS.
[30] Babak Hassibi, et al. Second Order Derivatives for Network Pruning: Optimal Brain Surgeon, 1992, NIPS.
[31] David J. C. MacKay, et al. A Practical Bayesian Framework for Backpropagation Networks, 1992, Neural Computation.
[32] Geoffrey E. Hinton, et al. Simplifying Neural Networks by Soft Weight-Sharing, 1992, Neural Computation.
[33] Chalapathy Neti, et al. Maximally fault tolerant neural networks, 1992, IEEE Trans. Neural Networks.
[34] John E. Moody, et al. Fast Pruning Using Principal Components, 1993, NIPS.
[35] Geoffrey E. Hinton, et al. Keeping Neural Networks Simple, 1993.
[36] M. F. Møller. Exact Calculation of the Product of the Hessian Matrix of Feed-Forward Network Error Functions and a Vector in O(N) Time, 1993.
[37] Christopher M. Bishop, et al. Curvature-driven smoothing: a learning algorithm for feedforward networks, 1993, IEEE Trans. Neural Networks.
[38] Shun-ichi Amari, et al. Statistical Theory of Learning Curves under Entropic Loss Criterion, 1993, Neural Computation.
[39] David H. Wolpert, et al. Bayesian Backpropagation Over I-O Functions Rather Than Weights, 1993, NIPS.
[40] F. Vallet, et al. Robustness in Multilayer Perceptrons, 1993, Neural Computation.
[41] Sean B. Holden, et al. On the theory of generalization and self-structuring in linearly weighted connectionist networks, 1993.
[42] Jürgen Schmidhuber. Discovering Problem Solutions with Low Kolmogorov Complexity and High Generalization Capability, 1994.
[43] Achilleas Zapranis, et al. Stock performance modeling using neural networks: A comparative study with regression models, 1994, Neural Networks.
[44] Jürgen Schmidhuber, et al. On learning how to learn learning strategies, 1994.
[45] Barak A. Pearlmutter. Fast Exact Multiplication by the Hessian, 1994, Neural Computation.
[46] Ali A. Minai, et al. Perturbation response in feedforward networks, 1994, Neural Networks.
[47] John Moody, et al. Architecture Selection Strategies for Neural Networks: Application to Corporate Bond Rating Prediction, 1995, NIPS.
[48] Peter M. Williams. Bayesian Regularization and Pruning Using a Laplace Prior, 1995, Neural Computation.
[49] J. Stephen Judd, et al. Optimal stopping and effective machine complexity in learning, 1993, Proceedings of 1995 IEEE International Symposium on Information Theory.
[50] David H. Wolpert, et al. The Relationship Between PAC, the Statistical Physics Framework, the Bayesian Framework, and the VC Framework, 1995.