Discovering Problem Solutions with Low Kolmogorov Complexity and High Generalization Capability
[1] Ray J. Solomonoff, et al. A Formal Theory of Inductive Inference. Part I, 1964, Inf. Control.
[2] Gregory J. Chaitin, et al. On the Length of Programs for Computing Finite Binary Sequences, 1966, JACM.
[3] A. Kolmogorov. Three Approaches to the Quantitative Definition of Information, 1968.
[4] C. S. Wallace, et al. An Information Measure for Classification, 1968, Comput. J.
[5] Gregory J. Chaitin, et al. On the Length of Programs for Computing Finite Binary Sequences: Statistical Considerations, 1969, JACM.
[6] L. Levin, et al. The Complexity of Finite Objects and the Development of the Concepts of Information and Randomness by Means of the Theory of Algorithms, 1970.
[7] Ingo Rechenberg, et al. Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution, 1973.
[8] P. Werbos, et al. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences, 1974.
[9] G. Chaitin. A Theory of Program Size Formally Identical to Information Theory, 1975, JACM.
[10] John H. Holland, et al. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence, 1992.
[11] J. Rissanen, et al. Modeling by Shortest Data Description, 1978, Autom.
[12] Juris Hartmanis, et al. Generalized Kolmogorov Complexity and the Structure of Feasible Computations, 1983, 24th Annual Symposium on Foundations of Computer Science (SFCS 1983).
[13] Richard S. Sutton, et al. Neuronlike Adaptive Elements That Can Solve Difficult Learning Control Problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[14] J. Rissanen. A Universal Prior for Integers and Estimation by Minimum Description Length, 1983.
[15] Leslie G. Valiant, et al. A Theory of the Learnable, 1984, CACM.
[16] Leonid A. Levin, et al. Randomness Conservation Inequalities; Information and Independence in Mathematical Theories, 1984, Inf. Control.
[17] Paul E. Utgoff, et al. Shift of Bias for Inductive Concept Learning, 1984.
[18] Yann LeCun, et al. Une procédure d'apprentissage pour réseau à seuil asymétrique (A Learning Scheme for Asymmetric Threshold Networks), 1985.
[19] Ray J. Solomonoff, et al. The Application of Algorithmic Probability to Problems in Artificial Intelligence, 1985, UAI.
[20] Geoffrey E. Hinton, et al. Learning Internal Representations by Error Propagation, 1986.
[21] J. Rissanen. Stochastic Complexity and Modeling, 1986.
[22] Gregory J. Chaitin, et al. Algorithmic Information Theory, 1987, Cambridge Tracts in Theoretical Computer Science.
[23] David Haussler, et al. Occam's Razor, 1987, Inf. Process. Lett.
[24] Ralph Linsker, et al. Self-Organization in a Perceptual Network, 1988, Computer.
[25] David Haussler, et al. Quantifying Inductive Bias: AI Learning Algorithms and Valiant's Learning Framework, 1988, Artif. Intell.
[26] Michael C. Mozer, et al. Skeletonization: A Technique for Trimming the Fat from a Network via Relevance Assessment, 1988, NIPS.
[27] Ming Li, et al. The Minimum Description Length Principle and Its Application to Online Learning of Handprinted Characters, 1989, IJCAI.
[28] Ming Li, et al. A Theory of Learning Simple Concepts under Simple Distributions and Average Case Complexity for the Universal Distribution, 1989, 30th Annual Symposium on Foundations of Computer Science.
[29] David Haussler, et al. What Size Net Gives Valid Generalization?, 1989, Neural Computation.
[30] C. Watkins. Learning from Delayed Rewards, 1989.
[31] Jürgen Schmidhuber, et al. A Local Learning Algorithm for Dynamic Feedforward and Recurrent Networks, 1990, Forschungsberichte, TU Munich.
[32] Ronald L. Rivest, et al. Inferring Decision Trees Using the Minimum Description Length Principle, 1989, Inf. Comput.
[33] Thomas G. Dietterich. Limitations on Inductive Learning, 1989, ML.
[34] P. Gács, et al. Kolmogorov's Contributions to Information Theory and Algorithmic Complexity, 1989.
[35] Ray J. Solomonoff, et al. A System for Incremental Learning Based on Algorithmic Probability, 1989.
[36] Edwin P. D. Pednault, et al. Some Experiments in Applying Inductive Inference Principles to Surface Reconstruction, 1989, IJCAI.
[37] Yann LeCun, et al. Second Order Properties of Error Surfaces: Learning Time and Generalization, 1990, NIPS.
[38] David E. Rumelhart, et al. Predicting the Future: A Connectionist Approach, 1990, Int. J. Neural Syst.
[39] Barak A. Pearlmutter, et al. Chaitin-Kolmogorov Complexity and Generalization in Neural Networks, 1990, NIPS.
[40] Isabelle Guyon, et al. Structural Risk Minimization for Character Recognition, 1991, NIPS.
[41] Suzanna Becker, et al. Unsupervised Learning Procedures for Neural Networks, 1991, Int. J. Neural Syst.
[42] John R. Koza, et al. Genetic Evolution and Co-Evolution of Computer Programs, 1991.
[43] Anders Krogh, et al. A Simple Weight Decay Can Improve Generalization, 1991, NIPS.
[44] John E. Moody, et al. The Effective Number of Parameters: An Analysis of Generalization and Regularization in Nonlinear Learning Systems, 1991, NIPS.
[45] Andrew R. Barron, et al. Complexity Regularization with Application to Artificial Neural Networks, 1991.
[46] Vladimir Vapnik, et al. Principles of Risk Minimization for Learning Theory, 1991, NIPS.
[47] D. MacKay, et al. A Practical Bayesian Framework for Backprop Networks, 1991.
[48] Osamu Watanabe, et al. Kolmogorov Complexity and Computational Complexity, 2012, EATCS Monographs on Theoretical Computer Science.
[49] Zhaoping Li, et al. Understanding Retinal Color Coding from First Principles, 1992, Neural Computation.
[50] Jürgen Schmidhuber, et al. Learning Complex, Extended Sequences Using the Principle of History Compression, 1992, Neural Computation.
[51] E. Allender. Applications of Time-Bounded Kolmogorov Complexity in Complexity Theory, 1992.
[52] Babak Hassibi, et al. Second Order Derivatives for Network Pruning: Optimal Brain Surgeon, 1992, NIPS.
[53] Jürgen Schmidhuber, et al. Learning Factorial Codes by Predictability Minimization, 1992, Neural Computation.
[54] Geoffrey E. Hinton, et al. Simplifying Neural Networks by Soft Weight-Sharing, 1992, Neural Computation.
[55] Rolf Herken, et al. The Universal Turing Machine: A Half-Century Survey, 1992.
[56] Geoffrey E. Hinton, et al. Keeping Neural Networks Simple, 1993.
[57] J. Schmidhuber. Reducing the Ratio Between Learning Complexity and Number of Time Varying Variables in Fully Recurrent Nets, 1993.
[58] Jürgen Schmidhuber, et al. A 'Self-Referential' Weight Matrix, 1993.
[59] Gustavo Deco, et al. Elimination of Overtraining by a Mutual Information Network, 1993.
[60] Shun-ichi Amari, et al. Statistical Theory of Learning Curves under Entropic Loss Criterion, 1993, Neural Computation.
[61] Ming Li, et al. An Introduction to Kolmogorov Complexity and Its Applications, 2019, Texts in Computer Science.
[62] A. Milosavljevic, et al. Discovery by Minimal Length Encoding: A Case Study in Molecular Evolution, 2004, Machine Learning.
[63] D. Wolpert. On Overfitting Avoidance as Bias, 1993.
[64] Jürgen Schmidhuber, et al. Simplifying Neural Nets by Discovering Flat Minima, 1994, NIPS.
[65] P. Dayan, et al. TD(λ) Converges with Probability 1, 2004, Machine Learning.
[66] Wolfgang Maass, et al. Perspectives of Current Research about the Complexity of Learning on Neural Nets, 1994.
[67] Wolfgang J. Paul, et al. Autonomous Theory Building Systems, 1995, Ann. Oper. Res.