Paper information - A Gentle Introduction to The Universal Algorithmic Agent AIXI

A Gentle Introduction to The Universal Algorithmic Agent AIXI

Decision theory formally solves the problem of rational agents in uncertain worlds if the true environmental prior probability distribution is known. Solomonoff's theory of universal induction formally solves the problem of sequence prediction for unknown prior distribution. We combine both ideas and get a parameter-free theory of universal Artificial Intelligence. We give strong arguments that the resulting AIXI model is the most intelligent unbiased agent possible. We outline for a number of problem classes, including sequence prediction, strategic games, function minimization, reinforcement and supervised learning, how the AIXI model can formally solve them. The major drawback of the AIXI model is that it is uncomputable. To overcome this problem, we construct a modified algorithm AIXI$tl$, which is still effectively more intelligent than any other time $t$ and space $l$ bounded agent. The computation time of AIXI$tl$ is of the order $t \cdot 2^l$. Other discussed topics are formal definitions of intelligence order relations, the horizon problem and relations of the AIXI theory to other AI approaches.
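As a rough illustration of the sequence-prediction ingredient mentioned in the abstract, the following Python sketch mixes a handful of hand-written predictors, each weighted by $2^{-\text{length}}$ and by how well it has explained the bits seen so far. This is only a finite, computable stand-in and not Solomonoff induction or AIXI itself: the hypothesis class `HYPOTHESES`, the description lengths, and the function name `mixture_predict` are all assumptions made up for this example.

```python
from fractions import Fraction

# Hypothetical toy hypothesis class (an assumption for illustration only).
# Each entry is (description_length_in_bits, predictor), where a predictor maps
# the history (a tuple of bits) to the probability that the next bit is 1.
HYPOTHESES = [
    (1, lambda h: Fraction(1, 2)),    # fair coin
    (2, lambda h: Fraction(9, 10)),   # mostly ones
    (2, lambda h: Fraction(1, 10)),   # mostly zeros
    # noisy alternation 1,0,1,0,...: expects a 1 at even positions
    (3, lambda h: Fraction(9, 10) if len(h) % 2 == 0 else Fraction(1, 10)),
]


def mixture_predict(history):
    """Return the mixture probability that the next bit is 1."""
    weights = []
    for length, predict in HYPOTHESES:
        # Prior weight 2^-length: shorter descriptions count for more.
        w = Fraction(1, 2 ** length)
        # Multiply in the likelihood this hypothesis assigned to each observed bit.
        for i, bit in enumerate(history):
            p_one = predict(history[:i])
            w *= p_one if bit == 1 else 1 - p_one
        weights.append(w)
    # Normalize, since these toy prior weights need not sum to 1.
    total = sum(weights)
    mixed = sum(w * predict(history) for w, (_, predict) in zip(weights, HYPOTHESES))
    return mixed / total


if __name__ == "__main__":
    print(mixture_predict(()))                  # prediction from the prior alone
    print(mixture_predict((1, 0, 1, 0, 1, 0)))  # prediction after an alternating pattern
```

After a few alternating bits the weight shifts toward the alternating hypothesis even though it has the longest description, which is the qualitative behavior a complexity-weighted mixture is meant to show. Exact `Fraction` arithmetic is used here only to keep the toy numbers free of rounding.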

Marcus Hutter
