Intelligent machines in the 21st century: foundations of inference and inquiry

The last century saw the application of Boolean algebra to the construction of computing machines, which work by applying logical transformations to information contained in their memory. The development of information theory and the generalization of Boolean algebra to Bayesian inference have enabled these computing machines, in the last quarter of the twentieth century, to be endowed with the ability to learn by making inferences from data. This revolution is just beginning as new computational techniques continue to make difficult problems more accessible. Recent advances in understanding the foundations of probability theory have revealed implications for areas other than logic. Of relevance to intelligent machines, we identify the algebra of questions as the free distributive algebra, which allows us to work with questions in a way analogous to the way Boolean algebra enables us to work with logical statements. In this paper we begin with a history of inferential reasoning, highlighting key concepts that have led to the automation of inference in modern machine-learning systems. We then discuss the foundations of inference in more detail from a modern viewpoint that relies on the mathematics of partially ordered sets and the scaffolding of lattice theory. This viewpoint allows us to develop the logic of inquiry and to introduce a measure describing the relevance of a proposed question to an unresolved issue. We demonstrate the automation of inference and discuss how this new logic of inquiry will enable intelligent machines to ask questions. Automating both inference and inquiry promises to allow robots to perform science in the far reaches of our solar system and in other star systems by enabling them not only to make inferences from data, but also to decide which question to ask, which experiment to perform, or which measurement to take given what they have learned and what they are designed to understand.
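The two themes above, automated inference (learning from data via Bayes' theorem) and automated inquiry (scoring a candidate question by its relevance to an unresolved issue), can be illustrated with a minimal sketch. The toy problem below, inferring the bias of a coin on a discrete grid and scoring the question "what will the next flip show?" by its expected entropy reduction (a Lindley-style expected information gain), is our own illustrative assumption, not a construction from the paper itself:

```python
import math

# Discretized prior over the unknown bias p of a coin (the hypothesis space).
hypotheses = [i / 10 for i in range(1, 10)]           # p in {0.1, ..., 0.9}
prior = {p: 1 / len(hypotheses) for p in hypotheses}  # uniform prior

def update(belief, outcome):
    """Inference: Bayes' rule, posterior ∝ likelihood × prior, for one flip."""
    post = {p: (p if outcome == 'H' else 1 - p) * w for p, w in belief.items()}
    z = sum(post.values())
    return {p: w / z for p, w in post.items()}

def entropy(belief):
    """Shannon entropy (in bits) of the current state of knowledge."""
    return -sum(w * math.log2(w) for w in belief.values() if w > 0)

def relevance(belief):
    """Inquiry: expected entropy reduction from asking 'what is the next flip?'
    Each possible answer is weighted by its predictive probability."""
    gain = 0.0
    for outcome in ('H', 'T'):
        prob = sum((p if outcome == 'H' else 1 - p) * w
                   for p, w in belief.items())
        gain += prob * (entropy(belief) - entropy(update(belief, outcome)))
    return gain

# Learn from observed data: seven heads and three tails.
belief = prior
for flip in 'HHHTHHTHHT':
    belief = update(belief, flip)

best = max(belief, key=belief.get)
print(f"most probable bias: {best}")
print(f"relevance of one more flip: {relevance(belief):.4f} bits")
```

An automated experimenter would compute this relevance for each question it could pose and choose the one promising the greatest expected reduction in uncertainty about the issue it is designed to resolve.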
