A Traveling Salesman Learns Bayesian Networks

Structure learning of Bayesian networks is an important problem that arises in numerous machine learning applications. In this work, we present a novel approach that learns the structure of a Bayesian network from the solution of an appropriately constructed traveling salesman problem. In our approach, one first computes an optimal ordering (a partially ordered set) of the random variables using methods for the traveling salesman problem. This ordering significantly reduces the search space for the subsequent greedy optimization, which computes the final structure of the Bayesian network. We demonstrate the approach on real-world census and weather datasets. In both cases, the learned networks accurately capture the dependencies between random variables, and we check the accuracy of the predictions against independent studies in both application domains.
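To illustrate why a fixed ordering shrinks the search space, the following minimal sketch restricts each variable's candidate parents to its predecessors in the ordering and adds parents greedily. It is not the method of this work: the ordering is assumed to be given (the traveling salesman construction is not reproduced here), the BIC score is an assumed choice of scoring function, and the names `bic_local`, `greedy_parents`, and `max_parents` are hypothetical.

```python
import math
from collections import Counter


def bic_local(data, child, parents):
    """BIC score of one discrete variable given a candidate parent set (assumed score)."""
    n = len(data)
    # Count each (parent configuration, child value) pair observed in the data.
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    parent_counts = Counter()
    for (cfg, _), c in joint.items():
        parent_counts[cfg] += c
    # Maximum-likelihood log-likelihood of the child given its parents.
    loglik = sum(c * math.log(c / parent_counts[cfg]) for (cfg, _), c in joint.items())
    # Penalty: number of free parameters, using cardinalities observed in the data.
    r = len({row[child] for row in data})
    q = 1
    for p in parents:
        q *= len({row[p] for row in data})
    return loglik - 0.5 * math.log(n) * q * (r - 1)


def greedy_parents(data, order, max_parents=2):
    """Greedily pick parents for each variable from its predecessors in `order`."""
    structure = {}
    for i, child in enumerate(order):
        parents = []
        best = bic_local(data, child, parents)
        # The ordering restricts candidates to earlier variables, so the
        # resulting graph is acyclic by construction.
        candidates = list(order[:i])
        while candidates and len(parents) < max_parents:
            scored = [(bic_local(data, child, parents + [c]), c) for c in candidates]
            top_score, top = max(scored)
            if top_score <= best:
                break  # no candidate improves the score
            parents.append(top)
            candidates.remove(top)
            best = top_score
        structure[child] = sorted(parents)
    return structure


# Toy data: variable 1 copies variable 0; variable 2 is independent of both.
data = [(a, a, b) for a in (0, 1) for b in (0, 1)] * 25
print(greedy_parents(data, [0, 1, 2]))
```

On this toy data the sketch links variable 1 to variable 0 and leaves variable 2 without parents. With an ordering in hand, each variable needs only a search over subsets of its predecessors, rather than a search over all directed acyclic graphs.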
