Probabilistic Graphical Markov Model Learning: An Adaptive Strategy

In this paper, an adaptive strategy for learning graphical Markov models is proposed and used to construct two algorithms. A statistical model complexity index (SMCI) is defined and used to classify models into three complexity classes: sparse, medium, and dense. The first step of both algorithms fits a tree using the Chow and Liu algorithm. The second step computes SMCI and uses it to evaluate an index (EMUBI) that predicts the edges to add to the model. The first algorithm adds the predicted edges and stops; the second adds an edge only when doing so improves the fit. The two algorithms are compared in an experimental design using models from the different complexity classes, with test samples generated by the Markov Structure Random Sampler (MSRS). For the sparse class, both algorithms always recover the correct model. For the other two classes, the efficiency of the algorithms is sensitive to model complexity.
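Since the abstract does not give the definitions of SMCI or EMUBI, the following minimal Python sketch illustrates only the overall two-step shape of the strategy: a Chow and Liu tree fit (step 1) followed by a greedy edge-addition pass that accepts an edge only when a penalized fit criterion improves, used here as a stand-in for the paper's EMUBI-based rule. All names (chow_liu_tree, greedy_edge_addition, mutual_information) and the BIC-style penalty are hypothetical illustration choices, not the authors' method.

```python
# Sketch of the two-step adaptive strategy under the assumptions stated above.
# Step 1 is the Chow-Liu tree; step 2 is a stand-in greedy rule, NOT the
# paper's EMUBI criterion, whose definition is not reproduced here.
import itertools
import math
from collections import Counter

import numpy as np


def mutual_information(data, i, j):
    """Empirical mutual information (in nats) between columns i and j."""
    n = data.shape[0]
    joint = Counter(zip(data[:, i], data[:, j]))
    pi = Counter(data[:, i])
    pj = Counter(data[:, j])
    mi = 0.0
    for (a, b), c in joint.items():
        mi += (c / n) * math.log(c * n / (pi[a] * pj[b]))
    return mi


def chow_liu_tree(data):
    """Step 1: maximum-weight spanning tree over pairwise mutual information
    (Chow and Liu, 1968), built with Kruskal's algorithm."""
    d = data.shape[1]
    scored = sorted(
        ((mutual_information(data, i, j), i, j)
         for i, j in itertools.combinations(range(d), 2)),
        reverse=True,
    )
    parent = list(range(d))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for _, i, j in scored:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
    return tree


def greedy_edge_addition(data, tree, penalty=None):
    """Step 2 (stand-in rule): add a non-tree edge only when its empirical
    mutual information exceeds a BIC-style per-sample penalty, i.e. when the
    fit improvement outweighs the added complexity."""
    n, _ = data.shape
    if penalty is None:
        penalty = 0.5 * math.log(n) / n  # per-sample cost of one extra parameter
    model = list(tree)
    in_model = set(map(frozenset, tree))
    for i, j in itertools.combinations(range(data.shape[1]), 2):
        if frozenset((i, j)) in in_model:
            continue
        if mutual_information(data, i, j) > penalty:
            model.append((i, j))
    return model


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy binary sample: X1 drives X2 and X3; X4 is independent noise.
    x1 = rng.integers(0, 2, size=500)
    x2 = (x1 ^ (rng.random(500) < 0.1)).astype(int)
    x3 = (x1 ^ (rng.random(500) < 0.2)).astype(int)
    x4 = rng.integers(0, 2, size=500)
    data = np.column_stack([x1, x2, x3, x4])

    tree = chow_liu_tree(data)
    model = greedy_edge_addition(data, tree)
    print("tree edges:", tree)
    print("final edges:", model)
```

The contrast between the paper's two algorithms maps onto this sketch as follows: the first would add every edge predicted by its index in one shot, while the second, like the loop above, commits to an edge only when the fit criterion actually improves.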

[1] W. I. Gasarch. Book review: An Introduction to Kolmogorov Complexity and Its Applications, Second Edition, by Ming Li and Paul Vitányi (Springer, Graduate Texts in Computer Science). SIGACT News, 1997.

[2] K. Kuratowski. Introduction to Calculus. 1964.

[3] C. Adami et al. Physical complexity of symbolic sequences. 1996, adap-org/9605002.

[4] H. Akaike. A new look at the statistical model identification. IEEE Transactions on Automatic Control, 1974.

[5] A. K. Zvonkin and L. A. Levin. The complexity of finite objects and the development of the concepts of information and randomness by means of the theory of algorithms. Russian Mathematical Surveys, 1970.

[6] J. B. Kruskal. On the shortest spanning subtree of a graph and the traveling salesman problem. Proceedings of the American Mathematical Society, 1956.

[7] M. Li and P. Vitányi. An Introduction to Kolmogorov Complexity and Its Applications. Texts in Computer Science, Springer, 2019.

[8] C. K. Chow and C. N. Liu. Approximating discrete probability distributions with dependence trees. IEEE Transactions on Information Theory, 1968.

[9] S. J. Haberman. The Analysis of Frequency Data. 1974.

[10] E. Díaz et al. Markov Structure Random Sampler (MSRS) algorithm from unrestricted discrete graphic Markov models. Fifth Mexican International Conference on Artificial Intelligence (MICAI), 2006.

[11] D. M. Chickering et al. Large-sample learning of Bayesian networks is NP-hard. Journal of Machine Learning Research, 2002.

[12] W. E. Deming and F. F. Stephan. On a least squares adjustment of a sampled frequency table when the expected marginal totals are known. Annals of Mathematical Statistics, 1940.

[13] D. J. C. MacKay. Information Theory, Inference, and Learning Algorithms. Cambridge University Press, 2003.