Paper information - A fast calculation of metric scores for learning Bayesian network

Frequent counting is an operation required very often in machine learning algorithms. Learning the structure of a Bayesian network (BN) by metric scoring, a typical machine learning task, is introduced as an example that relies heavily on frequent counting. A fast calculation method for frequent counting, enhanced with two cache layers, is then presented for learning BNs. The main contribution of our approach is to eliminate comparison operations in frequent counting by introducing a calculation based on a multi-radix number system. Both a mathematical analysis and an empirical comparison between our method and a state-of-the-art solution are conducted. The results show that our method clearly outperforms the state-of-the-art solution on the problem of learning BNs.
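
The multi-radix idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm, and it leaves out the two cache layers, whose design the abstract does not describe. The sketch assumes every variable is discrete and integer-coded as 0..arity-1. Each joint configuration of the selected variables is read as one mixed-radix number, and that number is used directly as an array index, so counting needs no value-by-value comparison of records. The names `mixed_radix_counts`, `ARITIES`, and `ROWS` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def mixed_radix_counts(
    rows: Sequence[Sequence[int]],
    columns: Sequence[int],
    arities: Sequence[int],
) -> List[int]:
    """Count joint configurations of the selected columns.

    Assumes rows[i][c] is an integer in range(arities[c]).
    The count for a configuration is stored at its mixed-radix index.
    """
    # Place value (weight) of each selected column in the mixed-radix number.
    weights = []
    size = 1
    for c in columns:
        weights.append(size)
        size *= arities[c]

    counts = [0] * size
    for row in rows:
        # Encode the configuration as a single integer: sum of digit * weight.
        index = 0
        for c, w in zip(columns, weights):
            index += row[c] * w
        # Direct addressing: no matching of the row against stored patterns.
        counts[index] += 1
    return counts


# Hypothetical toy data set: three variables with 2, 3, and 2 states.
ARITIES = [2, 3, 2]
ROWS = [
    (0, 1, 0),
    (1, 1, 0),
    (0, 1, 1),
    (1, 2, 0),
    (0, 1, 0),
]

# Joint counts of variables 0 and 1; index = x0 + 2 * x1.
counts_01 = mixed_radix_counts(ROWS, [0, 1], ARITIES)
print(counts_01)
```

In a score-based structure search, a count table of this kind would supply the counts that a metric score needs for a node and a candidate parent set. The table grows with the product of the arities of the selected variables, so this dense layout suits only small variable subsets.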
