A fast calculation of metric scores for learning Bayesian network

Frequent counting is an operation that machine learning algorithms require very often. A typical machine learning task, learning the structure of a Bayesian network (BN) by metric scoring, is introduced as an example that relies heavily on frequent counting. A fast calculation method for frequent counting, enhanced with two cache layers, is then presented for learning BNs. The main contribution of our approach is to eliminate the comparison operations in frequent counting by introducing a calculation based on a multi-radix number system. We conduct both a mathematical analysis and an empirical comparison between our method and a state-of-the-art solution. The results show that our method clearly outperforms the state-of-the-art solution on the problem of learning BNs.
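To make the multi-radix idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes discrete variables coded as integers 0..r-1 and treats each joint configuration of the selected variables as a mixed-radix number, so that the configuration maps directly to an array index and no value comparisons are needed while counting. The function names `mixed_radix_weights` and `count_configurations` are hypothetical, and the two cache layers mentioned in the abstract are not modelled here.

```python
from typing import List, Sequence, Tuple


def mixed_radix_weights(cards: Sequence[int]) -> Tuple[List[int], int]:
    """Return the place value of each digit and the total number of configurations."""
    weights = []
    place = 1
    for card in cards:
        weights.append(place)
        place *= card
    return weights, place


def count_configurations(
    rows: Sequence[Sequence[int]],
    columns: Sequence[int],
    cardinalities: Sequence[int],
) -> List[int]:
    """Count joint configurations of `columns` by direct mixed-radix indexing.

    Assumption (illustrative only): every value in column c is an integer
    in range(cardinalities[c]).
    """
    cards = [cardinalities[c] for c in columns]
    weights, size = mixed_radix_weights(cards)
    counts = [0] * size
    for row in rows:
        # Encode the configuration as one integer; no comparisons with
        # stored configurations are required.
        index = 0
        for col, weight in zip(columns, weights):
            index += row[col] * weight
        counts[index] += 1
    return counts


# Toy data set: three variables with 2, 2 and 3 states.
rows = [[0, 1, 2], [1, 1, 0], [0, 1, 2], [1, 0, 1]]
cardinalities = [2, 2, 3]
counts = count_configurations(rows, columns=[0, 2], cardinalities=cardinalities)
```

In this sketch the count for the configuration (variable 0 = a, variable 2 = b) is stored at index a + 2*b, so a scoring function can read the frequencies it needs by arithmetic on indices alone.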
