Curriculum Learning of Bayesian Network Structures

- Bayesian Network (BN)
  - A directed acyclic graph (DAG) whose nodes are random variables and whose directed edges represent probabilistic dependencies among the variables
- BN Structure Learning
  - First, construct the topology (structure) of the network
  - Then, estimate the parameters (conditional probability distributions, CPDs) given the fixed structure
- Curriculum Learning (CL) [Yoshua Bengio et al. ICML 2009]
  - Idea: start learning from simpler samples or easier tasks
  - Definition: a curriculum is a sequence of weighting schemes of the training data 〈W1, W2, ..., Wn〉, where W1 assigns more weight to easier samples, each subsequent scheme assigns more weight to harder samples, and the final scheme Wn assigns uniform weight to all samples
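The curriculum definition above can be made concrete with a small sketch. Everything specific here is an assumption for illustration and not taken from the source: the per-sample `difficulty` scores, the exponential weighting form, the `sharpness` parameter, and the linear decay schedule. The only properties it is meant to show are the ones in the definition: W1 favors easy samples, later schemes shift weight toward harder ones, and Wn is uniform.

```python
import math


def curriculum(difficulty, n_stages, sharpness=3.0):
    """Build a sequence of weighting schemes W1..Wn over the samples.

    difficulty: hypothetical per-sample difficulty scores (higher = harder).
    sharpness:  assumed knob for how strongly W1 favors easy samples.
    Each scheme is a list of weights that sums to 1.
    """
    if n_stages < 1:
        raise ValueError("n_stages must be at least 1")
    schemes = []
    for k in range(n_stages):
        # Penalty on difficulty decays linearly from `sharpness` to 0,
        # so the last stage ignores difficulty and becomes uniform.
        lam = sharpness * (1 - k / (n_stages - 1)) if n_stages > 1 else 0.0
        raw = [math.exp(-lam * d) for d in difficulty]
        total = sum(raw)
        schemes.append([r / total for r in raw])
    return schemes


if __name__ == "__main__":
    scores = [0.1, 0.5, 0.9]  # illustrative difficulty scores
    for stage, weights in enumerate(curriculum(scores, n_stages=3), start=1):
        print(f"W{stage}:", [round(w, 3) for w in weights])
```

Any other monotone schedule would satisfy the definition equally well; the exponential form is chosen here only because it keeps every weight positive and reaches the uniform scheme exactly at the last stage.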

[1]  David Maxwell Chickering,et al.  Learning Bayesian Networks is , 1994 .

[2]  James Cussens,et al.  Advances in Bayesian Network Learning using Integer Programming , 2013, UAI.

[3]  Constantin F. Aliferis,et al.  Causal Explorer: A Causal Probabilistic Network Learning Toolkit for Biomedical Discovery , 2003, METMBS.

[4]  Constantin F. Aliferis,et al.  The max-min hill-climbing Bayesian network structure learning algorithm , 2006, Machine Learning.

[5]  E. Allgower,et al.  Numerical Continuation Methods , 1990 .

[6]  Changhe Yuan,et al.  An Improved Admissible Heuristic for Learning Optimal Bayesian Networks , 2012, UAI.

[7]  Tommi S. Jaakkola,et al.  Learning Bayesian Network Structure using LP Relaxations , 2010, AISTATS.

[8]  David Maxwell Chickering,et al.  Learning Bayesian Networks: The Combination of Knowledge and Statistical Data , 1994, Machine Learning.

[9]  Changhe Yuan,et al.  Memory-Efficient Dynamic Programming for Learning Optimal Bayesian Networks , 2011, AAAI.

[10]  Changhe Yuan,et al.  Learning Optimal Bayesian Networks Using A* Search , 2011, IJCAI.

[11]  Brandon M. Malone,et al.  Impact of Learning Strategies on the Quality of Bayesian Networks: An Empirical Evaluation , 2015, UAI.

[12]  Valentin I. Spitkovsky,et al.  From Baby Steps to Leapfrog: How “Less is More” in Unsupervised Dependency Parsing , 2010, NAACL.

[13]  Gregory F. Cooper,et al.  A Bayesian Method for the Induction of Probabilistic Networks from Data , 1992 .

[14]  D. Roose,et al.  Numerical Continuation Methods: An Introduction (E.L. Allgower, K. Georg; Springer-Verlag, 1990, Springer Series in Computational Mathematics, Volume 13) , 1991 .

[15]  André Elisseeff,et al.  Using Markov Blankets for Causal Structure Learning , 2008, J. Mach. Learn. Res..

[16]  Tomi Silander,et al.  A Simple Approach for Finding the Globally Optimal Bayesian Network Structure , 2006, UAI.

[17]  P. Spirtes,et al.  Causation, prediction, and search , 1993 .

[18]  Mikko Koivisto,et al.  Exact Bayesian Structure Discovery in Bayesian Networks , 2004, J. Mach. Learn. Res..

[19]  Judea Pearl,et al.  Equivalence and Synthesis of Causal Models , 1990, UAI.

[20]  Luis M. de Campos,et al.  A hybrid methodology for learning belief networks: BENEDICT , 2001, Int. J. Approx. Reason..

[21]  J. Elman Learning and development in neural networks: the importance of starting small , 1993, Cognition.

[22]  Kewei Tu,et al.  On the Utility of Curricula in Unsupervised Learning of Probabilistic Grammars , 2011, IJCAI.

[23]  Shiguang Shan,et al.  Self-Paced Curriculum Learning , 2015, AAAI.

[24]  J. Pearl Causality: Models, Reasoning and Inference , 2000 .

[25]  Jason Weston,et al.  Curriculum learning , 2009, ICML '09.

[26]  Daphne Koller,et al.  Self-Paced Learning for Latent Variable Models , 2010, NIPS.

[27]  Constantin F. Aliferis,et al.  Time and sample efficient discovery of Markov blankets and direct causal relations , 2003, KDD '03.

[28]  Sebastian Thrun,et al.  Bayesian Network Induction via Local Neighborhoods , 1999, NIPS.