Learning dynamic Bayesian network models via cross-validation

We study cross-validation as a scoring criterion for learning dynamic Bayesian network models that generalize well. We argue that cross-validation is more suitable than the Bayesian scoring criterion for one of the most common interpretations of generalization. We confirm this by carrying out an experimental comparison of cross-validation and the Bayesian scoring criterion, as implemented by the Bayesian Dirichlet metric and the Bayesian information criterion. The results show that cross-validation leads to models that generalize better for a wide range of sample sizes.

[1]  Michael I. Jordan,et al.  Probabilistic Networks and Expert Systems , 1999 .

[2]  David Maxwell Chickering,et al.  A comparison of scientific and engineering criteria for Bayesian model selection , 2000, Stat. Comput..

[3]  Russell Greiner,et al.  Model Selection Criteria for Learning Belief Nets: An Empirical Comparison , 2000, ICML.

[4]  Min Zou,et al.  A new dynamic Bayesian network (DBN) approach for identifying gene regulatory networks from time course microarray data , 2005, Bioinform..

[5]  H. Akaike A new look at the statistical model identification , 1974 .

[6]  David Page,et al.  Modelling regulatory pathways in E. coli from time series expression profiles , 2002, ISMB.

[7]  Marcel J. T. Reinders,et al.  A Comparison of Genetic Network Models , 2000, Pacific Symposium on Biocomputing.

[8]  Remco R. Bouckaert,et al.  Choosing Between Two Learning Algorithms Based on Calibrated Tests , 2003, ICML.

[9]  David J. Spiegelhalter,et al.  Probabilistic Networks and Expert Systems , 1999, Information Science and Statistics.

[10]  Ron Kohavi,et al.  A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection , 1995, IJCAI.

[11]  Aurélien Mazurie,et al.  Gene networks inference using dynamic Bayesian networks , 2003, ECCB.

[12]  David Maxwell Chickering,et al.  Optimal Structure Identification With Greedy Search , 2002, J. Mach. Learn. Res..

[13]  Volker Tresp,et al.  Nonlinear Markov Networks for Continuous Variables , 1997, NIPS.

[14]  Andrew Y. Ng,et al.  Preventing "Overfitting" of Cross-Validation Data , 1997, ICML.

[15]  Patrik D'haeseleer,et al.  Genetic network inference: from co-expression clustering to reverse engineering , 2000, Bioinform..

[16]  G. Schwarz Estimating the Dimension of a Model , 1978 .

[17]  David Maxwell Chickering,et al.  Learning Bayesian Networks: The Combination of Knowledge and Statistical Data , 1994, Machine Learning.

[18]  Kevin Murphy,et al.  Modelling Gene Expression Data using Dynamic Bayesian Networks , 2006 .

[19]  Nicola J. Rinaldi,et al.  Transcriptional Regulatory Networks in Saccharomyces cerevisiae , 2002, Science.

[20]  Tommi S. Jaakkola,et al.  On the Dirichlet Prior and Bayesian Regularization , 2002, NIPS.

[21]  Michael J. Pazzani,et al.  Searching for Dependencies in Bayesian Classifiers , 1995, AISTATS.

[22]  Eamonn J. Keogh,et al.  Learning the Structure of Augmented Bayesian Classifiers , 2002, Int. J. Artif. Intell. Tools.

[23]  Carsten Peterson,et al.  Random Boolean network models and the yeast transcriptional network , 2003, Proceedings of the National Academy of Sciences of the United States of America.

[24]  Kevin P. Murphy,et al.  Learning the Structure of Dynamic Probabilistic Networks , 1998, UAI.

[25]  Richard E. Neapolitan,et al.  Learning Bayesian networks , 2007, KDD '07.

[26]  Satoru Miyano,et al.  Dynamic Bayesian Network and Nonparametric Regression for Nonlinear Modeling of Gene Networks from Time Series Gene Expression Data , 2003, CMSB.

[27]  Dirk Husmeier,et al.  Sensitivity and specificity of inferring genetic regulatory interactions from microarray experiments with dynamic Bayesian networks , 2003, Bioinform..