Poison Identification Based on Bayesian Network: A Novel Improvement on K2 Algorithm via Markov Blanket

The purpose of this paper was to support poison identification with a Bayesian network, based on the preliminary symptoms observed in poisoned people. We proposed a novel improvement on the K2 algorithm to address the lack of data in this special context. By determining the initial node sequence of the K2 algorithm via the Markov blanket, we greatly improved Bayesian network structure learning on small datasets. Bootstrap data expansion and Gibbs data correction, combined with the maximum weight spanning tree (MWST), were used to expand the original small dataset and further improve the performance and reliability of structure learning. We then applied this combined scheme to a real dataset to verify its validity and reliability. Finally, we were able to quickly distinguish between a variety of biochemical reagents with this method, and the inference results can be used to guide emergency rescue after a biochemical terrorist attack.
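For readers unfamiliar with K2, the following minimal sketch may help show why the initial node sequence matters: the search only considers, as candidate parents of a node, the nodes that precede it in the ordering. This is the textbook greedy K2 search with the Cooper-Herskovits score, not the authors' implementation; the function names, the `max_parents` limit, and the data layout (rows of integer-coded discrete values) are assumptions made for illustration. The Markov blanket ordering, Bootstrap expansion, Gibbs correction, and MWST steps described above are not reproduced here, and the ordering is simply taken as given.

```python
from math import lgamma


def k2_log_score(data, child, parents, arity):
    """Log Cooper-Herskovits (K2) score of `child` given a parent set.

    data   : list of rows, each row a list of integer-coded values
    arity  : arity[v] is the number of states of variable v
    """
    r = arity[child]
    counts = {}  # parent configuration -> per-state counts of the child
    for row in data:
        key = tuple(row[p] for p in parents)
        counts.setdefault(key, [0] * r)[row[child]] += 1
    score = 0.0
    # Unobserved parent configurations contribute log(1) = 0, so they are skipped.
    for n_ijk in counts.values():
        n_ij = sum(n_ijk)
        score += lgamma(r) - lgamma(n_ij + r)        # log[(r-1)! / (N_ij + r - 1)!]
        score += sum(lgamma(n + 1) for n in n_ijk)   # log prod_k N_ijk!
    return score


def k2(data, order, arity, max_parents=2):
    """Greedy K2 search: parents of a node are drawn only from its predecessors in `order`."""
    parents = {v: [] for v in order}
    for i, child in enumerate(order):
        best = k2_log_score(data, child, parents[child], arity)
        candidates = list(order[:i])
        while candidates and len(parents[child]) < max_parents:
            scored = [
                (k2_log_score(data, child, parents[child] + [c], arity), c)
                for c in candidates
            ]
            new_score, new_parent = max(scored)
            if new_score <= best:
                break  # no single added parent improves the score
            best = new_score
            parents[child].append(new_parent)
            candidates.remove(new_parent)
    return parents
```

Calling `k2(data, order, arity)` returns a dictionary that maps each variable index to its list of selected parents. Because a poor ordering can hide a true parent from the search entirely, a data-driven ordering step is a natural place to intervene when data are scarce.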
