Discretisation Does Affect the Performance of Bayesian Networks

In this paper, we study the use of Bayesian networks to interpret breast X-ray images in the context of breast-cancer screening. In particular, we investigate the performance of a manually developed Bayesian network under various discretisation schemes to check whether the probabilistic parameters in the initial manual network with continuous features are optimal and correctly reflect reality. Classification performance was determined using ROC analysis. A few algorithms performed better than the continuous baseline: the entropy-based method of Fayyad and Irani performed best, but simpler algorithms also outperformed the baseline. Two simpler methods with only 3 bins per variable gave results similar to the continuous baseline. These results indicate that it is worthwhile to consider discretising continuous data when developing Bayesian networks, and they support the practical importance of probabilistic parameters in determining the network's performance.
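As an illustration of the kind of comparison described above, the following minimal sketch discretises a continuous feature into 3 bins and scores the result with the area under the ROC curve. It rests on several assumptions: the abstract does not name the two simpler methods, so equal-width and equal-frequency binning are used here only as common unsupervised examples; the function names and toy data are hypothetical; and the bin index is used directly as a score, whereas in the paper the discretised variables feed a Bayesian network.

```python
from bisect import bisect_right

def equal_width_edges(values, bins=3):
    """Cut points that split the observed range into equally wide intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    return [lo + i * width for i in range(1, bins)]

def equal_frequency_edges(values, bins=3):
    """Cut points chosen so that each interval holds roughly the same number of samples."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // bins] for i in range(1, bins)]

def discretise(values, edges):
    """Map each value to the index of the interval it falls into."""
    return [bisect_right(edges, v) for v in values]

def auc(scores, labels):
    """Area under the ROC curve as the probability that a positive case
    outranks a negative one, with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: one skewed continuous feature and binary labels.
values = [0.1, 0.2, 0.3, 0.4, 2.0, 9.0]
labels = [0, 0, 0, 1, 1, 1]

width_bins = discretise(values, equal_width_edges(values))
freq_bins = discretise(values, equal_frequency_edges(values))

print("continuous AUC:     ", auc(values, labels))
print("equal-width AUC:    ", auc(width_bins, labels))
print("equal-frequency AUC:", auc(freq_bins, labels))
```

On this skewed toy feature, equal-width binning places nearly all samples in the first interval and loses most of the ranking information, while equal-frequency binning retains more of it. This shows only that the choice of scheme can change the AUC; it says nothing about the results reported in the paper.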

[1] P. Langenberg et al. Breast Imaging Reporting and Data System: inter- and intraobserver variability in feature analysis and final assessment, 2000, AJR. American journal of roentgenology.

[2] Ian H. Witten et al. Data mining: practical machine learning tools and techniques, 3rd Edition, 1999.

[3] Usama M. Fayyad et al. Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning, 1993, IJCAI.

[4] Geoffrey I. Webb et al. Proportional k-Interval Discretization for Naive-Bayes Classifiers, 2001, ECML.

[5] Gregory M. Provan et al. The Sensitivity of Belief Networks to Imprecise Probabilities: An Experimental Investigation, 1996, Artif. Intell.

[6] Rebecca S Lewis et al. Does training in the Breast Imaging Reporting and Data System (BI-RADS) improve biopsy recommendations or feature analysis agreement with experienced breast imagers at mammography?, 2002, Radiology.

[7] Victor Ciesielski et al. An Empirical Investigation of the Impact of Discretization on Common Data Distributions, 2003, HIS.

[8] Luis M. de Campos et al. A comparison of learning algorithms for Bayesian networks: a case study based on data from an emergency medical service, 2004, Artif. Intell. Medicine.

[9] Ron Kohavi et al. Supervised and Unsupervised Discretization of Continuous Features, 1995, ICML.

[10] Peter J. F. Lucas et al. Critiquing Knowledge Representation in Medical Image Interpretation Using Structure Learning, 2010, KR4HC.

[11] S. S. Iyengar et al. A comparative analysis of discretization methods for Medical Datamining with Naive Bayesian classifier, 2006, 9th International Conference on Information Technology (ICIT'06).

[12] C. D. Page et al. Probabilistic computer model developed from clinical data in national mammography database format to classify mammographic findings, 2009, Radiology.

[13] D. Rubin et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper, 1977.

[14] Pedro Larrañaga et al. Wrapper discretization by means of estimation of distribution algorithms, 2007, Intell. Data Anal.

[15] M. Mizianty et al. Comparative Analysis of the Impact of Discretization on the Classification with Naïve Bayes and Semi-Naïve Bayes Classifiers, 2008, 2008 Seventh International Conference on Machine Learning and Applications.

[16] P Haddawy et al. Construction of a Bayesian network for mammographic diagnosis of breast cancer, 1997, Comput. Biol. Medicine.

[17] Andrew P. Bradley et al. The use of the area under the ROC curve in the evaluation of machine learning algorithms, 1997, Pattern Recognit.

[18] Pierre Geurts et al. Investigation and Reduction of Discretization Variance in Decision Tree Induction, 2000, ECML.

[19] Finn V. Jensen et al. Bayesian Networks and Decision Graphs, 2001, Statistics for Engineering and Information Science.

[20] Marek J. Druzdzel et al. Are Bayesian Networks Sensitive to Precision of Their Parameters, 2008.