A novel method for combining Bayesian networks, theoretical analysis, and its applications

Effective knowledge integration plays an important role in knowledge engineering and knowledge-based machine learning. Combining Bayesian networks (BNs) has proven to be a promising technique for knowledge fusion, but how best to combine BNs remains a challenging research topic. An effective BN combination method should not impose any particular constraints on the underlying BNs, so that it is applicable to a variety of knowledge engineering scenarios. In general, a sound BN combination method should satisfy three fundamental criteria: avoiding cycles, preserving the conditional independencies of the BN structures, and preserving the characteristics of the individual BN parameters. However, none of the existing BN combination methods satisfies all three criteria. Accordingly, previous research on BN combination has made only marginal theoretical contributions and has limited practical value. In this paper, following the approach adopted by existing BN combination methods, we assume that the individual BNs share an ancestral ordering, which helps avoid cycles. We first design and develop a novel BN combination method with two aims: (1) a generic method for combining BNs that does not impose any particular constraints on the underlying BNs, and (2) an effective approach for ensuring that the last two criteria of BN combination are satisfied. Then, through a formal analysis, we compare the properties of the proposed method with those of three classical BN combination methods, and thereby demonstrate the distinctive advantages of the proposed method. Finally, we apply the proposed method to three tasks: recommender systems, for estimating users' ratings based on their implicit preferences; bank direct marketing, for predicting clients' willingness to subscribe to a deposit; and disease diagnosis, for assessing patients' breast cancer risk.
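The role of the shared ancestral ordering can be illustrated with a minimal sketch. If every edge in every individual BN points from an earlier to a later variable in a common ordering, then any union of those edges also points forward and so cannot contain a cycle. The code below is not the combination method proposed in the paper: it uses a naive edge union, covers structure only, and says nothing about preserving conditional independencies or parameters. The toy networks, variable names, and helper functions are all hypothetical.

```python
from graphlib import TopologicalSorter, CycleError


def respects_ordering(edges, ordering):
    """Return True if every (parent, child) edge points forward in the ordering."""
    rank = {node: i for i, node in enumerate(ordering)}
    return all(rank[parent] < rank[child] for parent, child in edges)


def union_structure(edge_sets):
    """Naively union the edge sets of several DAGs (structure only)."""
    combined = set()
    for edges in edge_sets:
        combined |= set(edges)
    return combined


def is_acyclic(edges):
    """Check acyclicity by attempting a topological sort."""
    sorter = TopologicalSorter()
    for parent, child in edges:
        sorter.add(child, parent)  # the child depends on its parent
    try:
        list(sorter.static_order())
        return True
    except CycleError:
        return False


# Hypothetical toy structures over the variables A, B, C, D.
ordering = ["A", "B", "C", "D"]
bn1 = {("A", "B"), ("B", "C")}
bn2 = {("A", "C"), ("C", "D")}
bn3_conflicting = {("C", "A")}  # violates the shared ordering

combined = union_structure([bn1, bn2])
print(is_acyclic(combined))
print(is_acyclic(union_structure([bn1, bn3_conflicting])))
```

The second check shows what the assumption rules out: once one network orders C before A, the union with a network containing A → B → C closes a cycle.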
