A novel mutual dependence measure in structure learning

Mutual dependence between features plays an important role in the formulation of classifiers, clustering and other machine intelligence techniques. In this study, a novel measure of mutual information known as integration to segregation (I2S), which explains the relationship between two features, is proposed. Some important characteristics of the proposed measure were investigated, and its performance was compared in terms of class imbalance measures. It was shown that I2S possesses characteristics that are useful in controlling overfitting problems. In structure learning techniques such as Bayesian belief networks, conventional measures of dependency cope with the overfitting problem by restricting the number of parents for a node; however, this remains unsatisfactory because overfitting is not completely eliminated. In contrast, I2S is capable of significantly maximizing the discriminant function with better control of overfitting in the formulation of structure learning.

J. Natn. Sci. Foundation Sri Lanka 2013 41 (3): 203-208. DOI: http://dx.doi.org/10.4038/jnsfsr.v41i3.6054
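For orientation, the kind of conventional dependency measure the abstract contrasts with I2S can be sketched as plain mutual information between two discrete features, estimated from paired samples. This is not the I2S measure, whose definition is not given in this abstract; the function name `mutual_information` and the toy inputs are illustrative assumptions only.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete features.

    Baseline illustration only: this is the conventional measure,
    NOT the I2S measure proposed in the paper.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")

    n = len(xs)
    count_x = Counter(xs)            # marginal counts of the first feature
    count_y = Counter(ys)            # marginal counts of the second feature
    count_xy = Counter(zip(xs, ys))  # joint counts of observed pairs

    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


if __name__ == "__main__":
    # Hypothetical toy features: a perfectly dependent pair and an independent pair.
    print(mutual_information([0, 0, 1, 1], [0, 0, 1, 1]))
    print(mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))
```

In score-based structure learning, a pairwise quantity like this is typically used to rank candidate parents for a node, which is where a cap on the number of parents enters as an overfitting control.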
