Exploiting monotonicity via logistic regression in Bayesian network learning

An important challenge in machine learning is to learn quickly from very small amounts of training data. Learning from such small samples requires constraining the learning process by exploiting background knowledge. In this report, we present a theoretical analysis of the use of constrained logistic regression for estimating conditional probability distributions in Bayesian networks (BNs), using background knowledge in the form of qualitative monotonicity statements. This background knowledge is treated as a set of constraints on the parameters of a logistic function during training. Our goal in finding an appropriate BN model is twofold: (a) we want to exploit monotonic relationships between random variables, which are often available as domain knowledge, and (b) we want to address the problem of estimating the conditional distribution of a random variable with a large number of parents. We discuss variants of the logistic regression model and analyze the constraints each requires to implement monotonicity. More importantly, we show that in some of these variants the number of parameters and constraints can grow exponentially with the number of parent variables. To address this problem, we present two variants of the constrained logistic regression model, M2b CLR and M3 CLR, in which the number of constraints required to implement monotonicity does not grow exponentially with the number of parents, thereby providing a practical method for estimating conditional probabilities from very sparse data.
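
As a minimal illustration of the core idea (a sketch under stated assumptions, not the report's M2b or M3 CLR formulations): monotonicity of P(Y=1 | parents) in a given parent can be enforced on a plain logistic model by constraining the sign of that parent's weight. With d parents this uses only d + 1 parameters and at most d sign constraints, in contrast to a full conditional probability table, whose size grows exponentially in d. The function `fit_monotone_logistic` and the `signs` encoding below are hypothetical names introduced for the example.

```python
# Sketch: sign-constrained logistic regression for a binary child variable.
# Assumptions (not from the report): binary Y, real-valued parent encodings X,
# monotonicity enforced via box constraints on weights under L-BFGS-B.
import numpy as np
from scipy.optimize import minimize

def fit_monotone_logistic(X, y, signs):
    """X: (n, d) array of parent values; y: (n,) binary child values.
    signs[j] = +1 if P(Y=1|x) must be non-decreasing in parent j,
               -1 if non-increasing, 0 if unconstrained."""
    _, d = X.shape

    def neg_log_lik(params):
        w, b = params[:d], params[d]
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        eps = 1e-12  # guard against log(0)
        return -np.sum(y * np.log(p + eps) + (1.0 - y) * np.log(1.0 - p + eps))

    # Monotonicity as one sign (box) constraint per parent weight;
    # the intercept stays unconstrained.
    bounds = [(0, None) if s > 0 else (None, 0) if s < 0 else (None, None)
              for s in signs] + [(None, None)]
    res = minimize(neg_log_lik, np.zeros(d + 1), method="L-BFGS-B",
                   bounds=bounds)
    return res.x[:d], res.x[d]
```

Because the logistic link is monotone in its argument, a nonnegative weight guarantees that the fitted conditional probability is non-decreasing in the corresponding parent, which is exactly the kind of qualitative influence statement the report treats as background knowledge; note that the number of constraints here grows linearly, not exponentially, with the number of parents.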
