Targeting: Logistic Regression, Special Cases and Extensions

Logistic regression is a classical linear model for the logit-transformed conditional probabilities of a binary target variable. It recovers the true conditional probabilities if the joint distribution of predictors and target is of log-linear form. Weights-of-evidence is an ordinary logistic regression whose parameters equal the differences of the weights of evidence, provided all predictor variables are discrete and conditionally independent given the target variable. The hypothesis of conditional independence can be tested in terms of log-linear models. If the assumption of conditional independence is violated, applying weights-of-evidence corrupts not only the predicted conditional probabilities but also their rank transform. Logistic regression models that include interaction terms can account for the lack of conditional independence; appropriate interaction terms compensate exactly for its violation. Multilayer artificial neural nets may be seen as nested regression-like models with some sigmoidal activation function, most often the logistic function. If the net topology, i.e., its architecture, is sufficiently versatile to mimic interaction terms, artificial neural nets can account for violations of conditional independence and yield very similar results. Weights-of-evidence cannot reasonably include interaction terms, and subsequent modifications of the weights, as often suggested, cannot emulate the effect of interaction terms.
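The two central claims above can be checked numerically: under conditional independence, the weights-of-evidence posterior logit is additive and coincides with a logistic regression whose coefficients are the contrasts C_j = W+_j − W−_j; when conditional independence fails, a saturated logistic model with an interaction term reproduces the cell logits exactly. A minimal sketch, with all population probabilities chosen purely for illustration (none are from the paper):

```python
import math
from itertools import product

# Hypothetical population probabilities (assumed for illustration only):
p_y = 0.3                    # P(Y = 1), the prior
p_x1 = {1: 0.8, 0: 0.3}      # P(X1 = 1 | Y = y), for y = 1 and y = 0
p_x2 = {1: 0.6, 0: 0.2}      # P(X2 = 1 | Y = y)

def logit(p):
    return math.log(p / (1 - p))

def weights(p_x):
    """Weights of evidence for a binary predictor: W(x) = ln P(x|Y=1)/P(x|Y=0)."""
    w_pos = math.log(p_x[1] / p_x[0])               # W+ for x = 1
    w_neg = math.log((1 - p_x[1]) / (1 - p_x[0]))   # W- for x = 0
    return {1: w_pos, 0: w_neg}

w1, w2 = weights(p_x1), weights(p_x2)

# Equivalent logistic regression: coefficients are the contrasts W+ - W-,
# and the intercept absorbs the prior logit plus the negative weights.
b0 = logit(p_y) + w1[0] + w2[0]
b1 = w1[1] - w1[0]
b2 = w2[1] - w2[0]

for x1, x2 in product((0, 1), repeat=2):
    # Direct posterior logit from the conditionally independent joint
    num = p_y * (p_x1[1] if x1 else 1 - p_x1[1]) * (p_x2[1] if x2 else 1 - p_x2[1])
    den = (1 - p_y) * (p_x1[0] if x1 else 1 - p_x1[0]) * (p_x2[0] if x2 else 1 - p_x2[0])
    direct = math.log(num / den)
    # WofE is additive, and identical to the logistic-regression form
    assert math.isclose(direct, logit(p_y) + w1[x1] + w2[x2])
    assert math.isclose(direct, b0 + b1 * x1 + b2 * x2)

# Now violate conditional independence: specify P(x1, x2 | y) jointly.
# Hypothetical tables (assumed), keyed by (x1, x2):
joint = {
    1: {(0, 0): 0.1, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.6},  # P(. | Y = 1)
    0: {(0, 0): 0.5, (0, 1): 0.2, (1, 0): 0.2, (1, 1): 0.1},  # P(. | Y = 0)
}

def cell_logit(x1, x2):
    return logit(p_y) + math.log(joint[1][(x1, x2)] / joint[0][(x1, x2)])

# A logistic model with an interaction term matches every cell logit exactly;
# the interaction coefficient c12 is nonzero precisely because CI fails here.
c0 = cell_logit(0, 0)
c1 = cell_logit(1, 0) - c0
c2 = cell_logit(0, 1) - c0
c12 = cell_logit(1, 1) - c0 - c1 - c2
for x1, x2 in product((0, 1), repeat=2):
    assert math.isclose(cell_logit(x1, x2), c0 + c1 * x1 + c2 * x2 + c12 * x1 * x2)
```

The first loop verifies the additivity identity that makes weights-of-evidence a special case of logistic regression; the second shows that for binary predictors an interaction term compensates exactly for the violation of conditional independence, since the saturated model has as many parameters as predictor cells.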
