Learning Logic Formulas and Related Error Distributions

This chapter describes a method for learning logic formulas that correctly classify the records of a given data set consisting of two classes. From the training data, the method derives certain minimum-cost satisfiability (MINSAT) problems, solves these problems, and deduces the desired logic formulas from the solutions. The results may be employed in at least two ways. First, the logic formulas may be used directly as rules in application programs. Second, one may construct vote-based rules, where each formula casts a vote and the votes are combined into a vote total. The latter approach allows prediction errors to be assessed and even controlled: once the method has produced the logic formulas, it computes estimated distributions of the vote totals from the training data alone, without any additional data, and from these distributions it estimates probabilities of prediction errors. That information supports assessment and control of errors. Applications of the method include data mining, knowledge acquisition in expert systems, and identification of critical characteristics for recognition systems. Computational tests indicate that the method is fast and effective.
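The vote-based approach can be illustrated with a minimal sketch. The formulas, features, and thresholds below are hypothetical stand-ins; in the actual method, the formulas would be produced by the MINSAT-based learning step, not written by hand. The sketch shows only the second stage: combining formula votes into a vote total, tabulating the vote-total distribution per class from the training data alone, and reading off estimated error probabilities.

```python
# Hypothetical learned formulas: each maps a record to a vote of
# +1 (class A) or -1 (class B). These stand in for formulas that the
# MINSAT-based learning step would produce.
formulas = [
    lambda r: 1 if r["x"] > 0.5 else -1,
    lambda r: 1 if r["y"] > 0.3 else -1,
    lambda r: 1 if r["x"] + r["y"] > 0.9 else -1,
]

def vote_total(record):
    """Sum the votes cast by all formulas for one record."""
    return sum(f(record) for f in formulas)

def classify(record, threshold=0):
    """Predict class A if the vote total exceeds the threshold."""
    return "A" if vote_total(record) > threshold else "B"

def vote_distributions(training_data):
    """Estimate the vote-total distribution of each class from the
    training data itself, without any additional data."""
    hist = {"A": {}, "B": {}}
    for record, label in training_data:
        t = vote_total(record)
        hist[label][t] = hist[label].get(t, 0) + 1
    return hist

def error_probabilities(hist, threshold=0):
    """From the estimated distributions, read off the probability that
    a class-A record is misclassified as B and vice versa."""
    n_a = sum(hist["A"].values())
    n_b = sum(hist["B"].values())
    p_a_as_b = sum(c for t, c in hist["A"].items() if t <= threshold) / n_a
    p_b_as_a = sum(c for t, c in hist["B"].items() if t > threshold) / n_b
    return p_a_as_b, p_b_as_a
```

Raising or lowering `threshold` trades one error probability against the other, which is the sense in which the vote-total distributions support not only assessment but control of prediction errors.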