KI-LEARN: Knowledge-Intensive Learning Methods for Knowledge-Rich/Data-Poor Domains

Abstract: Knowledge Representation and Reasoning (KRR) has developed a wide range of methods for representing knowledge and reasoning from it to produce expert-level performance. Despite these accomplishments, one major problem prevents the widespread application of KRR technology: the inability to support learning, which makes KRR systems brittle and difficult to maintain. Machine Learning (ML), on the other hand, has developed a wide range of methods for learning from examples, but two major problems prevent its widespread application: the need for large amounts of training data and the high cost of manually designing the hypothesis space of the learning system. Our goal in this research effort was to develop a new methodology, called KI-LEARN (Knowledge Intensive LEARNing), that combines domain knowledge with sparse training data to construct high-performance systems. This report provides an overview of the major results we obtained on the specific tasks outlined in our proposal. To address issues in knowledge representation and efficient learning, we designed the First-Order Conditional Influence (FOCI) language for expressing conditional influences among attributes relevant to learning. FOCI extends probabilistic relational models (PRMs), which are probabilistic representations closely related to the first-order representation languages employed in KRR systems. A distinctive feature of the language is its support for the explicit expression of qualitative constraints such as monotonicity, saturation, and synergies. More importantly, we have demonstrated, through mathematical proofs and experimental results, how these qualitative constraints can be exploited when learning from sparse training data, and we show specifically how they can be incorporated into learning algorithms. Finally, this report describes the models we constructed for our testbed domains.
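To make the role of the qualitative constraints concrete, the sketch below illustrates one standard way a monotonicity constraint can enter a parameter-learning algorithm. It is an illustration only, not the report's actual method: it assumes a simple logistic model of P(y=1|x), and the function name fit_monotone_logistic and the toy data are hypothetical. The qualitative statement "y increases with x" (often written M+ in the qualitative-reasoning literature) becomes the parameter constraint w >= 0, maintained by projecting after each gradient step.

# Minimal sketch, assuming a logistic model; not the report's algorithm.
import numpy as np

def fit_monotone_logistic(x, y, lr=0.1, steps=2000):
    """Fit P(y=1|x) = sigmoid(w*x + b) subject to w >= 0 (an M+ constraint)."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(w * x + b)))  # predicted probabilities
        grad_w = np.mean((y - p) * x)           # log-likelihood gradient w.r.t. w
        grad_b = np.mean(y - p)                 # log-likelihood gradient w.r.t. b
        w = max(0.0, w + lr * grad_w)           # project onto the constraint w >= 0
        b = b + lr * grad_b
    return w, b

# A tiny, sparse training set (hypothetical): a dose and a binary response.
x = np.array([0.1, 0.3, 0.5, 0.9, 1.2, 1.5, 2.0, 2.5])
y = np.array([0,   0,   1,   0,   1,   1,   1,   1  ])
w, b = fit_monotone_logistic(x, y)
print(f"w = {w:.3f} (guaranteed >= 0), b = {b:.3f}")

With only a handful of examples, an unconstrained maximum-likelihood fit can easily reverse the known direction of influence; the projection step rules that out. This is the practical benefit the abstract claims for sparse-data settings: the qualitative constraint shrinks the hypothesis space so that fewer examples suffice.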
