Lazy Learning Algorithms for Problems with Many Binary Features and Classes

We have designed several new lazy learning algorithms for learning problems with many binary features and classes. Learning tasks of this type arise in many machine learning applications and are of special importance for machine learning of natural language. Besides pure instance-based learning, we also consider prototype-based learning, which has the major advantage of substantially reducing the memory and processing time required for classification. As an application for our learning algorithms, we have chosen natural language database interfaces. In our interface architecture, the machine learning module replaces an elaborate semantic analysis component. The learning task is to select the correct command class based on semantic features extracted from the user input. We use an existing German natural language interface to a production planning and control system as a case study for our evaluation, and we compare the results achieved by the different lazy learning algorithms.
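
To make the distinction between instance-based and prototype-based classification concrete, the following is a minimal sketch and not the algorithms evaluated in this work. It assumes that each input is represented as the set of its active binary features, that similarity is measured by the Jaccard coefficient, and that a class prototype keeps the features active in more than half of the class's training instances. The feature names, command classes, and the 0.5 threshold are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def jaccard(a, b):
    """Similarity of two sets of active binary features (assumed measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def classify_instance_based(query, training):
    """1-nearest-neighbor: return the class of the most similar stored instance."""
    _, label = max(training, key=lambda example: jaccard(query, example[0]))
    return label


def build_prototypes(training, threshold=0.5):
    """Build one prototype per class from features active in more than
    `threshold` of that class's instances (hypothetical construction rule)."""
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for features, label in training:
        totals[label] += 1
        for feature in features:
            counts[label][feature] += 1
    return {
        label: frozenset(
            f for f, c in counts[label].items() if c / totals[label] > threshold
        )
        for label in totals
    }


def classify_prototype_based(query, prototypes):
    """Return the class whose prototype is most similar to the query."""
    return max(prototypes, key=lambda label: jaccard(query, prototypes[label]))


# Hypothetical semantic features and command classes.
TRAINING = [
    (frozenset({"show", "order", "date"}), "list_orders"),
    (frozenset({"show", "order", "customer"}), "list_orders"),
    (frozenset({"show", "stock", "part"}), "show_stock"),
    (frozenset({"stock", "part", "quantity"}), "show_stock"),
]

PROTOTYPES = build_prototypes(TRAINING)

if __name__ == "__main__":
    query = frozenset({"part", "stock", "date"})
    print(classify_instance_based(query, TRAINING))
    print(classify_prototype_based(query, PROTOTYPES))
```

In this sketch the instance-based classifier compares the query against all four stored instances, whereas the prototype-based classifier compares it against only two prototypes, which illustrates where the savings in memory and classification time come from.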
