Analogy-Based Reasoning in Classifier Construction

Analogy-based reasoning methods in machine learning make it possible to reason about the properties of objects on the basis of similarities between objects. A specific similarity-based method is the k nearest neighbors (k-nn) classification algorithm: a decision about a new object x is inferred from a fixed number k of the objects most similar to x in a given set of examples.

The primary contribution of the dissertation is the introduction of two new classification models based on the k-nn algorithm. The first model is a hybrid of the k-nn algorithm and rule induction. The proposed combination uses minimal consistent rules defined by local reducts of a set of examples; to make this combination possible, the model of minimal consistent rules is generalized to a metric-dependent form. An effective polynomial-time algorithm implementing the classification model based on minimal consistent rules was proposed by Bazan. We modify this algorithm in such a way that adding it to the k-nn algorithm increases the computation time only negligibly. On some of the tested classification problems the combined model was significantly more accurate than the classical k-nn classification algorithm.

For many real-life problems it is impossible to induce relevant global mathematical models from the available sets of examples. The second model proposed in the dissertation handles such sets with locally induced metrics. The method adapts the notion of similarity to the properties of a given test object, which makes it possible to select the correct decision in specific fragments of the space of objects. On the hardest tested problems, the method with local metrics significantly improved on the classification accuracy of methods based on global models.
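The k-nn decision rule described above can be sketched as follows. This is a minimal illustration under simplifying assumptions (Euclidean distance over numeric attributes, unweighted majority voting); the names `knn_classify` and `euclidean` are hypothetical helpers, not taken from the dissertation.

```python
from collections import Counter

def euclidean(x, y):
    # Plain Euclidean distance over numeric attribute vectors.
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5

def knn_classify(train, x, k=3, dist=euclidean):
    # Pick the k training examples most similar to x ...
    neighbors = sorted(train, key=lambda example: dist(example[0], x))[:k]
    # ... and return the majority decision among them.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy set of examples: (attribute vector, decision) pairs.
train = [((0, 0), 'a'), ((0, 1), 'a'), ((5, 5), 'b'), ((6, 5), 'b')]
print(knn_classify(train, (1, 1)))  # the three nearest neighbors vote 'a', 'a', 'b'
```

The dissertation's hybrid model additionally consults minimal consistent rules when the neighbors vote; the sketch above covers only the plain k-nn baseline it builds on.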
Two issues determine the quality and efficiency of k-nn based methods: the similarity measure, and the time needed to search for the most similar objects in a given set of examples. The dissertation studies both issues in detail and proposes significant improvements over the similarity measures and search methods found in the literature.
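One concrete way to define a similarity measure over examples with mixed numeric and nominal attributes is a heterogeneous distance in the spirit of the measures studied in the literature on distance functions for k-nn (e.g. [105]). The sketch below is an illustrative assumption, not the measure proposed in the dissertation; `heom` and its `ranges` parameter are hypothetical names.

```python
def heom(x, y, ranges):
    # Heterogeneous Euclidean/overlap distance: a numeric attribute contributes
    # its difference normalized by the attribute's range on the training data;
    # a nominal attribute (range given as None) contributes 0/1 overlap.
    total = 0.0
    for a, b, r in zip(x, y, ranges):
        if r is None:                     # nominal attribute
            d = 0.0 if a == b else 1.0
        else:                             # numeric attribute
            d = abs(a - b) / r if r else 0.0
        total += d * d
    return total ** 0.5

# Example: one numeric attribute with range 4.0 and one nominal attribute.
print(heom((1.0, 'red'), (3.0, 'blue'), ranges=(4.0, None)))
```

Normalizing each numeric attribute by its range keeps attributes on incomparable scales from dominating the distance, which is one reason measures of this family are common baselines for k-nn.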

[1]  Tony R. Martinez,et al.  An Integrated Instance‐Based Learning Algorithm , 2000, Comput. Intell..

[2]  R. Rivest Learning Decision Lists , 1987, Machine Learning.

[3]  Sadaaki Miyamoto,et al.  Rough Sets and Current Trends in Computing , 2012, Lecture Notes in Computer Science.

[4]  Sean R Eddy,et al.  What is dynamic programming? , 2004, Nature Biotechnology.

[5]  Thomas G. Dietterich,et al.  A study of distance-based machine learning algorithms , 1994 .

[6]  Andrzej Skowron,et al.  Information Granules and Rough-Neural Computing , 2004 .

[7]  Nada Lavrac,et al.  The Multi-Purpose Incremental Learning System AQ15 and Its Testing Application to Three Medical Domains , 1986, AAAI.

[8]  David W. Aha,et al.  Tolerating Noisy, Irrelevant and Novel Attributes in Instance-Based Learning Algorithms , 1992, Int. J. Man Mach. Stud..

[9]  J. H. Ward Hierarchical Grouping to Optimize an Objective Function , 1963 .

[10]  C. Papadimitriou,et al.  Segmentation problems , 2004 .

[11]  Jan G. Bazan Discovery of Decision Rules by Matching New Objects Against Data Tables , 1998, Rough Sets and Current Trends in Computing.

[12]  Oliver Günther,et al.  Multidimensional access methods , 1998, CSUR.

[13]  Ricardo A. Baeza-Yates,et al.  Searching in metric spaces , 2001, CSUR.

[14]  Robert Tibshirani,et al.  Discriminant Adaptive Nearest Neighbor Classification , 1995, IEEE Trans. Pattern Anal. Mach. Intell..

[15]  H. Buchner  The Grid File: An Adaptable, Symmetric Multikey File Structure , 2001 .

[16]  Arkadiusz Wojna,et al.  Local Attribute Value Grouping for Lazy Rule Induction , 2002, Rough Sets and Current Trends in Computing.

[18]  Marcin S. Szczuka,et al.  The Rough Set Exploration System , 2005, Trans. Rough Sets.

[19]  Peter N. Yianilos,et al.  Data structures and algorithms for nearest neighbor search in general metric spaces , 1993, SODA '93.

[20]  Jonathan Goldstein,et al.  When Is ''Nearest Neighbor'' Meaningful? , 1999, ICDT.

[21]  S. Salzberg,et al.  A weighted nearest neighbor algorithm for learning with symbolic features , 2004, Machine Learning.

[22]  S. Salzberg A nearest hyperrectangle learning method , 2004, Machine Learning.

[23]  Leo Breiman,et al.  Statistical Modeling: The Two Cultures (with comments and a rejoinder by the author) , 2001, Statistical Science.

[24]  J. L. Cowan Purpose and Teleology , 1968 .

[25]  Igor Kononenko,et al.  Estimating Attributes: Analysis and Extensions of RELIEF , 1994, ECML.

[26]  David W. Aha,et al.  Instance-Based Learning Algorithms , 1991, Machine Learning.

[27]  Luc De Raedt,et al.  Machine Learning: ECML-94 , 1994, Lecture Notes in Computer Science.

[29]  Jon Louis Bentley,et al.  Multidimensional binary search trees used for associative searching , 1975, CACM.

[30]  Janusz Zalewski,et al.  Rough sets: Theoretical aspects of reasoning about data , 1996 .

[31]  Tsau Young Lin,et al.  Rough Sets and Data Mining: Analysis of Imprecise Data , 1996 .

[32]  Stuart J. Russell  The use of knowledge in analogy and induction , 1989 .

[33]  R. A. Fisher  Applications of "Student's" Distribution .

[34]  Jerome H. Friedman,et al.  Flexible Metric Nearest Neighbor Classification , 1994 .

[35]  Pedro M. Domingos Unifying Instance-Based and Rule-Based Induction , 1996, Machine Learning.

[36]  Peter E. Hart,et al.  Nearest neighbor pattern classification , 1967, IEEE Trans. Inf. Theory.

[37]  David G. Lowe,et al.  Similarity Metric Learning for a Variable-Kernel Classifier , 1995, Neural Computation.

[38]  Kotagiri Ramamohanarao,et al.  DeEPs: A New Instance-Based Lazy Discovery and Classification System , 2004, Machine Learning.

[39]  Andrzej Skowron,et al.  The Discernibility Matrices and Functions in Information Systems , 1992, Intelligent Decision Support.

[40]  Ramesh C. Jain,et al.  Similarity indexing with the SS-tree , 1996, Proceedings of the Twelfth International Conference on Data Engineering.

[41]  Kotagiri Ramamohanarao,et al.  Combining the Strength of Pattern Frequency and Distance for Classification , 2001, PAKDD.

[42]  David L. Waltz,et al.  Toward memory-based reasoning , 1986, CACM.

[43]  Peter Clark,et al.  The CN2 induction algorithm , 2004, Machine Learning.

[44]  Arkadiusz Wojna,et al.  Center-Based Indexing in Vector and Metric Spaces , 2002, Fundam. Informaticae.

[45]  David Leake,et al.  Case-Based Reasoning: Experiences, Lessons and Future Directions , 1996 .

[46]  Keinosuke Fukunaga,et al.  A Branch and Bound Algorithm for Computing k-Nearest Neighbors , 1975, IEEE Transactions on Computers.

[47]  Tapio Elomaa,et al.  Machine Learning: ECML 2002 , 2002, Lecture Notes in Computer Science.

[48]  Arkadiusz Wojna,et al.  On the Evolution of Rough Set Exploration System , 2004, Rough Sets and Current Trends in Computing.

[49]  Shin'ichi Satoh,et al.  The SR-tree: an index structure for high-dimensional nearest neighbor queries , 1997, SIGMOD '97.

[50]  Pavel Zezula,et al.  M-tree: An Efficient Access Method for Similarity Search in Metric Spaces , 1997, VLDB.

[51]  Arkadiusz Wojna,et al.  RIONA: A New Classification System Combining Rule Induction and Instance-Based Learning , 2002, Fundam. Informaticae.

[52]  Ryszard S. Michalski,et al.  A theory and methodology of inductive learning , 1993 .

[53]  Andrzej Skowron,et al.  Synthesis of Decision Systems from Data Tables , 1997 .

[54]  Hans-Peter Kriegel,et al.  The R*-tree: an efficient and robust access method for points and rectangles , 1990, SIGMOD '90.

[55]  Christos Faloutsos,et al.  The R+-Tree: A Dynamic Index for Multi-Dimensional Objects , 1987, VLDB.

[56]  Richard Fikes,et al.  STRIPS: A New Approach to the Application of Theorem Proving to Problem Solving , 1971, IJCAI.

[57]  Dimitrios Gunopulos,et al.  Efficient Local Flexible Nearest Neighbor Classification , 2002, SDM.

[58]  Vladimir Vapnik,et al.  Statistical learning theory , 1998 .

[59]  Christos Faloutsos,et al.  The TV-tree: An index structure for high-dimensional data , 1994, The VLDB Journal.

[60]  Sergio M. Savaresi,et al.  On the performance of bisecting K-means and PDDP , 2001, SDM.

[61]  Antonin Guttman,et al.  R-trees: a dynamic index structure for spatial searching , 1984, SIGMOD '84.

[62]  Larry A. Rendell,et al.  A Practical Approach to Feature Selection , 1992, ML.

[63]  Sergey Brin,et al.  Near Neighbor Search in Large Metric Spaces , 1995, VLDB.

[64]  Iraj Kalantari,et al.  A Data Structure and an Algorithm for the Nearest Point Problem , 1983, IEEE Transactions on Software Engineering.

[65]  Thomas G. Dietterich What is machine learning? , 2020, Archives of Disease in Childhood.

[66]  Hans-Jörg Schek,et al.  A Quantitative Analysis and Performance Study for Similarity-Search Methods in High-Dimensional Spaces , 1998, VLDB.

[67]  M. Mead,et al.  Cybernetics , 1953, The Yale Journal of Biology and Medicine.

[68]  Jan M. Zytkow,et al.  Handbook of Data Mining and Knowledge Discovery , 2002 .

[70]  R. Shepard,et al.  Toward a universal law of generalization for psychological science. , 1987, Science.

[71]  Andrzej Skowron,et al.  K Nearest Neighbor Classification with Local Induction of the Simple Value Difference Metric , 2004, Rough Sets and Current Trends in Computing.

[73]  Walter Daelemans,et al.  An Empirical Re-Examination of Weighted Voting for k-NN , 1997 .

[74]  Ron Kohavi,et al.  Lazy Decision Trees , 1996, AAAI/IAAI, Vol. 1.

[75]  David W. Aha,et al.  A Review and Empirical Evaluation of Feature Weighting Methods for a Class of Lazy Learning Algorithms , 1997, Artificial Intelligence Review.

[76]  J. Neumann,et al.  Theory of games and economic behavior , 1945, 100 Years of Math Milestones.

[77]  Andrew Luk,et al.  A Re-Examination of the Distance-Weighted k-Nearest Neighbor Classification Rule , 1987, IEEE Transactions on Systems, Man, and Cybernetics.

[78]  Hans-Peter Kriegel,et al.  The X-tree : An Index Structure for High-Dimensional Data , 2001, VLDB.

[79]  Charu C. Aggarwal,et al.  On the Surprising Behavior of Distance Metrics in High Dimensional Spaces , 2001, ICDT.

[80]  Sahibsingh A. Dudani The Distance-Weighted k-Nearest-Neighbor Rule , 1976, IEEE Transactions on Systems, Man, and Cybernetics.

[81]  Catherine Blake,et al.  UCI Repository of machine learning databases , 1998 .

[82]  Richard O. Duda,et al.  Pattern classification and scene analysis , 1974, A Wiley-Interscience publication.

[83]  J. T. Robinson,et al.  The K-D-B-tree: a search structure for large multidimensional dynamic indexes , 1981, SIGMOD '81.

[84]  Jerzy W. Grzymala-Busse,et al.  LERS-A System for Learning from Examples Based on Rough Sets , 1992, Intelligent Decision Support.

[85]  Jeffrey K. Uhlmann,et al.  Satisfying General Proximity/Similarity Queries with Metric Trees , 1991, Inf. Process. Lett..

[86]  J. L. Hodges,et al.  Discriminatory Analysis - Nonparametric Discrimination: Consistency Properties , 1989 .

[87]  Trevor Hastie,et al.  The Elements of Statistical Learning , 2001 .

[88]  David W. Aha,et al.  The omnipresence of case-based reasoning in science and application , 1998, Knowl. Based Syst..

[89]  Arkadiusz Wojna,et al.  RIONA: A Classifier Combining Rule Induction and k-NN Method with Automated Selection of Optimal Neighbourhood , 2002, ECML.

[90]  Jon Louis Bentley,et al.  Quad trees a data structure for retrieval on composite keys , 1974, Acta Informatica.

[91]  R. Słowiński Intelligent Decision Support: Handbook of Applications and Advances of the Rough Sets Theory , 1992 .

[92]  David H. Wolpert,et al.  Constructing a generalizer superior to NETtalk via a mathematical theory of generalization , 1990, Neural Networks.

[93]  Manuela M. Veloso,et al.  Planning and Learning by Analogical Reasoning , 1994, Lecture Notes in Computer Science.

[94]  J. Ross Quinlan,et al.  C4.5: Programs for Machine Learning , 1992 .

[95]  Aiko M. Hormann,et al.  Programs for Machine Learning. Part I , 1962, Inf. Control..

[96]  N. Wiener,et al.  Behavior, Purpose and Teleology , 1943, Philosophy of Science.

[97]  Student,et al.  The Probable Error of a Mean , 1908 .

[98]  Heekuck Oh,et al.  Neural Networks for Pattern Recognition , 1993, Adv. Comput..

[99]  Jakub Wroblewski,et al.  Covering with Reducts - A Fast Algorithm for Rule Generation , 1998, Rough Sets and Current Trends in Computing.

[100]  Steven Vajda,et al.  Games and Decisions. By R. Duncan Luce and Howard Raiffa. Pp. xi, 509. 70s. 1957. (J Wiley & Sons) , 1959, The Mathematical Gazette.

[101]  Andrzej Skowron,et al.  Rough-Neural Computing: Techniques for Computing with Words , 2004, Cognitive Technologies.

[102]  Paul S. Rosenbloom,et al.  Improving Accuracy by Combining Rule-Based and Case-Based Reasoning , 1996, Artif. Intell..

[103]  Arkadiusz Wojna,et al.  Center-based indexing for nearest neighbors search , 2003, Third IEEE International Conference on Data Mining.

[104]  Marcin S. Szczuka,et al.  RSES and RSESlib - A Collection of Tools for Rough Set Computations , 2000, Rough Sets and Current Trends in Computing.

[105]  Tony R. Martinez,et al.  Improved Heterogeneous Distance Functions , 1996, J. Artif. Intell. Res..

[106]  Yoram Biberman,et al.  A Context Similarity Measure , 1994, ECML.