Learning Data Classification: Classifiers in General and in Decision Systems

In this chapter, we recall basic facts about data classifiers in machine learning. Our survey focuses on the Bayes and kNN classifiers, which are employed in our experiments. Some basic facts from Computational Learning Theory are followed by an account of classifiers in real decision systems, mostly elaborated within Rough Set Theory.
