Application of Data Decomposition to Incomplete Information Systems

Many established classification methods and knowledge discovery tools, the subject of research for years, cannot handle data with missing attribute values. To adapt existing classification methods to incomplete information systems, we propose a decomposition method that handles missing attribute values more appropriately. The method consists of two steps. In the first step, the data from the original decision table are partitioned into subsets. In the second step, the knowledge derived from those subsets, which in our case takes the form of classification hypotheses, is combined to produce a final classification based on the whole original decision table. Experiments were carried out to evaluate the decomposition method.
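The two steps can be illustrated with a minimal sketch. The abstract does not specify how the table is partitioned, which base classifier is used, or how the hypotheses are combined, so everything here is an assumption made for illustration: rows are grouped by which attributes are known, a simple lookup table stands in for the base classifier, and answers are merged by a vote weighted by subset size. The names `decompose`, `train_lookup`, and `classify` are hypothetical.

```python
from collections import Counter, defaultdict

MISSING = None  # assumed marker for a missing attribute value


def decompose(table):
    """Step 1 (assumed rule): group rows by the set of attributes they have values for."""
    subsets = defaultdict(list)
    for attrs, decision in table:
        known = frozenset(a for a, v in attrs.items() if v is not MISSING)
        subsets[known].append((attrs, decision))
    return dict(subsets)


def train_lookup(rows, known):
    """Stand-in base classifier: majority decision for each observed value tuple."""
    order = sorted(known)
    votes = defaultdict(Counter)
    for attrs, decision in rows:
        votes[tuple(attrs[a] for a in order)][decision] += 1
    rules = {key: c.most_common(1)[0][0] for key, c in votes.items()}
    return order, rules


def classify(models, obj):
    """Step 2 (assumed rule): combine sub-classifier answers by size-weighted voting."""
    tally = Counter()
    for order, rules, weight in models:
        # A sub-classifier applies only if the object has every attribute it uses.
        if any(obj.get(a) is MISSING for a in order):
            continue
        answer = rules.get(tuple(obj[a] for a in order))
        if answer is not None:
            tally[answer] += weight
    return tally.most_common(1)[0][0] if tally else None


# Toy decision table: (attribute values, decision); None marks a missing value.
table = [
    ({"a": 1, "b": 0}, "yes"),
    ({"a": 1, "b": 1}, "yes"),
    ({"a": 0, "b": None}, "no"),
    ({"a": 0, "b": None}, "no"),
    ({"a": None, "b": 1}, "yes"),
]
subsets = decompose(table)
models = [train_lookup(rows, known) + (len(rows),) for known, rows in subsets.items()]
print(classify(models, {"a": 0, "b": 1}))
```

Because each subset contains no missing values in the attributes it uses, any classifier designed for complete data could replace the lookup table in this sketch.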
