An Incomplete Data Analysis Approach Based on the Rough Set Theory and Divide-and-Conquer Idea

Missing data are inevitable in practice, and analyzing incomplete data efficiently is important for data mining. Statistical strategies and many other methods are commonly used, but each has shortcomings. Approaches based on rough set theory have been shown to perform better, but the existing methods are still imperfect. This paper extends the valued tolerance relation in rough set theory, introduces the divide-and-conquer idea, and accordingly proposes a new incomplete data analysis approach, "RSDIDA". The approach makes fuller use of the latent knowledge and regularities implied by the data in an information system, gives a better completeness analysis of incomplete data, and greatly improves efficiency. Experimental results demonstrate its superiority, and it can be adopted as a pre-processing method in data mining.