A Novel Method of the Incomplete Web Data Recovery

To recover incomplete data on Web pages, this paper proposes a rough-set-based approach that reduces redundant attributes, discretizes continuous attributes, and fills in the incomplete data. First, discernible vectors are defined from the indiscernibility relation, and a discernible vector addition rule is used to reduce the attributes. Next, the continuous attributes are discretized using the concept of super-club data and the entropy of the information table. Finally, based on the correspondence between condition attributes and decision attributes, interval values and an interval value addition rule are defined and used to fill in the incomplete data. An illustrative example and experimental results indicate that the approach is effective and efficient.
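The attribute reduction step can be illustrated with a minimal sketch. The abstract does not give the definitions of discernible vectors or their addition rule, so the code below does not implement them. It swaps in a standard rough-set check instead: a condition attribute is treated as redundant when removing it leaves the conditional entropy of the decision attribute unchanged. The table `rows`, the attribute names `a`, `b`, `c`, `d`, and all function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from math import log2

# Hypothetical information table: a, b, c are condition attributes,
# d is the decision attribute. Not data from the paper.
rows = [
    {"a": 0, "b": 0, "c": 1, "d": "no"},
    {"a": 0, "b": 1, "c": 1, "d": "yes"},
    {"a": 1, "b": 0, "c": 0, "d": "no"},
    {"a": 1, "b": 1, "c": 0, "d": "yes"},
]

def indiscernibility_classes(table, attrs):
    """Group row indices that agree on every attribute in attrs."""
    classes = defaultdict(list)
    for i, row in enumerate(table):
        classes[tuple(row[a] for a in attrs)].append(i)
    return list(classes.values())

def conditional_entropy(table, attrs, decision):
    """Shannon entropy of the decision attribute given attrs, in bits."""
    n = len(table)
    h = 0.0
    for block in indiscernibility_classes(table, attrs):
        counts = defaultdict(int)
        for i in block:
            counts[table[i][decision]] += 1
        for c in counts.values():
            p = c / len(block)
            h -= (len(block) / n) * p * log2(p)
    return h

def is_redundant(table, attrs, candidate, decision):
    """Assumed criterion: dropping candidate does not change the entropy."""
    rest = [a for a in attrs if a != candidate]
    before = conditional_entropy(table, attrs, decision)
    after = conditional_entropy(table, rest, decision)
    return abs(before - after) < 1e-12

for attr in ["a", "b", "c"]:
    print(attr, is_redundant(rows, ["a", "b", "c"], attr, "d"))
```

In this toy table the decision depends only on `b`, so `a` and `c` are each individually dispensable while `b` is not. Entropy is used here because the abstract also mentions the entropy of the information table, but the paper's own reduction criterion may differ.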
