Enhancing Attribute Oriented Induction Of Data Mining

Data summarization is a data mining technique that condenses large volumes of data into a small amount of understandable knowledge. Attribute-Oriented Induction (AOI) is a data summarization algorithm that suffers from the overgeneralization problem. In this paper, we use an entropy measure to enhance the generalization process, feature selection, and the stop condition. Experimental results show that the proposed technique reduces the effect of the overgeneralization problem.
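
To illustrate how an entropy measure can quantify what a generalization step gives up, the following Python sketch computes the Shannon entropy of an attribute before and after its values are replaced by parent concepts. It is only a minimal sketch: the function names, the toy concept hierarchy, and the data are assumptions made for illustration, and it does not reproduce the paper's actual procedure.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def generalize(values, hierarchy):
    """Replace each value by its parent concept; values with no parent are kept."""
    return [hierarchy.get(v, v) for v in values]


# Hypothetical attribute values and concept hierarchy (illustrative only).
cities = ["Cairo", "Giza", "Paris", "Lyon"]
hierarchy = {"Cairo": "Egypt", "Giza": "Egypt", "Paris": "France", "Lyon": "France"}

before = entropy(cities)
after = entropy(generalize(cities, hierarchy))

# The drop in entropy is one possible proxy for the detail lost by generalizing.
information_loss = before - after
print(before, after, information_loss)
```

A drop of this kind could plausibly be compared across attributes to decide which one to generalize next, or checked against a threshold as a stop condition. Both uses are assumptions about how such a measure might be applied, not details taken from the abstract.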
