3DM: Domain-oriented Data-driven Data Mining

Recent developments in computing, communications, digital storage, and high-throughput data acquisition make it possible to gather and store enormous volumes of data, creating unprecedented opportunities for knowledge discovery in large-scale databases. Data mining is a useful tool for this task. It is an emerging area of computational intelligence that offers new theories, techniques, and tools for processing large volumes of data, with applications such as data analysis and decision making. Many researchers are working on efficient data mining techniques, methods, and algorithms. Unfortunately, most of them concentrate on the technical problems of developing data mining models and methods and pay little attention to the basic issues of data mining. What is data mining? What is the product of a data mining process? What are we doing in a data mining process? What rule should we obey in a data mining process? What is the relationship between the prior knowledge of domain experts and the knowledge mined from data? In this paper, we address these basic issues from the viewpoint of informatics [1]. Data is taken as a man-made format for encoding knowledge about the natural world, and data mining is taken as a process of knowledge transformation. A domain-oriented data-driven data mining (3DM) model based on a conceptual data mining model is proposed. Several data-driven data mining algorithms are also proposed to show the validity of this model: a data-driven default rule generation algorithm, a data-driven decision tree pre-pruning algorithm, and data-driven knowledge acquisition from a concept lattice.

[1]  Gregory Piatetsky-Shapiro, et al. Knowledge Discovery in Databases: An Overview, 1992, AI Mag.

[2]  Yingxu Wang, et al. On Cognitive Informatics, 2002, Proceedings First IEEE International Conference on Cognitive Informatics.

[3]  William Frawley, et al. Knowledge Discovery in Databases, 1991.

[4]  Claudio Carpineto, et al. GALOIS: An Order-Theoretic Approach to Conceptual Clustering, 1993, ICML.

[5]  Witold Pedrycz, et al. User-Driven Fuzzy Clustering: On the Road to Semantic Classification, 2005, RSFDGrC.

[6]  Andrzej Skowron, et al. A Rough Set Framework for Data Mining of Propositional Default Rules, 1996, ISMIS.

[7]  Wang Guo-yin, et al. A Self-Learning Model under Uncertain Condition, 2003.

[8]  Max Bramer, et al. Pre-pruning Classification Trees to Reduce Overfitting in Noisy Domains, 2002, IDEAL.

[9]  Catherine Blake, et al. UCI Repository of machine learning databases, 1998.

[10]  Tapio Elomaa, et al. An Analysis of Reduced Error Pruning, 2001, J. Artif. Intell. Res.

[11]  Chengqi Zhang, et al. Domain-driven in-depth pattern discovery: A practical methodology, 2005.

[12]  Yiyu Yao, et al. Interactive classification using a granule network, 2005, Fourth IEEE Conference on Cognitive Informatics, 2005. (ICCI 2005).

[13]  Guoyin Wang, et al. Initiative learning algorithm based on rough set, 2003, SPIE Defense + Commercial Sensing.

[14]  Mehran Sahami. Learning Classification Rules Using Lattices (Extended Abstract), 1995, ECML.

[15]  Laks V. S. Lakshmanan, et al. Constraint-Based Multidimensional Data Mining, 1999, Computer.

[16]  E. Mephu-Nguifo. Galois Lattice: a framework for concept learning. Design, evaluation and refinement, 1994, Proceedings Sixth International Conference on Tools with Artificial Intelligence. TAI 94.

[17]  Terry Windeatt, et al. Tree pruning for output coded ensembles, 2002, Object recognition supported by user interaction for service robots.

[18]  Donald Michie, et al. Expert systems in the micro-electronic age, 1979.

[19]  F. Liu, et al. Generating rules and reasoning under inconsistencies, 2000, 2000 26th Annual Conference of the IEEE Industrial Electronics Society. IECON 2000. 2000 IEEE International Conference on Industrial Electronics, Control and Instrumentation. 21st Century Technologies.

[20]  Vasudha Bhatnagar, et al. Incremental Classification Rules Based on Association Rules Using Formal Concept Analysis, 2005, MLDM.

[21]  Petra Perner, et al. Machine Learning and Data Mining in Pattern Recognition, 2009, Lecture Notes in Computer Science.

[22]  Setsuo Ohsuga. Knowledge Discovery as Translation, 2005, Foundations of Data Mining and Knowledge Discovery.

[23]  Yiyu Yao, et al. A Three-layered Conceptual Framework of Data Mining, 2004.

[24]  Engelbert Mephu Nguifo, et al. A Comparative Study of FCA-Based Supervised Classification Algorithms, 2004, ICFCA.

[25]  Chengqi Zhang, et al. Domain-Driven Data Mining: Methodologies and Applications, 2006, AMT.

[26]  Fabrice Guillet, et al. A User-Driven Process for Mining Association Rules, 2000, PKDD.

[27]  J. Ross Quinlan, et al. Simplifying Decision Trees, 1987, Int. J. Man Mach. Stud.

[28]  Guoyin Wang, et al. Research on System Uncertainty Measures Based on Rough Set Theory, 2006, RSKT.

[29]  Mehran Sahami, et al. Learning Classification Rules Using Lattices, 1995.

[30]  Zhengxin Chen, et al. A Systemic Framework for the Field of Data Mining and Knowledge Discovery, 2006, Sixth IEEE International Conference on Data Mining - Workshops (ICDMW'06).

[31]  Yingxu Wang. On Cognitive Informatics, 2003.

[32]  Michele Zappavigna, et al. User Driven Example-Based Training for Creating Lexical Knowledgebases, 2002.