Generalization-Based Data Mining in Object-Oriented Databases Using an Object Cube Model

Data mining is the discovery of knowledge and useful information from the large amounts of data stored in databases. With the increasing popularity of object-oriented database systems in advanced database applications, it is important to study data mining methods for object-oriented databases, because mining knowledge from such databases may improve the understanding, organization, and utilization of the data stored in them. This paper investigates three aspects of generalization-based data mining in object-oriented databases: (1) generalization of complex objects, (2) class-based generalization, and (3) extraction of different kinds of rules. An object cube model is proposed for class-based generalization, on-line analytical processing, and data mining. The study shows that (i) a set of sophisticated generalization operators can be constructed for the generalization of complex data objects, (ii) a dimension-based class generalization mechanism can be developed for object cube construction, and (iii) sophisticated rule formation methods can be developed for extracting different kinds of knowledge from data, including characteristic rules, discriminant rules, association rules, and classification rules. Furthermore, applying the discovered knowledge may substantially enhance the power and flexibility of browsing and organizing databases and of querying data and knowledge in object-oriented databases.
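To make the idea of dimension-based generalization concrete, the following is a minimal sketch, not the paper's actual operators or cube model. It assumes a hypothetical `Student` class, hand-written concept hierarchies (`CITY_TO_REGION`, `COURSE_TO_AREA`, `generalize_age`), and a cube represented as a plain counter keyed by generalized dimension values. All of these names and data are illustrative assumptions.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical concept hierarchies (assumptions for illustration only):
# each maps a low-level value to a higher-level concept.
CITY_TO_REGION = {"Vancouver": "Canada", "Toronto": "Canada", "Seattle": "USA"}
COURSE_TO_AREA = {"databases": "systems", "operating systems": "systems",
                  "logic": "theory"}


@dataclass(frozen=True)
class Student:
    """Hypothetical object class with atomic and set-valued attributes."""
    name: str
    age: int
    city: str
    courses: frozenset


def generalize_age(age):
    """Generalize a numeric value to a coarse range (assumed ranges)."""
    if age < 30:
        return "young"
    if age < 60:
        return "middle"
    return "senior"


def generalize_set(values, hierarchy):
    """Generalize a set-valued attribute: map each element to its parent
    concept and merge duplicates; unknown values climb to 'any'."""
    return frozenset(hierarchy.get(v, "any") for v in values)


def build_cube(objects):
    """Count objects per cell, where a cell is (age range, region)."""
    cube = Counter()
    for obj in objects:
        cell = (generalize_age(obj.age), CITY_TO_REGION.get(obj.city, "any"))
        cube[cell] += 1
    return cube


def roll_up(cube, keep):
    """Aggregate away every dimension whose index is not listed in `keep`."""
    rolled = Counter()
    for cell, count in cube.items():
        rolled[tuple(cell[i] for i in keep)] += count
    return rolled


students = [
    Student("a", 22, "Vancouver", frozenset({"databases", "logic"})),
    Student("b", 25, "Toronto", frozenset({"operating systems"})),
    Student("c", 41, "Vancouver", frozenset({"logic"})),
    Student("d", 23, "Seattle", frozenset({"databases"})),
]

cube = build_cube(students)
by_region = roll_up(cube, keep=(1,))
```

In this sketch, the cell counts in `cube` could serve as the raw material for a characteristic rule (for example, the share of students that fall in each generalized cell), while `roll_up` mimics an OLAP-style aggregation along one dimension. A real object cube would also have to handle inheritance, methods, and object references, which are left out here.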
