Data Mining and Knowledge Discovery

This chapter gives a concise introduction to data mining and knowledge discovery. First, we introduce the necessary nomenclature and definitions, discuss the background of the area, and elaborate on the core technologies of knowledge discovery. We then discuss several representative examples of knowledge discovery systems.
