Data mining: concepts and techniques by Jiawei Han and Micheline Kamber

Mining information from data: A present-day gold rush. Data mining is a multidisciplinary field that supports knowledge workers trying to extract information in our “data rich, information poor” environment. Its name stems from the idea of mining knowledge from large amounts of data. Its tools help us discover relevant information through a wide range of data analysis techniques, and any method used to extract patterns from a given data source is considered a data mining technique. Han and Kamber’s book provides more than a good starting point for those interested in this eclectic research field. It surveys techniques for the main tasks data miners have to perform. Most existing data mining texts emphasize the managerial and marketing aspects of adopting this technology in modern enterprises. In contrast, Han and Kamber’s textbook focuses on issues such as algorithmic efficiency and scalability from a database perspective.

[1]  Sung-Hyon Myaeng,et al.  A practical hypertext catergorization method using links and incrementally available class information , 2000, SIGIR '00.

[2]  Heikki Mannila,et al.  The power of sampling in knowledge discovery , 1994, PODS '94.

[3]  Cheng Yang,et al.  Efficient discovery of error-tolerant frequent itemsets in high dimensions , 2001, KDD '01.

[4]  Jorma Rissanen,et al.  MDL-Based Decision Tree Pruning , 1995, KDD.

[5]  W. Loh,et al.  SPLIT SELECTION METHODS FOR CLASSIFICATION TREES , 1997 .

[6]  Ron Kohavi,et al.  Mining e-commerce data: the good, the bad, and the ugly , 2001, KDD '01.

[7]  Heikki Mannila,et al.  Theoretical frameworks for data mining , 2000, SKDD.

[8]  Christopher Dean,et al.  Quakefinder: A Scalable Data Mining System for Detecting Earthquakes from Space , 1996, KDD.

[9]  Jianyong Wang,et al.  Mining sequential patterns by pattern-growth: the PrefixSpan approach , 2004, IEEE Transactions on Knowledge and Data Engineering.

[10]  Pat Langley,et al.  Static Versus Dynamic Sampling for Data Mining , 1996, KDD.

[11]  Jiawei Han,et al.  Mining recurrent items in multimedia with progressive resolution refinement , 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).

[12]  Thomas G. Dietterich What is machine learning? , 2020, Archives of Disease in Childhood.

[13]  Jack Sklansky,et al.  On Automatic Feature Selection , 1988, Int. J. Pattern Recognit. Artif. Intell..

[14]  Anil K. Jain,et al.  Algorithms for Clustering Data , 1988 .

[15]  Kenneth A. Ross,et al.  Fast Computation of Sparse Datacubes , 1997, VLDB.

[16]  Jiawei Han,et al.  Classifying large data sets using SVMs with hierarchical clusters , 2003, KDD '03.

[17]  Jude W. Shavlik,et al.  Extracting refined rules from knowledge-based neural networks , 2004, Machine Learning.

[18]  Richard M. Karp,et al.  A simple algorithm for finding frequent elements in streams and bags , 2003, TODS.

[19]  Oren Etzioni,et al.  Adaptive Web Sites: Conceptual Cluster Mining , 1999, IJCAI.

[20]  Jiong Yang,et al.  STING: A Statistical Information Grid Approach to Spatial Data Mining , 1997, VLDB.

[21]  Douglas B. Terry,et al.  Using collaborative filtering to weave an information tapestry , 1992, CACM.

[22]  Christos Faloutsos,et al.  Efficient retrieval of similar time sequences under time warping , 1998, Proceedings 14th International Conference on Data Engineering.

[23]  Srinivasan Parthasarathy,et al.  Parallel Algorithms for Discovery of Association Rules , 1997, Data Mining and Knowledge Discovery.

[24]  Bernard Widrow,et al.  Neural networks: applications in industry, business and science , 1994, CACM.

[25]  Jiawei Han,et al.  Data-Driven Discovery of Quantitative Rules in Relational Databases , 1993, IEEE Trans. Knowl. Data Eng..

[26]  Michel Manago,et al.  Induction of Decision Trees from Complex Structured Data , 1991, Knowledge Discovery in Databases.

[27]  Andrzej Skowron,et al.  The Discernibility Matrices and Functions in Information Systems , 1992, Intelligent Decision Support.

[28]  Daniel A. Keim,et al.  An Efficient Approach to Clustering in Large Multimedia Databases with Noise , 1998, KDD.

[29]  Jian Pei,et al.  Can we push more constraints into frequent pattern mining? , 2000, KDD '00.

[30]  Thomas C. Redman,et al.  Data Quality: The Field Guide , 2001 .

[31]  David E. Goldberg,et al.  Genetic Algorithms in Search Optimization and Machine Learning , 1988 .

[32]  Charu C. Aggarwal,et al.  A Tree Projection Algorithm for Generation of Frequent Item Sets , 2001, J. Parallel Distributed Comput..

[33]  Ron Kohavi,et al.  Feature Selection for Knowledge Discovery and Data Mining , 1998 .

[34]  Ramakrishnan Srikant,et al.  Mining sequential patterns , 1995, Proceedings of the Eleventh International Conference on Data Engineering.

[35]  Jiong Yang,et al.  CLUSEQ: efficient and effective sequence clustering , 2003, Proceedings 19th International Conference on Data Engineering (Cat. No.03CH37405).

[36]  Jeffrey D. Ullman,et al.  Implementing data cubes efficiently , 1996, SIGMOD '96.

[37]  D. Krane,et al.  Fundamental Concepts of Bioinformatics , 2002 .

[38]  Peter L. Brooks,et al.  Visualizing data , 1997 .

[39]  John A. Major,et al.  Selecting among rules induced from a hurricane database , 1993, Journal of Intelligent Information Systems.

[40]  Ashish Gupta,et al.  Materialized views: techniques, implementations, and applications , 1999 .

[41]  Christos Faloutsos,et al.  Online data mining for co-evolving time sequences , 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).

[42]  Valdis E. Krebs,et al.  Mapping Networks of Terrorist Cells , 2001 .

[43]  Jeffrey D. Uuman Principles of database and knowledge- base systems , 1989 .

[44]  George Kollios,et al.  Mining, indexing, and querying historical spatiotemporal data , 2004, KDD.

[45]  G. Reinsel,et al.  Introduction to Mathematical Statistics (4th ed.). , 1980 .

[46]  Bei Yu,et al.  A cross-collection mixture model for comparative text mining , 2004, KDD.

[47]  V. Barnett,et al.  Applied Linear Statistical Models , 1975 .

[48]  Jiawei Han,et al.  Efficient Polygon Amalgamation Methods for Spatial OLAP and Spatial Data Mining , 1999, SSD.

[49]  Agnès Voisard,et al.  Spatial Databases: With Application to GIS , 2001 .

[50]  Goetz Graefe,et al.  Multi-table joins through bitmapped join indices , 1995, SGMD.

[51]  Karl Rihaczek,et al.  1. WHAT IS DATA MINING? , 2019, Data Mining for the Social Sciences.

[52]  Bernhard Schölkopf,et al.  Shrinking the Tube: A New Support Vector Regression Algorithm , 1998, NIPS.

[53]  Anthony K. H. Tung,et al.  Carpenter: finding closed patterns in long biological datasets , 2003, KDD '03.

[54]  Clement T. Yu,et al.  Priniples of Database Query Processing for Advanced Applications , 1997 .

[55]  Bharat Bhargava,et al.  Advanced Database Systems , 1993, Lecture Notes in Computer Science.

[56]  Sridhar Ramaswamy,et al.  Cyclic association rules , 1998, Proceedings 14th International Conference on Data Engineering.

[57]  M. Pagano,et al.  Survival analysis. , 1996, Nutrition.

[58]  Leo Katz,et al.  A new status index derived from sociometric analysis , 1953 .

[59]  Yann LeCun,et al.  Optimal Brain Damage , 1989, NIPS.

[60]  Wojciech Ziarko,et al.  The Discovery, Analysis, and Representation of Data Dependencies in Databases , 1991, Knowledge Discovery in Databases.

[61]  Philip S. Yu,et al.  Mining asynchronous periodic patterns in time series data , 2000, KDD '00.

[62]  Laks V. S. Lakshmanan,et al.  Optimization of constrained frequent set queries with 2-variable constraints , 1999, SIGMOD '99.

[63]  J. Ross Quinlan,et al.  Unknown Attribute Values in Induction , 1989, ML.

[64]  Heikki Mannila,et al.  Efficient Algorithms for Discovering Association Rules , 1994, KDD Workshop.

[65]  Gregory Piatetsky-Shapiro,et al.  Advances in Knowledge Discovery and Data Mining , 2004, Lecture Notes in Computer Science.

[66]  Dimitrios Gunopulos,et al.  Discovering similar multidimensional trajectories , 2002, Proceedings 18th International Conference on Data Engineering.

[67]  Umeshwar Dayal,et al.  FreeSpan: frequent pattern-projected sequential pattern mining , 2000, KDD '00.

[68]  Jiawei Han,et al.  Towards on-line analytical mining in large databases , 1998, SGMD.

[69]  Sudipto Guha,et al.  Clustering Data Streams , 2000, FOCS.

[70]  I. Kononenko,et al.  Attribute Selection for Modeling , 1997 .

[71]  Simon Fraser MULTI-DIMENSIONAL SEQUENTIAL PATTERN MINING , 2001 .

[72]  Dennis Shasha,et al.  StatStream: Statistical Monitoring of Thousands of Data Streams in Real Time , 2002, VLDB.

[73]  Jiawei Han,et al.  Efficient mining of partial periodic patterns in time series database , 1999, Proceedings 15th International Conference on Data Engineering (Cat. No.99CB36337).

[74]  Phyllis Koton,et al.  Reasoning about Evidence in Causal Explanations , 1988, AAAI.

[75]  Jaideep Srivastava,et al.  Web Mining — Concepts, Applications, and Research Directions , 2004 .

[76]  Ramakrishnan Srikant,et al.  Mining Sequential Patterns: Generalizations and Performance Improvements , 1996, EDBT.

[77]  Edward Rolf Tufte,et al.  The visual display of quantitative information , 1985 .

[78]  Jiawei Han,et al.  Object-Based Selective Materialization for Efficient Implementation of Spatial Data Cubes , 2000, IEEE Trans. Knowl. Data Eng..

[79]  Kyuseok Shim,et al.  SPIRIT: Sequential Pattern Mining with Regular Expression Constraints , 1999, VLDB.

[80]  Ivan Bratko,et al.  Machine Learning and Data Mining; Methods and Applications , 1998 .

[81]  Giri Kumar Tayi,et al.  Enhancing data quality in data warehouse environments , 1999, CACM.

[82]  W. Scott Spangler,et al.  Learning Useful Rules from Inconclusive Data , 1991, Knowledge Discovery in Databases.

[83]  Raymond T. Ng,et al.  A Unified Notion of Outliers: Properties and Computation , 1997, KDD.

[84]  Jiawei Han,et al.  Discovering Web access patterns and trends by applying OLAP and data mining technology on Web logs , 1998, Proceedings IEEE International Forum on Research and Technology Advances in Digital Libraries -ADL'98-.

[85]  Hongjun Lu,et al.  On computing, storing and querying frequent patterns , 2003, KDD '03.

[86]  Veda C. Storey,et al.  A Framework for Analysis of Data Quality Research , 1995, IEEE Trans. Knowl. Data Eng..

[87]  Martin Stacey,et al.  Scientific Discovery: Computational Explorations of the Creative Processes , 1988 .

[88]  Jiawei Han,et al.  MultiMediaMiner: a system prototype for multimedia data mining , 1998, SIGMOD '98.

[89]  Wai Lam,et al.  Bayesian Network Refinement Via Machine Learning Approach , 1998, IEEE Trans. Pattern Anal. Mach. Intell..

[90]  Saso Dzeroski,et al.  Inductive Logic Programming: Techniques and Applications , 1993 .

[91]  George H. John Enhancements to the data mining process , 1997 .

[92]  R. Michalski,et al.  Learning from Observation: Conceptual Clustering , 1983 .

[93]  C. G. Hilborn,et al.  The Condensed Nearest Neighbor Rule , 1967 .

[94]  Rob Mattison,et al.  Data Warehousing and Data Mining for Telecommunications , 1997 .

[95]  Jiawei Han,et al.  CloseGraph: mining closed frequent graph patterns , 2003, KDD '03.

[96]  Kotagiri Ramamohanarao,et al.  Making Use of the Most Expressive Jumping Emerging Patterns for Classification , 2001, Knowledge and Information Systems.

[97]  Jeffrey Scott Vitter,et al.  Random sampling with a reservoir , 1985, TOMS.

[98]  J. Ross Quinlan,et al.  Bagging, Boosting, and C4.5 , 1996, AAAI/IAAI, Vol. 1.

[99]  Jiawei Han,et al.  Selective Materialization: An Efficient Method for Spatial Data Cube Construction , 1998, PAKDD.

[100]  Jian Pei,et al.  Efficient computation of Iceberg cubes with complex measures , 2001, SIGMOD '01.

[101]  Hans-Peter Kriegel,et al.  VisDB: database exploration using multidimensional visualization , 1994, IEEE Computer Graphics and Applications.

[102]  Jeffrey C. Schlimmer Learning and Representation Change , 1987, AAAI.

[103]  Jian Pei,et al.  Mining frequent patterns without candidate generation , 2000, SIGMOD '00.

[104]  John Mingers,et al.  An Empirical Comparison of Pruning Methods for Decision Tree Induction , 1989, Machine Learning.

[105]  Hannu Toivonen,et al.  Sampling Large Databases for Association Rules , 1996, VLDB.

[106]  Matthew Richardson,et al.  Mining knowledge-sharing sites for viral marketing , 2002, KDD.

[107]  Ron Kohavi,et al.  A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection , 1995, IJCAI.

[108]  Dimitris Meretakis,et al.  Extending naïve Bayes classifiers using long itemsets , 1999, KDD '99.

[109]  Sergio A. Alvarez,et al.  Efficient Adaptive-Support Association Rule Mining for Recommender Systems , 2004, Data Mining and Knowledge Discovery.

[110]  Nicolas Pasquier,et al.  Discovering Frequent Closed Itemsets for Association Rules , 1999, ICDT.

[111]  Casimir A. Kulikowski,et al.  Computer Systems That Learn: Classification and Prediction Methods from Statistics, Neural Nets, Machine Learning and Expert Systems , 1990 .

[112]  J. Ross Quinlan,et al.  Induction of Decision Trees , 1986, Machine Learning.

[113]  Hiroshi Motoda,et al.  Feature Extraction, Construction and Selection: A Data Mining Perspective , 1998 .

[114]  Richard Y. Wang,et al.  Anchoring data quality dimensions in ontological foundations , 1996, CACM.

[115]  Huan Liu,et al.  Subspace clustering for high dimensional data: a review , 2004, SKDD.

[116]  Alberto Maria Segre,et al.  Programs for Machine Learning , 1994 .

[117]  Shashi Shekhar,et al.  Spatial Databases - Accomplishments and Research Needs , 1999, IEEE Trans. Knowl. Data Eng..

[118]  Lawrence B. Holder,et al.  Substucture Discovery in the SUBDUE System , 1994, KDD Workshop.

[119]  Thorsten Joachims,et al.  A Statistical Learning Model of Text Classification for Support Vector Machines. , 2001, SIGIR 2002.

[120]  Aidong Zhang,et al.  WaveCluster: A Multi-Resolution Clustering Approach for Very Large Spatial Databases , 1998, VLDB.

[121]  Mathias Kirsten,et al.  Relational Distance-Based Clustering , 1998, ILP.

[122]  C. J. V. Rijsbergen,et al.  Rough Sets, Fuzzy Sets and Knowledge Discovery , 1994, Workshops in Computing.

[123]  Joseph M. Hellerstein,et al.  Potter's Wheel: An Interactive Data Cleaning System , 2001, VLDB.

[124]  Peter J. Haas,et al.  Interactive data Analysis: The Control Project , 1999, Computer.

[125]  S. Muthukrishnan,et al.  Mining Deviants in a Time Series Database , 1999, VLDB.

[126]  Dorian Pyle,et al.  Data Preparation for Data Mining , 1999 .

[127]  Fabrizio Sebastiani,et al.  Machine learning in automated text categorization , 2001, CSUR.

[128]  Jiawei Han,et al.  Discovery of Spatial Association Rules in Geographic Information Databases , 1995, SSD.

[129]  B. Gates Business @ the Speed of Thought , 1999 .

[130]  Ke Wang,et al.  Mining frequent item sets by opportunistic projection , 2002, KDD.

[131]  Jiawei Han,et al.  MM-Cubing: computing Iceberg cubes by factorizing the lattice space , 2004, Proceedings. 16th International Conference on Scientific and Statistical Database Management, 2004..

[132]  Eric R. Ziegel,et al.  An Introduction to Generalized Linear Models , 2002, Technometrics.

[133]  J A Swets,et al.  Measuring the accuracy of diagnostic systems. , 1988, Science.

[134]  Michael Stonebraker,et al.  Monitoring Streams - A New Class of Data Management Applications , 2002, VLDB.

[135]  Avi Pfeffer,et al.  SPOOK: A system for probabilistic object-oriented knowledge representation , 1999, UAI.

[136]  Jiawei Han,et al.  CoMine: efficient mining of correlated patterns , 2003, Third IEEE International Conference on Data Mining.

[137]  Xifeng Yan,et al.  CloSpan: Mining Closed Sequential Patterns in Large Datasets , 2003, SDM.

[138]  Ron Kohavi,et al.  Wrappers for Feature Subset Selection , 1997, Artif. Intell..

[139]  Vipin Kumar,et al.  Scalable parallel data mining for association rules , 1997, SIGMOD '97.

[140]  Wei-Yin Loh,et al.  A Comparison of Prediction Accuracy, Complexity, and Training Time of Thirty-Three Old and New Classification Algorithms , 2000, Machine Learning.

[141]  Vladimir Vapnik,et al.  Chervonenkis: On the uniform convergence of relative frequencies of events to their probabilities , 1971 .

[142]  J. Ross Quinlan,et al.  C4.5: Programs for Machine Learning , 1992 .

[143]  Le Gruenwald,et al.  A survey of data mining and knowledge discovery software tools , 1999, SKDD.

[144]  Sunita Sarawagi,et al.  i3: Intelligent, Interactive Investigaton of OLAP data cubes , 2000, SIGMOD Conference.

[145]  Robert A. Jacobs,et al.  Increased rates of convergence through learning rate adaptation , 1987, Neural Networks.

[146]  Takashi Washio,et al.  An Apriori-Based Algorithm for Mining Frequent Substructures from Graph Data , 2000, PKDD.

[147]  Janusz Zalewski,et al.  Rough sets: Theoretical aspects of reasoning about data , 1996 .

[148]  I. Bratko,et al.  Learning decision rules in noisy domains , 1987 .

[149]  Mathias Kirsten,et al.  Extending K-Means Clustering to First-Order Representations , 2000, ILP.

[150]  Wei Wang,et al.  Efficient mining of frequent subgraphs in the presence of isomorphism , 2003, Third IEEE International Conference on Data Mining.

[151]  Jennifer Widom,et al.  Clustering association rules , 1997, Proceedings 13th International Conference on Data Engineering.

[152]  Dan Gusfield,et al.  Algorithms on Strings, Trees, and Sequences - Computer Science and Computational Biology , 1997 .

[153]  S. Hyakin,et al.  Neural Networks: A Comprehensive Foundation , 1994 .

[154]  Christos Faloutsos,et al.  Ratio Rules: A New Paradigm for Fast, Quantifiable Data Mining , 1998, VLDB.

[155]  R. Mike Cameron-Jones,et al.  FOIL: A Midterm Report , 1993, ECML.

[156]  Jiawei Han,et al.  BIDE: efficient mining of frequent closed sequences , 2004, Proceedings. 20th International Conference on Data Engineering.

[157]  Andrew W. Moore,et al.  Tractable group detection on large link data sets , 2003, Third IEEE International Conference on Data Mining.

[158]  Jiawei Han,et al.  Generalization-Based Data Mining in Object-Oriented Databases Using an Object Cube Model , 1998, Data Knowl. Eng..

[159]  Lotfi A. Zadeh,et al.  Commonsense Knowledge Representation Based on Fuzzy Logic , 1983, Computer.

[160]  Chen Wang,et al.  Scalable mining of large disk-based graph databases , 2004, KDD.

[161]  Leonid Khachiyan,et al.  Cubegrades: Generalizing Association Rules , 2002, Data Mining and Knowledge Discovery.

[162]  Ben Taskar,et al.  Probabilistic Classification and Clustering in Relational Data , 2001, IJCAI.

[163]  Alberto O. Mendelzon,et al.  Similarity-based queries for time series data , 1997, SIGMOD '97.

[164]  Kyuseok Shim,et al.  PUBLIC: A Decision Tree Classifier that Integrates Building and Pruning , 1998, Data Mining and Knowledge Discovery.

[165]  Jon M. Kleinberg,et al.  The Web as a Graph: Measurements, Models, and Methods , 1999, COCOON.

[166]  Rajeev Motwani,et al.  Scalable Techniques for Mining Causal Structures , 1998, Data Mining and Knowledge Discovery.

[167]  Dimitrios Gunopulos,et al.  On-Line Discovery of Dense Areas in Spatio-temporal Databases , 2003, SSTD.

[168]  Jennifer Widom,et al.  Research problems in data warehousing , 1995, CIKM '95.

[169]  Jiawei Han,et al.  GeoMiner: a system prototype for spatial data mining , 1997, SIGMOD '97.

[170]  H.M. Wechsler,et al.  Digital image processing, 2nd ed. , 1981, Proceedings of the IEEE.

[171]  Yoav Freund,et al.  A decision-theoretic generalization of on-line learning and an application to boosting , 1995, EuroCOLT.

[172]  Heikki Mannila,et al.  Methods and Problems in Data Mining , 1997, ICDT.

[173]  M. Stone Cross‐Validatory Choice and Assessment of Statistical Predictions , 1976 .

[174]  Jiawei Han,et al.  CPAR: Classification based on Predictive Association Rules , 2003, SDM.

[175]  David Konopnicki,et al.  W3QS: A Query System for the World-Wide Web , 1995, VLDB.

[176]  Vincent Kanade,et al.  Clustering Algorithms , 2021, Wireless RF Energy Transfer in the Massive IoT Era.

[177]  Sunita Sarawagi,et al.  Integrating Association Rule Mining with Relational Database Systems: Alternatives and Implications , 1998, SIGMOD '98.

[178]  Teuvo Kohonen,et al.  Self-organized formation of topologically correct feature maps , 2004, Biological Cybernetics.

[179]  Kristian G. Olesen,et al.  Practical Issues in Modeling Large Diagnostic Systems with Multiply Sectioned Bayesian Networks , 2000, Int. J. Pattern Recognit. Artif. Intell..

[180]  Edward Omiecinski,et al.  Alternative Interest Measures for Mining Associations in Databases , 2003, IEEE Trans. Knowl. Data Eng..

[181]  Tariq Samad,et al.  Designing Application-Specific Neural Networks Using the Genetic Algorithm , 1989, NIPS.

[182]  Willi Klösgen,et al.  A Support System for Interpreting Statistical Data , 1991, Knowledge Discovery in Databases.

[183]  Jian Pei,et al.  CLOSET: An Efficient Algorithm for Mining Frequent Closed Itemsets , 2000, ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery.

[184]  Raymond J. Mooney,et al.  Content-boosted collaborative filtering for improved recommendations , 2002, AAAI/IAAI.

[185]  Laks V. S. Lakshmanan,et al.  Mining frequent itemsets with convertible constraints , 2001, Proceedings 17th International Conference on Data Engineering.

[186]  Joshua Zhexue Huang,et al.  Extensions to the k-Means Algorithm for Clustering Large Data Sets with Categorical Values , 1998, Data Mining and Knowledge Discovery.

[187]  J. Snoeyink,et al.  Mining Spatial Motifs from Protein Structure Graphs , 2003 .

[188]  Jie Wu,et al.  Small Worlds: The Dynamics of Networks between Order and Randomness , 2003 .

[189]  Douglas H. Fisher,et al.  A Case Study of Incremental Concept Induction , 1986, AAAI.

[190]  Stephen Muggleton,et al.  Efficient Induction of Logic Programs , 1990, ALT.

[191]  Ramakrishnan Srikant,et al.  Mining quantitative association rules in large relational tables , 1996, SIGMOD '96.

[192]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems , 1988 .

[193]  David Loshin Enterprise knowledge management: the data quality approach , 2000 .

[194]  Jon M. Kleinberg,et al.  Applications of linear algebra in information retrieval and hypertext analysis , 1999, PODS '99.

[195]  Philip S. Yu,et al.  Mining concept-drifting data streams using ensemble classifiers , 2003, KDD '03.

[196]  Patrick E. O'Neil,et al.  Improved query performance with variant indexes , 1997, SIGMOD '97.

[197]  David Heckerman,et al.  Bayesian Networks for Knowledge Discovery , 1996, Advances in Knowledge Discovery and Data Mining.

[198]  Jiawei Han,et al.  Generalization and decision tree induction: efficient classification in data mining , 1997, Proceedings Seventh International Workshop on Research Issues in Data Engineering. High Performance Database Management for Large-Scale Applications.

[199]  Howard J. Hamilton,et al.  Knowledge discovery and measures of interest , 2001 .

[200]  Christopher K. Riesbeck,et al.  Inside Case-Based Reasoning , 1989 .

[201]  Renée J. Miller,et al.  Association rules over interval data , 1997, SIGMOD '97.

[202]  S. Muthukrishnan,et al.  Data streams: algorithms and applications , 2005, SODA '03.

[203]  Stephen R. Gardner Building the data warehouse , 1998, CACM.

[204]  Paul S. Bradley,et al.  Compressed data cubes for OLAP aggregate query approximation on continuous dimensions , 1999, KDD '99.

[205]  Jennifer Widom,et al.  Database Systems: The Complete Book , 2001 .

[206]  Jiawei Han,et al.  DBMiner: A System for Mining Knowledge in Large Relational Databases , 1996, KDD.

[207]  G. V. Kass An Exploratory Technique for Investigating Large Quantities of Categorical Data , 1980 .

[208]  Federico Girosi,et al.  An improved training algorithm for support vector machines , 1997, Neural Networks for Signal Processing VII. Proceedings of the 1997 IEEE Signal Processing Society Workshop.

[209]  Heikki Mannila,et al.  Finding interesting rules from large sets of discovered association rules , 1994, CIKM '94.

[210]  Kyuseok Shim,et al.  WALRUS: a similarity retrieval algorithm for image databases , 1999, IEEE Transactions on Knowledge and Data Engineering.

[211]  Yasuhiko Morimoto,et al.  Data mining using two-dimensional optimized association rules: scheme, algorithms, and visualization , 1996, SIGMOD '96.

[212]  Mohammed J. Zaki Efficient enumeration of frequent sequences , 1998, CIKM '98.

[213]  George Karypis,et al.  Frequent subgraph discovery , 2001, Proceedings 2001 IEEE International Conference on Data Mining.

[214]  Marvin Minsky,et al.  Perceptrons: An Introduction to Computational Geometry , 1969 .

[215]  Ehud Gudes,et al.  Computing frequent graph patterns from semistructured data , 2002, 2002 IEEE International Conference on Data Mining, 2002. Proceedings..

[216]  Mohammed J. Zaki Scalable Algorithms for Association Mining , 2000, IEEE Trans. Knowl. Data Eng..

[217]  Hongjun Lu,et al.  NeuroRule: A Connectionist Approach to Data Mining , 1995, VLDB.

[218]  Geoffrey E. Hinton,et al.  Learning internal representations by error propagation , 1986 .

[219]  Yasuhiko Morimoto,et al.  Computing Optimized Rectilinear Regions for Association Rules , 1997, KDD.

[220]  Jiawei Han,et al.  Geographic Data Mining and Knowledge Discovery , 2001 .

[221]  Surajit Chaudhuri,et al.  An overview of data warehousing and OLAP technology , 1997, SGMD.

[222]  Wei-Ying Ma,et al.  Locality preserving indexing for document representation , 2004, SIGIR '04.

[223]  Takashi Washio,et al.  State of the art of graph-based data mining , 2003, SKDD.

[224]  Huan Liu,et al.  Discretization: An Enabling Technique , 2002, Data Mining and Knowledge Discovery.

[225]  J. Neter,et al.  Applied Linear Statistical Models (3rd ed.). , 1992 .

[226]  F. Ramsey,et al.  The statistical sleuth : a course in methods of data analysis , 2002 .

[227]  Thomas C. Redman,et al.  Data Quality Management and Technology , 1992 .

[228]  Chris Clifton,et al.  Privacy-preserving k-means clustering over vertically partitioned data , 2003, KDD '03.

[229]  Jack E. Olson,et al.  Data Quality: The Accuracy Dimension , 2003 .

[230]  Peter Norvig,et al.  Artificial Intelligence: A Modern Approach , 1995 .

[231]  Sridhar Ramaswamy,et al.  On the Discovery of Interesting Patterns in Association Rules , 1998, VLDB.

[232]  Zbigniew Michalewicz,et al.  Genetic Algorithms + Data Structures = Evolution Programs , 1996, Springer Berlin Heidelberg.

[233]  Ramakrishnan Srikant,et al.  Mining generalized association rules , 1995, Future Gener. Comput. Syst..

[234]  D. Ruppert The Elements of Statistical Learning: Data Mining, Inference, and Prediction , 2004 .

[235]  Ian H. Witten,et al.  Managing Gigabytes: Compressing and Indexing Documents and Images , 1999 .

[236]  Laks V. S. Lakshmanan,et al.  QC-trees: an efficient summary structure for semantic OLAP , 2003, SIGMOD '03.

[237]  Michael Ian Shamos,et al.  Computational geometry: an introduction , 1985 .

[238]  Mohammed J. Zaki,et al.  PlanMine: Sequence Mining for Plan Failures , 1998, KDD.

[239]  Tong Zhang,et al.  Text Mining: Predictive Methods for Analyzing Unstructured Information , 2004 .

[240]  George H. John Behind-the-scenes data mining: a report on the KDD-98 panel , 1999, SKDD.

[241]  Jiong Yang,et al.  SPIN: mining maximal frequent subgraphs from graph databases , 2004, KDD.

[242]  C. Heckler Applied Discriminant Analysis , 1995 .

[243]  Tian Zhang,et al.  BIRCH: an efficient data clustering method for very large databases , 1996, SIGMOD '96.

[244]  Andreas D. Baxevanis,et al.  Bioinformatics - a practical guide to the analysis of genes and proteins , 2001, Methods of biochemical analysis.

[245]  Kenneth A. Ross,et al.  Complex Aggregation at Multiple Granularities , 1998, EDBT.

[246]  R. Nakano,et al.  Medical diagnostic expert system based on PDP model , 1988, IEEE 1988 International Conference on Neural Networks.

[247]  Jian Pei,et al.  CMAR: accurate and efficient classification based on multiple class-association rules , 2001, Proceedings 2001 IEEE International Conference on Data Mining.

[248]  Giulia Pagallo,et al.  Learning DNF by Decision Trees , 1989, IJCAI.

[249]  Alberto O. Mendelzon,et al.  Querying the World Wide Web , 1997, International Journal on Digital Libraries.

[250]  Gösta Grahne,et al.  Efficiently Using Prefix-trees in Mining Frequent Itemsets , 2003, FIMI.

[251]  Jorma Rissanen,et al.  SLIQ: A Fast Scalable Classifier for Data Mining , 1996, EDBT.

[252]  Tom M. Mitchell,et al.  Generalization as Search , 2002 .

[253]  Jiawei Han,et al.  Maintenance of discovered association rules in large databases: an incremental updating technique , 1996, Proceedings of the Twelfth International Conference on Data Engineering.

[254]  Jiawei Han,et al.  An Efficient Two-Step Method for Classification of Spatial Data , 1998 .

[255]  Edward R. Tufte,et al.  Envisioning Information , 1990 .

[256]  Myke Gluck,et al.  Visual Explanations: Images and Quantities, Evidence and Narrative , 1997, Inf. Process. Manag..

[257]  Hans-Peter Kriegel,et al.  Visual classification: an interactive approach to decision tree construction , 1999, KDD '99.

[258]  Michael McGill,et al.  Introduction to Modern Information Retrieval , 1983 .

[259]  Erhard Rahm,et al.  Data Cleaning: Problems and Current Approaches , 2000, IEEE Data Eng. Bull..

[260]  Sreerama K. Murthy,et al.  Automatic Construction of Decision Trees from Data: A Multi-Disciplinary Survey , 1998, Data Mining and Knowledge Discovery.

[261]  Sunita Sarawagi,et al.  Intelligent Rollups in Multidimensional OLAP Data , 2001, VLDB.

[262]  Michael Stonebraker,et al.  DBMS Research at a Crossroads: The Vienna Update , 1993, VLDB.

[263]  John Riedl,et al.  Item-based collaborative filtering recommendation algorithms , 2001, WWW '01.

[264]  Nimrod Megiddo,et al.  Discovery-Driven Exploration of OLAP Data Cubes , 1998, EDBT.

[265]  M. A. Wincek Applied Statistical Time Series Analysis , 1990 .

[266]  Rakesh Agarwal,et al.  Fast Algorithms for Mining Association Rules , 1994, VLDB 1994.

[267]  Joseph M. Hellerstein,et al.  Potters Wheel: An interactive framework for data cleaning , 2000 .

[268]  Todd L. Heberlein,et al.  Network intrusion detection , 1994, IEEE Network.

[269]  Sudipto Guha,et al.  ROCK: a robust clustering algorithm for categorical attributes , 1999, Proceedings 15th International Conference on Data Engineering (Cat. No.99CB36337).

[270]  Douglas B. Terry,et al.  Continuous queries over append-only databases , 1992, SIGMOD '92.

[271]  W. Loh,et al.  Tree-Structured Classification via Generalized Discriminant Analysis. , 1988 .

[272]  Isabelle Guyon,et al.  Discovering Informative Patterns and Data Cleaning , 1996, Advances in Knowledge Discovery and Data Mining.

[273]  Mohammed J. Zaki Efficiently mining frequent trees in a forest , 2002, KDD.

[274]  John Scott Social Network Analysis , 1988 .

[275]  Qiming Chen,et al.  PrefixSpan,: mining sequential patterns efficiently by prefix-projected pattern growth , 2001, Proceedings 17th International Conference on Data Engineering.

[276]  Raymond T. Ng,et al.  Finding Aggregate Proximity Relationships and Commonalities in Spatial Data Mining , 1996, IEEE Trans. Knowl. Data Eng..

[277]  Jiawei Han,et al.  Meta-Rule-Guided Mining of Association Rules in Relational Databases , 1995, KDOOD/TDOOD.

[278]  Sebastian Thrun,et al.  Text Classification from Labeled and Unlabeled Documents using EM , 2000, Machine Learning.

[279]  Saul Greenberg,et al.  How people revisit web pages: empirical findings and implications for the design of history systems , 1997, Int. J. Hum. Comput. Stud..

[280]  Jiawei Han,et al.  Star-Cubing: Computing Iceberg Cubes by Top-Down and Bottom-Up Integration , 2003, Very Large Data Bases Conference.

[281]  Madhu Sudan,et al.  A statistical perspective on data mining , 1997, Future Gener. Comput. Syst..

[282]  Raghu Ramakrishnan,et al.  Bottom-up computation of sparse and Iceberg CUBE , 1999 .

[283]  Ralph Kimball,et al.  The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling , 1996 .

[284]  Stanley Milgram The Small World Problem , 1967 .

[285]  Anthony K. H. Tung,et al.  Constraint-based clustering in large databases , 2001, ICDT.

[286]  John Riedl,et al.  An algorithmic framework for performing collaborative filtering , 1999, SIGIR '99.

[287]  Kathryn B. Laskey,et al.  Network Fragments: Representing Knowledge for Constructing Probabilistic Models , 1997, UAI.

[288]  Donato Malerba,et al.  A Further Comparison of Simplification Methods for Decision-Tree Induction , 1995, AISTATS.

[289]  Michael S. Waterman,et al.  Introduction to Computational Biology: Maps, Sequences and Genomes , 1998 .

[290]  Dennis Shasha,et al.  High Performance Discovery In Time Series: Techniques And Case Studies (Monographs in Computer Science) , 2004 .

[291]  Tomasz Imielinski,et al.  MSQL: A Query Language for Database Mining , 1999, Data Mining and Knowledge Discovery.

[292]  Giuseppe Psaila,et al.  A New SQL-like Operator for Mining Association Rules , 1996, VLDB.

[293]  João Meidanis,et al.  Introduction to computational molecular biology , 1997 .

[294]  David J. Spiegelhalter,et al.  Machine Learning, Neural and Statistical Classification , 2009 .

[295]  Eli Upfal,et al.  Stochastic models for the Web graph , 2000, Proceedings 41st Annual Symposium on Foundations of Computer Science.

[296]  Rajeev Motwani,et al.  Computing Iceberg Queries Efficiently , 1998, VLDB.

[297]  Ronald L. Rivest,et al.  Inferring Decision Trees Using the Minimum Description Length Principle , 1989, Inf. Comput..

[298]  John F. Roddick,et al.  An Updated Bibliography of Temporal, Spatial, and Spatio-temporal Data Mining Research , 2000, TSDM.

[299]  Laks V. S. Lakshmanan,et al.  Efficient mining of constrained correlated sets , 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).

[300]  Jeffrey Scott Vitter,et al.  Data cube approximation and histograms via wavelets , 1998, CIKM '98.

[301]  A. Guttman R-trees: a dynamic index structure for spatial searching , 1984 .

[302]  John F. Roddick,et al.  On the impact of knowledge discovery and data mining , 2000 .

[303]  Vladimir Vapnik,et al.  Statistical learning theory , 1998 .

[304]  Raymond J. Mooney,et al.  Symbolic and neural learning algorithms: An experimental comparison , 1991, Machine Learning.

[305]  Jon M. Kleinberg,et al.  A Microeconomic View of Data Mining , 1998, Data Mining and Knowledge Discovery.

[306]  John Riedl,et al.  GroupLens: an open architecture for collaborative filtering of netnews , 1994, CSCW '94.

[307]  Carlo Zaniolo,et al.  Metaqueries for Data Mining , 1996, Advances in Knowledge Discovery and Data Mining.

[308]  Prabhakar Raghavan,et al.  Information retrieval algorithms: a survey , 1997, SODA '97.

[309]  Mong-Li Lee,et al.  Image Mining: Trends and Developments , 2002, Journal of Intelligent Information Systems.

[310]  Joseph L. Hellerstein,et al.  Mining partially periodic event patterns with unknown periods , 2001, Proceedings 17th International Conference on Data Engineering.

[311]  S. P. Lloyd,et al.  Least squares quantization in PCM , 1982, IEEE Trans. Inf. Theory.

[312]  Jan-Ming Ho,et al.  Discovering informative content blocks from Web documents , 2002, KDD.

[313]  David M. Pennock,et al.  Statistical relational learning for document mining , 2003, Third IEEE International Conference on Data Mining.

[314]  Mohammed J. Zaki,et al.  SPADE: An Efficient Algorithm for Mining Frequent Sequences , 2004, Machine Learning.

[315]  Laks V. S. Lakshmanan,et al.  Quotient Cube: How to Summarize the Semantics of a Data Cube , 2002, VLDB.

[316]  Samuel Madden,et al.  Continuously adaptive continuous queries over streams , 2002, SIGMOD '02.

[317]  David Maxwell Chickering,et al.  Learning Bayesian Networks: The Combination of Knowledge and Statistical Data , 1994, Machine Learning.

[318]  Chris Clifton,et al.  Query flocks: a generalization of association-rule mining , 1998, SIGMOD '98.

[319]  Wojciech Szpankowski,et al.  An efficient algorithm for detecting frequent subgraphs in biological networks , 2004, ISMB/ECCB.

[320]  Shashi Shekhar,et al.  Spatial Databases: A Tour , 2003 .

[321]  Heikki Mannila,et al.  Discovery of Frequent Episodes in Event Sequences , 1997, Data Mining and Knowledge Discovery.

[322]  Sunita Sarawagi,et al.  Mining Surprising Patterns Using Temporal Description Length , 1998, VLDB.

[323]  S. Lauritzen The EM algorithm for graphical association models with missing data , 1995 .

[324]  Mark E. J. Newman,et al.  The Structure and Function of Complex Networks , 2003, SIAM Rev..

[325]  Elisa Bertino,et al.  State-of-the-art in privacy preserving data mining , 2004, SGMD.

[326]  David Zipser,et al.  Feature Discovery by Competitive Learning , 1986, Cogn. Sci..

[327]  Joseph M. Hellerstein,et al.  An Interactive Framework for Data Cleaning and Transformation , 1999 .

[328]  Philip S. Yu,et al.  Clustering through decision tree construction , 2000, CIKM '00.

[329]  Abraham Silberschatz,et al.  What Makes Patterns Interesting in Knowledge Discovery Systems , 1996, IEEE Trans. Knowl. Data Eng..

[330]  Roberto J. Bayardo,et al.  Efficiently mining long patterns from databases , 1998, SIGMOD '98.

[331]  Joseph Revelli,et al.  The Image Processing Handbook, 4th Edition , 2003, J. Electronic Imaging.

[332]  Jennifer Neville,et al.  Learning relational probability trees , 2003, KDD '03.

[333]  Jaideep Srivastava,et al.  Web usage mining: discovery and applications of usage patterns from Web data , 2000, SKDD.

[334]  Joost N. Kok,et al.  A quickstart in frequent structure mining can make a difference , 2004, KDD.

[335]  Jaideep Srivastava,et al.  Selecting the right interestingness measure for association patterns , 2002, KDD.

[336]  Hongjun Lu,et al.  Condensed cube: an effective approach to reducing data cube size , 2002, Proceedings 18th International Conference on Data Engineering.

[337]  Jesus Mena,et al.  Investigative Data Mining for Security and Criminal Detection , 2002 .

[338]  Pat Langley,et al.  Elements of Machine Learning , 1995 .

[339]  Gregory R. Grant,et al.  Bioinformatics - The Machine Learning Approach , 2000, Comput. Chem..

[340]  Ralph Kimball,et al.  The Data Warehouse Lifecycle Toolkit: Expert Methods for Designing, Developing and Deploying Data Warehouses with CD Rom , 1998 .

[341]  Hans-Peter Kriegel,et al.  Knowledge Discovery in Large Spatial Databases: Focusing Techniques for Efficient Class Identification , 1995, SSD.

[342]  Theodore Johnson,et al.  Exploratory Data Mining and Data Cleaning , 2003 .

[343]  J. R. Quinlan Learning With Continuous Classes , 1992 .

[344]  Philip S. Yu,et al.  Graph indexing: a frequent structure-based approach , 2004, SIGMOD '04.

[345]  Son K. Dao,et al.  Dealing with Semantic Heterogeneity by Generalization-Based Data Mining Techniques , 2007 .

[346]  Rajeev Motwani,et al.  Beyond market baskets: generalizing association rules to correlations , 1997, SIGMOD '97.

[347]  Ke Wang,et al.  Building Hierarchical Classifiers Using Class Proximity , 1999, VLDB.

[348]  E. Vald Principles of human-computer collaboration for knowledge discovery in science , 1999 .

[349]  Y.-S. Shih,et al.  Families of splitting criteria for classification trees , 1999, Stat. Comput..

[350]  Ada Wai-Chee Fu,et al.  Finding Structure and Characteristics of Web Documents for Classification , 2000, ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery.

[351]  Sudipto Guha,et al.  CURE: an efficient clustering algorithm for large databases , 1998, SIGMOD '98.

[352]  Tom M. Mitchell,et al.  Version Spaces: A Candidate Elimination Approach to Rule Learning , 1977, IJCAI.

[353]  Jiawei Han,et al.  Resource and Knowledge Discovery in Global Information Systems: A Preliminary Design and Experiment , 1995, KDD.

[354]  Umeshwar Dayal,et al.  A data-warehouse/OLAP framework for scalable telecommunication tandem traffic analysis , 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).

[355]  Paul E. Utgoff,et al.  Decision Tree Induction Based on Efficient Tree Restructuring , 1997, Machine Learning.

[356]  Philip S. Yu,et al.  CrossMine: efficient classification across multiple database relations , 2004, Proceedings. 20th International Conference on Data Engineering.

[357]  E. Tufte,et al.  The visual display of quantitative information , 1984, The SAGE Encyclopedia of Research Design.

[358]  Jian Pei,et al.  CLOSET+: searching for the best strategies for mining frequent closed itemsets , 2003, KDD '03.

[359]  Geoff Hulten,et al.  Mining time-changing data streams , 2001, KDD '01.

[360]  Jiawei Han,et al.  High-Dimensional OLAP: A Minimal Cubing Approach , 2004, VLDB.

[361]  Gregory Piatetsky-Shapiro,et al.  Discovery, Analysis, and Presentation of Strong Rules , 1991, Knowledge Discovery in Databases.

[362]  Raymond T. Ng,et al.  Algorithms for Mining Distance-Based Outliers in Large Datasets , 1998, VLDB.

[363]  Joan Feigenbaum,et al.  Factorization in Experiment Generation , 1986, AAAI.

[364]  Jörg Rech,et al.  Knowledge Discovery in Databases , 2001, Künstliche Intell..

[365]  Ramez Elmasri,et al.  Fundamentals of Database Systems , 1989 .

[366]  J. Nadal,et al.  Learning in feedforward layered networks: the tiling algorithm , 1989 .

[367]  Geoffrey A. Moore Crossing the chasm : marketing and selling high-tech products to mainstream customers , 1999 .

[368]  Ryszard S. Michalski,et al.  AQ15: Incremental Learning of Attribute-Based Descriptions from Examples: The Method and User's Guide , 1986 .

[369]  F. Rosenblatt,et al.  The perceptron: a probabilistic model for information storage and organization in the brain. , 1958, Psychological review.

[370]  Jiawei Han,et al.  Discovery of Multiple-Level Association Rules from Large Databases , 1995, VLDB.

[371]  Jiawei Han,et al.  Exploration of the power of attribute-oriented induction in data mining , 1995, KDD 1995.

[372]  Jiawei Han,et al.  gSpan: graph-based substructure pattern mining , 2002, 2002 IEEE International Conference on Data Mining, 2002. Proceedings..

[373]  Benjamin Van Roy,et al.  Solving Data Mining Problems Through Pattern Recognition , 1997 .

[374]  David T. Jones,et al.  Bioinformatics: Genes, Proteins and Computers , 2007 .

[375]  Igor Kononenko,et al.  On Biases in Estimating Multi-Valued Attributes , 1995, IJCAI.

[376]  Tomasz Imielinski,et al.  Mining association rules between sets of items in large databases , 1993, SIGMOD Conference.

[377]  Wynne Hsu,et al.  Integrating Classification and Association Rule Mining , 1998, KDD.

[378]  V. Carey,et al.  Mixed-Effects Models in S and S-Plus , 2001 .

[379]  Duncan J. Watts,et al.  Six Degrees: The Science of a Connected Age , 2003 .

[380]  David J. DeWitt,et al.  Equi-depth multidimensional histograms , 1988, SIGMOD '88.

[381]  Jiawei Han,et al.  Metarule-Guided Mining of Multi-Dimensional Association Rules Using Data Cubes , 1997, KDD.

[382]  Divesh Srivastava,et al.  Answering Queries with Aggregation Using Views , 1996, VLDB.

[383]  Lawrence B. Holder,et al.  Knowledge discovery in molecular biology: Identifying structural regularities in proteins , 1999, Intell. Data Anal..

[384]  Philip S. Yu,et al.  An effective hash-based algorithm for mining association rules , 1995, SIGMOD '95.

[385]  J. Ross Quinlan,et al.  Simplifying decision trees , 1987, Int. J. Hum. Comput. Stud..

[386]  Yannis E. Ioannidis,et al.  Selectivity Estimation Without the Attribute Value Independence Assumption , 1997, VLDB.

[387]  R. Higgins Analysis for Financial Management , 2004 .

[388]  Erik Thomsen,et al.  OLAP Solutions - Building Multidimensional Information Systems , 1997 .

[389]  Barbara Hubbard,et al.  The World According to Wavelets , 1996 .

[390]  Daniel S. Hirschberg,et al.  The Time Complexity of Decision Tree Induction , 1995 .

[391]  Johannes Gehrke,et al.  RainForest—A Framework for Fast Decision Tree Construction of Large Datasets , 1998, Data Mining and Knowledge Discovery.

[392]  Richard O. Duda,et al.  Pattern classification and scene analysis , 1974, A Wiley-Interscience publication.

[393]  Helen J. Wang,et al.  Online aggregation , 1997, SIGMOD '97.

[394]  Ramakrishnan Srikant,et al.  Mining Association Rules with Item Constraints , 1997, KDD.

[395]  Heikki Mannila,et al.  A database perspective on knowledge discovery , 1996, CACM.

[396]  J. Ross Quinlan,et al.  An Empirical Comparison of Genetic and Decision-Tree Classifiers , 1988, ML.

[397]  Julian R. Ullmann,et al.  An Algorithm for Subgraph Isomorphism , 1976, J. ACM.

[398]  Thomas G. Dietterich,et al.  Readings in Machine Learning , 1991 .

[399]  V. S. Subrahmanian Principles of Multimedia Database Systems , 1998 .

[400]  George Karypis,et al.  CHAMELEON: A Hierarchical Clustering Algorithm Using Dynamic Modeling , 1999 .

[401]  Anthony K. H. Tung,et al.  Spatial clustering in the presence of obstacles , 2001, Proceedings 17th International Conference on Data Engineering.

[402]  Michael Stonebraker,et al.  Database research: achievements and opportunities into the 21st century , 1996, SGMD.

[403]  Dorothea Heiss-Czedik,et al.  An Introduction to Genetic Algorithms. , 1997, Artificial Life.

[404]  Ronald R. Yager,et al.  Fuzzy sets, neural networks, and soft computing , 1994 .

[405]  Jeffrey F. Naughton,et al.  Materialized View Selection for Multidimensional Datasets , 1998, VLDB.

[406]  Teuvo Kohonen,et al.  Self-Organization and Associative Memory , 1988 .

[407]  Philip S. Yu,et al.  Clustering by pattern similarity in large data sets , 2002, SIGMOD '02.

[408]  Wynne Hsu,et al.  Using General Impressions to Analyze Discovered Classification Rules , 1997, KDD.

[409]  Michael J. A. Berry,et al.  Mastering Data Mining: The Art and Science of Customer Relationship Management , 1999 .

[410]  John C. Platt,et al.  Fast training of support vector machines using sequential minimal optimization , 1999, Advances in Kernel Methods.

[411]  Michael Stonebraker,et al.  Efficient organization of large multidimensional arrays , 1994, Proceedings of 1994 IEEE 10th International Conference on Data Engineering.

[412]  Christos Faloutsos,et al.  Prediction and indexing of moving objects with unknown motion patterns , 2004, SIGMOD '04.

[413]  Shamkant B. Navathe,et al.  Mining for strong negative associations in a large database of customer transactions , 1998, Proceedings 14th International Conference on Data Engineering.

[414]  Jeffrey F. Naughton,et al.  An array-based algorithm for simultaneous multidimensional aggregates , 1997, SIGMOD '97.

[415]  Jiawei Han,et al.  Intelligent Query Answering by Knowledge Discovery Techniques , 1996, IEEE Trans. Knowl. Data Eng..

[416]  Shamkant B. Navathe,et al.  An Efficient Algorithm for Mining Association Rules in Large Databases , 1995, VLDB.

[417]  Peter J. Rousseeuw,et al.  Finding Groups in Data: An Introduction to Cluster Analysis , 1990 .

[418]  Jiawei Han,et al.  Dynamic Generation and Refinement of Concept Hierarchies for Knowledge Discovery in Databases , 1994, KDD Workshop.

[419]  Samuel Kaski,et al.  Self organization of a massive document collection , 2000, IEEE Trans. Neural Networks Learn. Syst..

[420]  Mohammed J. Zaki,et al.  CHARM: An Efficient Algorithm for Closed Itemset Mining , 2002, SDM.

[421]  Philip J. Stone,et al.  Experiments in induction , 1966 .

[422]  Huan Liu,et al.  Chi2: feature selection and discretization of numeric attributes , 1995, Proceedings of 7th IEEE International Conference on Tools with Artificial Intelligence.

[423]  Qiang Yang,et al.  Plan Mining by Divide-and-Conquer , 1999, 1999 ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery.

[424]  J. MacQueen Some methods for classification and analysis of multivariate observations , 1967 .

[425]  Arie Shoshani,et al.  OLAP and statistical databases: similarities and differences , 1997, PODS '97.

[426]  Stuart J. Russell,et al.  Local Learning in Probabilistic Networks with Hidden Variables , 1995, IJCAI.

[427]  Laks V. S. Lakshmanan,et al.  Exploratory mining and pruning optimizations of constrained associations rules , 1998, SIGMOD '98.

[428]  Sudipto Guha,et al.  Streaming-data algorithms for high-quality clustering , 2002, Proceedings 18th International Conference on Data Engineering.

[429]  Rakesh Agrawal,et al.  SPRINT: A Scalable Parallel Classifier for Data Mining , 1996, VLDB.

[430]  Ryszard S. Michalski,et al.  A Theory and Methodology of Inductive Learning , 1983, Artificial Intelligence.

[431]  Yoshua Bengio,et al.  Pattern Recognition and Neural Networks , 1995 .

[432]  Dimitrios Gunopulos,et al.  Efficient Mining of Spatiotemporal Patterns , 2001, SSTD.

[433]  Lawrence R. Rabiner,et al.  A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.

[434]  F. A. Seiler,et al.  Numerical Recipes in C: The Art of Scientific Computing , 1989 .