Data Mining: Concepts and Techniques

The increasing volume of data in modern business and science calls for more complex and sophisticated tools. Although advances in technology have made extensive data collection much easier, the field of data mining is still evolving, and there is a constant need for new techniques and tools that can help us transform this data into useful information and knowledge. Since the previous edition's publication, great advances have been made in the field of data mining. Not only does the third edition of Data Mining: Concepts and Techniques continue the tradition of equipping you with an understanding of, and the ability to apply, the theory and practice of discovering patterns hidden in large data sets, it also focuses on new, important topics in the field: data warehouses and data cube technology, mining data streams, mining social networks, and mining spatial, multimedia, and other complex data. Each chapter is a stand-alone guide to a critical topic, presenting proven algorithms and sound implementations ready to be used directly or with strategic modification against live data. This is the resource you need if you want to apply today's most powerful data mining techniques to meet real business challenges.

* Presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects.
* Addresses advanced topics such as mining object-relational databases, spatial databases, multimedia databases, time-series databases, text databases, the World Wide Web, and applications in several fields.
* Provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.


[301]  Sudipto Guha,et al.  CURE: an efficient clustering algorithm for large databases , 1998, SIGMOD '98.

[302]  Sunita Sarawagi,et al.  Integrating association rule mining with relational database systems: alternatives and implications , 1998, SIGMOD '98.

[303]  Andreas D. Baxevanis,et al.  Bioinformatics - a practical guide to the analysis of genes and proteins , 2001, Methods of biochemical analysis.

[304]  Hiroshi Motoda,et al.  Feature Selection for Knowledge Discovery and Data Mining , 1998, The Springer International Series in Engineering and Computer Science.

[305]  Jiawei Han,et al.  Generalization-Based Data Mining in Object-Oriented Databases Using an Object Cube Model , 1998, Data Knowl. Eng..

[306]  Raymond T. Ng,et al.  Algorithms for Mining Distance-Based Outliers in Large Datasets , 1998, VLDB.

[307]  Hiroshi Motoda,et al.  Feature Extraction, Construction and Selection: A Data Mining Perspective , 1998 .

[308]  Daniel A. Keim,et al.  An Efficient Approach to Clustering in Large Multimedia Databases with Noise , 1998, KDD.

[309]  Bernhard Schölkopf,et al.  Shrinking the Tube: A New Support Vector Regression Algorithm , 1998, NIPS.

[310]  Hannu Toivonen,et al.  Efficient discovery of functional and approximate dependencies using partitions , 1998, Proceedings 14th International Conference on Data Engineering.

[311]  Mohammed J. Zaki,et al.  PlanMine: Sequence Mining for Plan Failures , 1998, KDD.

[312]  Roberto J. Bayardo,et al.  Efficiently mining long patterns from databases , 1998, SIGMOD '98.

[313]  Nimrod Megiddo,et al.  Discovery-Driven Exploration of OLAP Data Cubes , 1998, EDBT.

[314]  Oliver Günther,et al.  Multidimensional access methods , 1998, CSUR.

[315]  Howard J. Hamilton,et al.  Efficient Attribute-Oriented Generalization for Knowledge Discovery from Large Databases , 1998, IEEE Trans. Knowl. Data Eng..

[316]  Witold Pedrycz,et al.  Data Mining Methods for Knowledge Discovery , 1998, IEEE Trans. Neural Networks.

[317]  Shamkant B. Navathe,et al.  Mining for strong negative associations in a large database of customer transactions , 1998, Proceedings 14th International Conference on Data Engineering.

[318]  Aidong Zhang,et al.  WaveCluster: A Multi-Resolution Clustering Approach for Very Large Spatial Databases , 1998, VLDB.

[319]  Alex Berson,et al.  Building Data Mining Applications for CRM , 1999 .

[320]  H. V. Jagadish,et al.  Semantic Compression and Pattern Extraction with Fascicles , 1999, VLDB.

[321]  Paul S. Bradley,et al.  Compressed data cubes for OLAP aggregate query approximation on continuous dimensions , 1999, KDD '99.

[322]  Peter J. Haas,et al.  Interactive data Analysis: The Control Project , 1999, Computer.

[323]  Jinyan Li,et al.  Efficient mining of emerging patterns: discovering trends and differences , 1999, KDD '99.

[324]  Hans-Peter Kriegel,et al.  Visual classification: an interactive approach to decision tree construction , 1999, KDD '99.

[325]  Larry P. English Improving Data Warehouse and Business Information Quality: Methods for Reducing Costs and Increasing Profits , 1999 .

[326]  Dorian Pyle,et al.  Data Preparation for Data Mining , 1999 .

[327]  Jiawei Han,et al.  Efficient mining of partial periodic patterns in time series database , 1999, Proceedings 15th International Conference on Data Engineering (Cat. No.99CB36337).

[328]  Ke Wang,et al.  Building Hierarchical Classifiers Using Class Proximity , 1999, VLDB.

[329]  V. J. Rayward-Smith,et al.  Fuzzy Cluster Analysis: Methods for Classification, Data Analysis and Image Recognition , 1999 .

[330]  Laks V. S. Lakshmanan,et al.  Optimization of constrained frequent set queries with 2-variable constraints , 1999, SIGMOD '99.

[331]  Raghu Ramakrishnan,et al.  Bottom-up computation of sparse and Iceberg CUBE , 1999, SIGMOD '99.

[332]  Jonathan Goldstein,et al.  When Is "Nearest Neighbor" Meaningful? , 1999, ICDT.

[333]  Ian H. Witten,et al.  Data mining: practical machine learning tools and techniques, 3rd Edition , 1999 .

[334]  Y.-S. Shih,et al.  Families of splitting criteria for classification trees , 1999, Stat. Comput..

[335]  Michael J. A. Berry,et al.  Mastering Data Mining: The Art and Science of Customer Relationship Management , 1999 .

[336]  Johannes Gehrke,et al.  BOAT—optimistic decision tree construction , 1999, SIGMOD '99.

[337]  Hans-Peter Kriegel,et al.  OPTICS: ordering points to identify the clustering structure , 1999, SIGMOD '99.

[338]  Oren Etzioni,et al.  Adaptive Web Sites: Conceptual Cluster Mining , 1999, IJCAI.

[339]  Yehuda Lindell,et al.  A Statistical Theory for Quantitative Association Rules , 1999, KDD.

[340]  Raghu Ramakrishnan,et al.  Probabilistic Optimization of Top N Queries , 1999, VLDB.

[341]  Anil K. Jain,et al.  Data clustering: a review , 1999, CSUR.

[342]  B. Gates Business @ the Speed of Thought , 1999 .

[343]  Kyuseok Shim,et al.  SPIRIT: Sequential Pattern Mining with Regular Expression Constraints , 1999, VLDB.

[344]  Giri Kumar Tayi,et al.  Enhancing data quality in data warehouse environments , 1999, CACM.

[345]  George H. John Behind-the-scenes data mining: a report on the KDD-98 panel , 1999, SKDD.

[346]  Qiang Yang,et al.  Plan Mining by Divide-and-Conquer , 1999, 1999 ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery.

[347]  Le Gruenwald,et al.  A survey of data mining and knowledge discovery software tools , 1999, SKDD.

[348]  Ashish Gupta,et al.  Materialized views: techniques, implementations, and applications , 1999 .

[349]  Nicolas Pasquier,et al.  Discovering Frequent Closed Itemsets for Association Rules , 1999, ICDT.

[350]  Jon M. Kleinberg,et al.  Applications of linear algebra in information retrieval and hypertext analysis , 1999, PODS '99.

[351]  Jon M. Kleinberg,et al.  Mining the Web's Link Structure , 1999, Computer.

[352]  Geoffrey A. Moore Crossing the chasm : marketing and selling high-tech products to mainstream customers , 1999 .

[353]  Vipin Kumar,et al.  Chameleon: Hierarchical Clustering Using Dynamic Modeling , 1999, Computer.

[354]  Jiawei Han,et al.  Efficient Polygon Amalgamation Methods for Spatial OLAP and Spatial Data Mining , 1999, SSD.

[355]  Philip S. Yu,et al.  Fast algorithms for projected clustering , 1999, SIGMOD '99.

[356]  Albert-László Barabási,et al.  Emergence of scaling in random networks , 1999, Science.

[357]  Avi Pfeffer,et al.  SPOOK: A system for probabilistic object-oriented knowledge representation , 1999, UAI.

[358]  Jon Kleinberg,et al.  Authoritative sources in a hyperlinked environment , 1999, SODA '98.

[359]  Mohammed J. Zaki Scalable Algorithms for Association Mining , 2000, IEEE Trans. Knowl. Data Eng..

[360]  Mohammed J. Zaki Generating non-redundant association rules , 2000, KDD '00.

[361]  Raymond T. Ng,et al.  Distance-based outliers: algorithms and applications , 2000, The VLDB Journal.

[362]  John F. Roddick,et al.  An Updated Bibliography of Temporal, Spatial, and Spatio-temporal Data Mining Research , 2000, TSDM.

[363]  Sudipto Guha,et al.  ROCK: A Robust Clustering Algorithm for Categorical Attributes , 2000, Inf. Syst..

[364]  Jian Pei,et al.  Can we push more constraints into frequent pattern mining? , 2000, KDD '00.

[365]  Making Use of the Most Expressive Jumping Emerging Patterns for Classification , 2000, PAKDD.

[366]  Raghu Ramakrishnan,et al.  Proceedings : KDD 2000 : the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 20-23, 2000, Boston, MA, USA , 2000 .

[367]  Martti Juhola,et al.  Informal identification of outliers in medical data , 2000 .

[368]  Christos Faloutsos,et al.  Online data mining for co-evolving time sequences , 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).

[369]  Kristian G. Olesen,et al.  Practical Issues in Modeling Large Diagnostic Systems with Multiply Sectioned Bayesian Networks , 2000, Int. J. Pattern Recognit. Artif. Intell..

[370]  Sudipto Guha,et al.  Clustering Data Streams , 2000, FOCS.

[371]  Jon M. Kleinberg,et al.  Clustering categorical data: an approach based on dynamical systems , 2000, The VLDB Journal.

[372]  Eleazar Eskin,et al.  Anomaly Detection over Noisy Data using Learned Probability Distributions , 2000, ICML.

[373]  Nello Cristianini,et al.  An Introduction to Support Vector Machines and Other Kernel-based Learning Methods , 2000 .

[374]  Jaideep Srivastava,et al.  Web usage mining: discovery and applications of usage patterns from Web data , 2000, SKDD.

[375]  Laks V. S. Lakshmanan,et al.  Efficient mining of constrained correlated sets , 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).

[376]  Jiawei Han,et al.  Object-Based Selective Materialization for Efficient Implementation of Spatial Data Cubes , 2000, IEEE Trans. Knowl. Data Eng..

[377]  Vladimir N. Vapnik,et al.  The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.

[378]  Xintao Wu,et al.  Using Loglinear Models to Compress Datacube , 2000, Web-Age Information Management.

[379]  Jiawei Han,et al.  Mining recurrent items in multimedia with progressive resolution refinement , 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).

[380]  Heikki Mannila,et al.  Theoretical frameworks for data mining , 2000, SKDD.

[381]  Monique Noirhomme-Fraiture,et al.  Multimedia Support for Complex Multidimensional Data Mining , 2000, MDM/KDD.

[382]  Bill Gates,et al.  Business @ the Speed of Thought: Succeeding in the Digital Economy , 2000 .

[383]  Rakesh Agrawal,et al.  Privacy-preserving data mining , 2000, SIGMOD 2000.

[384]  George M. Church,et al.  Biclustering of Expression Data , 2000, ISMB.

[385]  David Loshin Enterprise knowledge management: the data quality approach , 2000 .

[386]  Ke Wang,et al.  Mining Frequent Itemsets Using Support Constraints , 2000, VLDB.

[387]  Jian Pei,et al.  CLOSET: An Efficient Algorithm for Mining Frequent Closed Itemsets , 2000, ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery.

[388]  Umeshwar Dayal,et al.  FreeSpan: frequent pattern-projected sequential pattern mining , 2000, KDD '00.

[389]  Jian Pei,et al.  Mining frequent patterns without candidate generation , 2000, SIGMOD 2000.

[390]  Takashi Washio,et al.  An Apriori-Based Algorithm for Mining Frequent Substructures from Graph Data , 2000, PKDD.

[391]  Hans-Peter Kriegel,et al.  LOF: identifying density-based local outliers , 2000, SIGMOD 2000.

[392]  Yoram Singer,et al.  Reducing Multiclass to Binary: A Unifying Approach for Margin Classifiers , 2000, J. Mach. Learn. Res..

[393]  Umeshwar Dayal,et al.  A data-warehouse/OLAP framework for scalable telecommunication tandem traffic analysis , 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).

[394]  Nagwa M. El-Makky,et al.  A note on "beyond market baskets: generalizing association rules to correlations" , 2000, SKDD.

[395]  Hongjun Lu,et al.  Beyond intratransaction association analysis: mining multidimensional intertransaction association rules , 2000, TOIS.

[396]  Eli Upfal,et al.  Stochastic models for the Web graph , 2000, Proceedings 41st Annual Symposium on Foundations of Computer Science.

[397]  Erhard Rahm,et al.  Data Cleaning: Problems and Current Approaches , 2000, IEEE Data Eng. Bull..

[398]  David J. DeWitt,et al.  NiagaraCQ: a scalable continuous query system for Internet databases , 2000, SIGMOD 2000.

[399]  Johannes Gehrke,et al.  MAFIA: a maximal frequent itemset algorithm for transactional databases , 2001, Proceedings 17th International Conference on Data Engineering.

[400]  Claire Cardie,et al.  Constrained K-means Clustering with Background Knowledge , 2001, ICML.

[401]  Charu C. Aggarwal,et al.  A Tree Projection Algorithm for Generation of Frequent Item Sets , 2001, J. Parallel Distributed Comput..

[402]  Anthony K. H. Tung,et al.  Mining top-n local outliers in large databases , 2001, KDD '01.

[403]  Anthony K. H. Tung,et al.  Constraint-based clustering in large databases , 2001, ICDT.

[404]  Joseph M. Hellerstein,et al.  Potter's Wheel: An Interactive Data Cleaning System , 2001, VLDB.

[405]  J. Friedman Greedy function approximation: A gradient boosting machine. , 2001 .

[406]  Jiong Yang,et al.  TAR: temporal association rules on evolving numerical attributes , 2001, Proceedings 17th International Conference on Data Engineering.

[407]  Donato Malerba,et al.  Discovering Associations between Spatial Objects: An ILP Application , 2001, ILP.

[408]  Jian Pei,et al.  CMAR: accurate and efficient classification based on multiple class-association rules , 2001, Proceedings 2001 IEEE International Conference on Data Mining.

[409]  Hongjun Lu,et al.  H-mine: hyper-structure mining of frequent patterns in large databases , 2001, Proceedings 2001 IEEE International Conference on Data Mining.

[410]  Anthony K. H. Tung,et al.  Spatial clustering in the presence of obstacles , 2001, Proceedings 17th International Conference on Data Engineering.

[411]  Dennis Shasha,et al.  Declarative Data Cleaning: Language, Model, and Algorithms , 2001, VLDB.

[412]  Umeshwar Dayal,et al.  PrefixSpan: Mining Sequential Patterns by Prefix-Projected Growth , 2001, ICDE 2001.

[413]  Ben Taskar,et al.  Learning Probabilistic Models of Relational Structure , 2001, ICML.

[414]  Jian Pei,et al.  Efficient computation of Iceberg cubes with complex measures , 2001, SIGMOD '01.

[415]  Jennifer Widom,et al.  Continuous queries over data streams , 2001, SGMD.

[416]  Qiang Chen,et al.  An anomaly detection technique based on a chi‐square statistic for detecting intrusions into information systems , 2001 .

[417]  Sunita Sarawagi,et al.  Intelligent Rollups in Multidimensional OLAP Data , 2001, VLDB.

[418]  Jiawei Han,et al.  Geographic Data Mining and Knowledge Discovery , 2001 .

[419]  Vojislav Kecman,et al.  Learning and Soft Computing: Support Vector Machines, Neural Networks, and Fuzzy Logic Models , 2001 .

[420]  George Karypis,et al.  Frequent subgraph discovery , 2001, Proceedings 2001 IEEE International Conference on Data Mining.

[421]  Charles Elkan,et al.  The Foundations of Cost-Sensitive Learning , 2001, IJCAI.

[422]  Ranga Raju Vatsavai,et al.  Map cube: A visualization tool for spatial data warehouses , 2001 .

[423]  Jian Pei,et al.  Mining Multi-Dimensional Constrained Gradients in Data Cubes , 2001, VLDB.

[424]  Philip S. Yu,et al.  Outlier detection for high dimensional data , 2001, SIGMOD '01.

[425]  P. S. Horn,et al.  Effect of outliers and nonhealthy individuals on reference interval estimation. , 2001, Clinical chemistry.

[426]  Howard J. Hamilton,et al.  Knowledge discovery and measures of interest , 2001 .

[427]  Paul E. Green,et al.  K-modes Clustering , 2001, J. Classif..

[428]  Valdis E. Krebs,et al.  Mapping Networks of Terrorist Cells , 2001 .

[429]  Andreas Wierse,et al.  Information Visualization in Data Mining and Knowledge Discovery , 2001 .

[430]  D. Hand,et al.  Idiot's Bayes—Not So Stupid After All? , 2001 .

[431]  Michael I. Jordan,et al.  On Spectral Clustering: Analysis and an algorithm , 2001, NIPS.

[432]  Cheng Yang,et al.  Efficient discovery of error-tolerant frequent itemsets in high dimensions , 2001, KDD '01.

[433]  Qiming Chen,et al.  PrefixSpan: mining sequential patterns efficiently by prefix-projected pattern growth , 2001, Proceedings 17th International Conference on Data Engineering.

[434]  Dimitrios Gunopulos,et al.  Efficient Mining of Spatiotemporal Patterns , 2001, SSTD.

[435]  Thomas C. Redman,et al.  Data Quality: The Field Guide , 2001 .

[436]  Laks V. S. Lakshmanan,et al.  Mining frequent itemsets with convertible constraints , 2001, Proceedings 17th International Conference on Data Engineering.

[437]  Robert L. Grossman,et al.  Data Mining for Scientific and Engineering Applications , 2001, Massive Computing.

[438]  Adrian E. Raftery,et al.  Model-Based Clustering, Discriminant Analysis, and Density Estimation , 2002 .

[439]  D.M. Mount,et al.  An Efficient k-Means Clustering Algorithm: Analysis and Implementation , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[440]  Jiawei Han,et al.  gSpan: graph-based substructure pattern mining , 2002, 2002 IEEE International Conference on Data Mining, 2002. Proceedings..

[441]  Gustavo A. Stolovitzky,et al.  Bioinformatics: The Machine Learning Approach , 2002 .

[442]  Dimitrios Gunopulos,et al.  Discovering similar multidimensional trajectories , 2002, Proceedings 18th International Conference on Data Engineering.

[443]  Jennifer Widom,et al.  SimRank: a measure of structural-context similarity , 2002, KDD.