Efficient Mining of Contrast Patterns and Their Applications to Classification

Data mining has wide-ranging applications in medicine, finance, commerce and engineering. Pattern mining is among the most important and challenging techniques employed in data mining. A pattern is a collection of items that satisfies certain properties. Emerging patterns are patterns whose frequencies change significantly from one dataset to another. They capture strong contrast knowledge and have been shown to be very successful for constructing accurate and robust classifiers. In this paper, we examine various kinds of contrast patterns. We also investigate efficient pattern mining techniques and discuss how to exploit patterns to construct effective classifiers.
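To make "frequencies change significantly from one dataset to another" concrete, the following is a minimal sketch of the commonly used growth-rate measure for emerging patterns: the ratio of a pattern's support in one dataset to its support in the other. The function names, the toy datasets and the threshold `rho` are illustrative assumptions and are not taken from the paper.

```python
from typing import FrozenSet, List

Transaction = FrozenSet[str]


def support(pattern: FrozenSet[str], dataset: List[Transaction]) -> float:
    """Fraction of transactions in `dataset` that contain every item of `pattern`."""
    if not dataset:
        return 0.0
    hits = sum(1 for transaction in dataset if pattern <= transaction)
    return hits / len(dataset)


def growth_rate(pattern: FrozenSet[str],
                background: List[Transaction],
                target: List[Transaction]) -> float:
    """Support in `target` divided by support in `background`.

    Convention assumed here: 0 if the pattern occurs in neither dataset,
    infinity if it occurs only in `target`.
    """
    s_background = support(pattern, background)
    s_target = support(pattern, target)
    if s_background == 0.0:
        return 0.0 if s_target == 0.0 else float("inf")
    return s_target / s_background


def is_emerging(pattern: FrozenSet[str],
                background: List[Transaction],
                target: List[Transaction],
                rho: float) -> bool:
    """True if the growth rate from `background` to `target` is at least `rho`."""
    return growth_rate(pattern, background, target) >= rho


# Toy datasets (hypothetical): each transaction is a set of items.
d1 = [frozenset("ab"), frozenset("ac"), frozenset("bc"), frozenset("abc")]
d2 = [frozenset("ab"), frozenset("ab"), frozenset("abc"), frozenset("c")]

pattern = frozenset("ab")
rate = growth_rate(pattern, d1, d2)  # support 0.5 in d1, 0.75 in d2
```

A pattern that never occurs in the background dataset gets an infinite growth rate under this convention, so it passes any finite threshold `rho`. Enumerating all candidate patterns this way is exponential in the number of items, which is why efficient mining techniques matter.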
