LCMine: An efficient algorithm for mining discriminative regularities and its application in supervised classification

In this paper, we introduce an efficient algorithm for mining discriminative regularities in databases with mixed and incomplete data. Unlike previous methods, our algorithm does not discretize numerical features a priori; instead, it extracts regularities from a set of diverse decision trees induced with a special procedure. Experimental results show that a classifier based on the regularities obtained by our algorithm attains higher classification accuracy while using fewer discriminative regularities than previous pattern-based classifiers. Additionally, we show that our classifier is competitive with traditional and state-of-the-art classifiers.
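
The general idea of reading regularities off a set of diverse decision trees can be illustrated with a small sketch. It is not the induction procedure of the paper, which the abstract does not detail. It assumes numerical features only, no missing values, an entropy-based split criterion, and diversity obtained by picking each split at random among the `top_k` best candidates; all function names and parameters here are hypothetical. Thresholds are chosen while the tree grows, so no prior discretization is needed, and each pure root-to-leaf path is kept as a pattern that covers objects of a single class.

```python
import math
import random
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def candidate_splits(rows):
    """Return (gain, feature, threshold) for each midpoint between distinct values."""
    base = entropy([y for _, y in rows])
    out = []
    for f in range(len(rows[0][0])):
        values = sorted({x[f] for x, _ in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for x, y in rows if x[f] <= t]
            right = [y for x, y in rows if x[f] > t]
            gain = base - (len(left) * entropy(left) + len(right) * entropy(right)) / len(rows)
            out.append((gain, f, t))
    return out

def extract_patterns(rows, rng, top_k=2, path=()):
    """Grow one randomized tree; return its pure leaves as (conditions, class, support)."""
    labels = [y for _, y in rows]
    if len(set(labels)) == 1:
        return [(path, labels[0], len(rows))]
    splits = sorted(candidate_splits(rows), reverse=True)[:top_k]
    if not splits:  # identical objects with different classes: no pure pattern here
        return []
    _, f, t = rng.choice(splits)  # assumed source of diversity between trees
    left = [(x, y) for x, y in rows if x[f] <= t]
    right = [(x, y) for x, y in rows if x[f] > t]
    return (extract_patterns(left, rng, top_k, path + ((f, "<=", t),))
            + extract_patterns(right, rng, top_k, path + ((f, ">", t),)))

def mine(rows, n_trees=5, seed=0):
    """Collect the distinct patterns found by several randomized trees."""
    rng = random.Random(seed)
    found = set()
    for _ in range(n_trees):
        found.update(extract_patterns(rows, rng))
    return found

def covers(pattern, x):
    """True if object x satisfies every condition of the pattern."""
    return all(x[f] <= t if op == "<=" else x[f] > t for f, op, t in pattern)

# Toy data: (feature vector, class label).
rows = [((1.0, 5.0), "a"), ((2.0, 6.0), "a"), ((8.0, 1.0), "b"), ((9.0, 2.0), "b")]
patterns = mine(rows)
for conditions, label, support in sorted(patterns):
    print(conditions, "->", label, "support:", support)
```

Because different trees may split on different features, the union of their paths usually contains several alternative patterns for the same class, which is the reason for inducing more than one tree.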
