CHIRP: a new classifier based on composite hypercubes on iterated random projections

We introduce a classifier based on the L-infinity norm. This classifier, called CHIRP, iterates over three stages (projecting, binning, and covering) that are designed to deal with the curse of dimensionality, computational complexity, and nonlinear separability. CHIRP is not a hybrid or modification of existing classifiers; it employs a new covering algorithm. On widely used benchmark datasets, CHIRP's accuracy exceeds that of competing classifiers. Its computational complexity is sublinear in the number of instances and the number of variables, and subquadratic in the number of classes.
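To make the three stage names concrete, the following is a minimal sketch of the generic building blocks they suggest: a sparse random projection, equal-width binning, and an axis-aligned box test under the L-infinity norm. It is not the CHIRP algorithm itself, whose covering procedure is not described in this abstract. All function names, the {-1, 0, +1} weight scheme, and the bin rule are assumptions chosen for illustration.

```python
import random

def random_projection(rows, n_out, seed=0):
    """Project each row onto n_out random directions (assumed weights in {-1, 0, +1})."""
    rng = random.Random(seed)
    n_in = len(rows[0])
    weights = [[rng.choice((-1, 0, 1)) for _ in range(n_in)] for _ in range(n_out)]
    return [[sum(w * x for w, x in zip(ws, row)) for ws in weights] for row in rows]

def bin_index(value, lo, hi, n_bins):
    """Map a value to one of n_bins equal-width bins on [lo, hi] (illustrative rule)."""
    if hi == lo:
        return 0
    k = int((value - lo) / (hi - lo) * n_bins)
    return min(max(k, 0), n_bins - 1)  # clamp so that value == hi lands in the last bin

def linf_distance(a, b):
    """L-infinity (Chebyshev) distance: the largest coordinate-wise difference."""
    return max(abs(x - y) for x, y in zip(a, b))

def in_hypercube(point, center, radius):
    """A hypercube is exactly a ball under the L-infinity norm."""
    return linf_distance(point, center) <= radius

def bounding_box(points):
    """Smallest axis-aligned box containing all points, as (lows, highs)."""
    lows = [min(col) for col in zip(*points)]
    highs = [max(col) for col in zip(*points)]
    return lows, highs

def in_box(point, box):
    """Test whether a point falls inside an axis-aligned box."""
    lows, highs = box
    return all(lo <= x <= hi for lo, x, hi in zip(lows, point, highs))

if __name__ == "__main__":
    data = [[0.0, 1.0, 2.0], [1.0, 1.0, 0.0], [4.0, 3.0, 5.0]]
    projected = random_projection(data, n_out=2, seed=1)
    box = bounding_box(projected[:2])  # box around the first two projected points
    print(projected, box, in_box(projected[2], box))
```

The box test is the reason the L-infinity norm is a natural fit here: membership reduces to independent per-coordinate comparisons, with no distance computation across dimensions.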
