Cellular Tree Classifiers

The cellular tree classifier model addresses a fundamental problem in the design of classifiers for a parallel or distributed computing world: given a data set, is it sufficient to apply a majority rule for classification, or should one split the data into two or more parts and send each part to a potentially different computer (or cell) for further processing? At first sight, it seems impossible to define a consistent classifier within this paradigm, since no cell knows the "original data size", $n$. We show that it is nevertheless possible by exhibiting two different consistent classifiers. The consistency is universal but is only shown for distributions with nonatomic marginals.
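The cell-level decision described above (vote, or split and forward) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the names `build_cell`, `toy_stop_rule`, and `classify` are hypothetical, and the stopping rule (stop on small or pure cells) and the median split along the widest coordinate are placeholders, not the two consistent classifiers exhibited in the paper. The one property the sketch tries to respect is that each cell sees only its own data and never the original size $n$.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Point = Tuple[Sequence[float], int]  # (feature vector, label in {0, 1})


def majority_label(data: List[Point]) -> int:
    """Majority vote over the labels in this cell (ties and empty cells give 0)."""
    ones = sum(label for _, label in data)
    return 1 if 2 * ones > len(data) else 0


def toy_stop_rule(data: List[Point]) -> bool:
    """Hypothetical local rule: stop when the cell is small or has one label.

    This is an illustrative placeholder, not a rule from the paper.
    """
    labels = {label for _, label in data}
    return len(data) <= 4 or len(labels) <= 1


def build_cell(data: List[Point],
               stop: Callable[[List[Point]], bool] = toy_stop_rule) -> Dict:
    """Process one cell using only the data it received."""
    if stop(data):
        return {"leaf": True, "label": majority_label(data)}

    # Assumed split: median cut along the coordinate with the largest range.
    n_dims = len(data[0][0])
    dim = max(range(n_dims),
              key=lambda j: max(p[0][j] for p in data) - min(p[0][j] for p in data))
    ordered = sorted(data, key=lambda p: p[0][dim])
    threshold = ordered[len(ordered) // 2][0][dim]
    left = [p for p in ordered if p[0][dim] < threshold]
    right = [p for p in ordered if p[0][dim] >= threshold]

    if not left:  # all values tie at the threshold; splitting makes no progress
        return {"leaf": True, "label": majority_label(data)}

    # Each part could be sent to a different computer; here we simply recurse.
    return {"leaf": False, "dim": dim, "threshold": threshold,
            "left": build_cell(left, stop), "right": build_cell(right, stop)}


def classify(tree: Dict, x: Sequence[float]) -> int:
    """Route a query point down to a leaf and return that leaf's majority label."""
    while not tree["leaf"]:
        tree = tree["left"] if x[tree["dim"]] < tree["threshold"] else tree["right"]
    return tree["label"]
```

Because `build_cell` receives nothing but the cell's own points, the stopping decision cannot depend on $n$, which is exactly the constraint that makes consistency nontrivial in this model.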
