Paper information - Learning stochastic decision trees

Learning stochastic decision trees

We give a quasipolynomial-time algorithm for learning stochastic decision trees that is optimally resilient to adversarial noise. Given an η-corrupted set of uniform random samples labeled by a size-s stochastic decision tree, our algorithm runs in time $n^{O(\log(s/\varepsilon)/\varepsilon^2)}$ and returns a hypothesis with error within an additive 2η + ε of the Bayes optimal. An additive 2η is the information-theoretic minimum. Previously no non-trivial algorithm with a guarantee of O(η) + ε was known, even for weaker noise models. Our algorithm is furthermore proper, returning a hypothesis that is itself a decision tree; previously no such algorithm was known even in the noiseless setting.

2012 ACM Subject Classification: Theory of computation → Boolean function learning
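The guarantee stated in the abstract can be written out as a pair of formulas. This is a sketch in notation that is assumed here and not taken from the page: $T$ is the size-$s$ stochastic decision tree on $n$ input bits, $h$ is the returned hypothesis, $\mathrm{err}_T$ is the misclassification error under uniform inputs, and $\mathrm{opt}$ is the Bayes optimal error. The input domain $\{0,1\}^n$ is likewise an assumption.

```latex
% Assumed notation: x is uniform over {0,1}^n and y is the (random) label T assigns to x.
\[
  \mathrm{err}_T(h) \;=\; \Pr_{x,\; y \sim T(x)}\bigl[\, h(x) \neq y \,\bigr],
  \qquad
  \mathrm{opt} \;=\; \min_{f \colon \{0,1\}^n \to \{0,1\}} \mathrm{err}_T(f).
\]
% Guarantee from the abstract, given an \eta-corrupted sample:
\[
  \mathrm{err}_T(h) \;\le\; \mathrm{opt} + 2\eta + \varepsilon,
  \qquad
  \text{running time } n^{O(\log(s/\varepsilon)/\varepsilon^{2})}.
\]
```

Read this way, the additive $2\eta$ term is the part the abstract calls information-theoretically unavoidable, while $\varepsilon$ is the accuracy parameter that drives the running time.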

Guy Blanc | Li-Yang Tan | Jane Lange
