Testing and reconstruction via decision trees

We study sublinear and local computation algorithms for decision trees, focusing on testing and reconstruction. Our first result is a tester that runs in $\mathrm{poly}(\log s, 1/\varepsilon)\cdot n\log n$ time, makes $\mathrm{poly}(\log s,1/\varepsilon)\cdot \log n$ queries to an unknown function $f$, and (i) accepts if $f$ is $\varepsilon$-close to a size-$s$ decision tree; (ii) rejects if $f$ is $\Omega(\varepsilon)$-far from decision trees of size $s^{\tilde{O}((\log s)^2/\varepsilon^2)}$. Existing testers distinguish size-$s$ decision trees from functions that are $\varepsilon$-far from size-$s$ decision trees in $\mathrm{poly}(s^s,1/\varepsilon)\cdot n$ time with $\tilde{O}(s/\varepsilon)$ queries. We therefore solve an incomparable problem, but achieve doubly-exponential-in-$s$ and exponential-in-$s$ improvements in time and query complexities respectively. We obtain our tester by designing a reconstruction algorithm for decision trees: given query access to a function $f$ that is close to a small decision tree, this algorithm provides fast query access to a small decision tree that is close to $f$. By known relationships, our results yield reconstruction algorithms for numerous other boolean function properties -- Fourier degree, randomized and quantum query complexities, certificate complexity, sensitivity, etc. -- which in turn yield new testers for these properties. Finally, we give a hardness result for testing whether an unknown function is $\varepsilon$-close to or $\Omega(\varepsilon)$-far from size-$s$ decision trees. We show that an efficient algorithm for this task would yield an efficient algorithm for properly learning decision trees, a central open problem of learning theory. It has long been known that proper learning algorithms for any class $\mathcal{H}$ yield property testers for $\mathcal{H}$; this provides an example of a converse.
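To make the notions of "$\varepsilon$-close" and "size-$s$ decision tree" concrete, the following is a minimal sketch, not the tester or reconstructor described above. It assumes distance is measured as the fraction of disagreeing inputs under the uniform distribution on $\{0,1\}^n$ and that size means the number of leaves; the tuple encoding of trees and the names `evaluate`, `size`, and `estimate_distance` are hypothetical choices made for illustration.

```python
import random

# Hypothetical encoding: a tree is a leaf (0 or 1) or a tuple
# (var_index, left_subtree, right_subtree); go left when x[var_index] == 0.

def evaluate(tree, x):
    """Follow the root-to-leaf path determined by x and return the leaf label."""
    while isinstance(tree, tuple):
        i, left, right = tree
        tree = right if x[i] else left
    return tree

def size(tree):
    """Size of a tree, taken here (as an assumption) to be its number of leaves."""
    if not isinstance(tree, tuple):
        return 1
    _, left, right = tree
    return size(left) + size(right)

def estimate_distance(f, tree, n, samples=2000, seed=0):
    """Estimate Pr_x[f(x) != tree(x)] for uniform x in {0,1}^n by sampling."""
    rng = random.Random(seed)
    disagreements = 0
    for _ in range(samples):
        x = [rng.randint(0, 1) for _ in range(n)]
        if f(x) != evaluate(tree, x):
            disagreements += 1
    return disagreements / samples

# A size-3 tree computing x0 AND x1.
and_tree = (0, 0, (1, 0, 1))

def f_exact(x):
    return x[0] & x[1]          # agrees with and_tree everywhere

def f_near(x):
    return x[0] & x[1] & x[2]   # differs from and_tree on a 1/8 fraction of inputs
```

Here `f_near` is $\varepsilon$-close to a size-3 decision tree for any $\varepsilon \ge 1/8$, and `estimate_distance(f_near, and_tree, 3)` approximates that distance up to sampling error. Note that this estimate is taken against a known tree; the tester in the abstract has no such tree in hand and must work from queries to $f$ alone.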