Learning and Lower Bounds for AC0 with Threshold Gates

In 2002, Jackson et al. [JKS02] asked whether AC0 circuits augmented with a threshold gate at the output can be efficiently learned from uniform random examples. We answer this question affirmatively by showing that such circuits have fairly strong Fourier concentration; hence the low-degree algorithm of Linial, Mansour, and Nisan [LMN93] learns them in sub-exponential time. Under a conjecture of Gotsman and Linial [GL94], which upper-bounds the total influence of low-degree polynomial threshold functions, the running time improves to quasi-polynomial. Our results extend to AC0 circuits augmented with a small super-constant number of threshold gates at arbitrary locations in the circuit. We also establish new structural properties of AC0 circuits augmented with threshold gates, which allow us to prove a range of separation results and lower bounds.
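
To illustrate why Fourier concentration yields a learning algorithm, the following is a minimal sketch of the low-degree algorithm in the spirit of [LMN93]: estimate every Fourier coefficient of degree at most d from uniform examples and output the sign of the resulting polynomial. It is not the analysis of this paper. The names `chi`, `low_degree_learn`, and `toy_target`, and the tiny six-variable target (a majority gate over three depth-1 gates), are hypothetical choices made for illustration only.

```python
import itertools
import random


def chi(S, x):
    """Parity character chi_S(x) = product of x_i for i in S, x in {-1,+1}^n."""
    p = 1
    for i in S:
        p *= x[i]
    return p


def low_degree_learn(examples, n, d):
    """Estimate all Fourier coefficients of degree <= d from labelled examples.

    examples: list of (x, y), x a tuple in {-1,+1}^n, y in {-1,+1}.
    Returns (h, est) where est maps each set S to its empirical coefficient
    and h(x) = sign(sum_S est[S] * chi_S(x)).
    """
    m = len(examples)
    est = {}
    for k in range(d + 1):
        for S in itertools.combinations(range(n), k):
            # Empirical average of f(x) * chi_S(x) over the sample.
            est[S] = sum(y * chi(S, x) for x, y in examples) / m

    def h(x):
        total = sum(c * chi(S, x) for S, c in est.items())
        return 1 if total >= 0 else -1

    return h, est


def toy_target(x):
    """Hypothetical target: a majority (threshold) gate over three small gates.

    Convention assumed here: +1 encodes True and -1 encodes False.
    """
    t = [xi == 1 for xi in x]
    gates = [t[0] and t[1], t[2] or t[3], t[4] and t[5]]
    return 1 if sum(gates) >= 2 else -1


if __name__ == "__main__":
    n, d, m = 6, 3, 2000
    rng = random.Random(0)  # fixed seed so the run is repeatable
    sample = []
    for _ in range(m):
        x = tuple(rng.choice((-1, 1)) for _ in range(n))
        sample.append((x, toy_target(x)))
    h, _ = low_degree_learn(sample, n, d)
    # Measure the disagreement with the target over all 2^n inputs.
    points = list(itertools.product((-1, 1), repeat=n))
    err = sum(h(x) != toy_target(x) for x in points) / len(points)
    print("error of degree-%d hypothesis: %.3f" % (d, err))
```

The sketch enumerates all sets of size at most d, so it estimates roughly n^d coefficients; the degree d at which the Fourier weight concentrates therefore governs the running time, which is why a concentration bound translates into a bound on learning time.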

[1] Ryan O'Donnell, et al. Polynomial regression under arbitrary product distributions, 2010, Machine Learning.

[2] Thomas Kailath, et al. Rational approximation techniques for analysis of neural networks, 1994, IEEE Trans. Inf. Theory.

[3] Ryan O'Donnell, et al. Learning Monotone Decision Trees in Polynomial Time, 2007, SIAM J. Comput.

[4] Ryan O'Donnell, et al. Learning monotone decision trees in polynomial time, 2006, 21st Annual IEEE Conference on Computational Complexity (CCC'06).

[5] Kristoffer Arnsfelt Hansen. Computing Symmetric Boolean Functions by Circuits with Few Exact Threshold Gates, 2007, COCOON.

[6] Mikael Goldmann, et al. On the Power of a Threshold Gate at the Top, 1997, Inf. Process. Lett.

[7] Alexander A. Sherstov. The Intersection of Two Halfspaces Has High Threshold Degree, 2009, 50th Annual IEEE Symposium on Foundations of Computer Science.

[8] Nader H. Bshouty, et al. On the Fourier spectrum of monotone functions, 1996, JACM.

[9] Richard Beigel. When do extra majority gates help? Polylog(N) majority gates are equivalent to one, 2005, computational complexity.

[10] Lance Fortnow, et al. Efficient Learning Algorithms Yield Circuit Lower Bounds, 2006, COLT.

[11] Ryan O'Donnell, et al. Learning Geometric Concepts via Gaussian Surface Area, 2008, 49th Annual IEEE Symposium on Foundations of Computer Science.

[12] Sean W. Smith, et al. Improved learning of AC0 functions, 1991, COLT '91.

[13] Daniel A. Spielman, et al. PP is closed under intersection, 1991, STOC '91.

[14] O. Svensson, et al. Inapproximability Results for Sparsest Cut, Optimal Linear Arrangement, and Precedence Constrained Scheduling, 2007, FOCS 2007.

[15] Johan Håstad, et al. A Slight Sharpening of LMN, 2001, J. Comput. Syst. Sci.

[16] Noam Nisan, et al. Constant depth circuits, Fourier transform, and learnability, 1989, 30th Annual Symposium on Foundations of Computer Science.

[17] Rocco A. Servedio, et al. Agnostically learning halfspaces, 2005, 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS'05).

[18] Kathie Cameron, et al. Monotone path systems in simple regions, 1994, Comb.

[19] Rocco A. Servedio, et al. Learnability beyond AC0, 2002, STOC '02.

[20] Ryan O'Donnell, et al. Learning functions of k relevant variables, 2004, J. Comput. Syst. Sci.

[21] J. Håstad. Computational limitations of small-depth circuits, 1987.

[22] Rocco A. Servedio, et al. Learning intersections and thresholds of halfspaces, 2002, 43rd Annual IEEE Symposium on Foundations of Computer Science.

[23] James Aspnes, et al. The expressive power of voting polynomials, 1991, STOC '91.

[24] Nathan Linial, et al. Spectral properties of threshold functions, 1994, Comb.

[25] Alexander A. Razborov, et al. Majority gates vs. general weighted threshold gates, 1992, Proceedings of the Seventh Annual Structure in Complexity Theory Conference.

[26] Rocco A. Servedio, et al. Testing for Concise Representations, 2007, 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS'07).