Learning Complexity vs. Communication Complexity

This paper has two main focal points. We first consider an important class of machine learning algorithms: large margin classifiers, such as support vector machines. The notion of margin complexity quantifies the extent to which a given class of functions can be learned by large margin classifiers. We prove that, up to a small multiplicative constant, margin complexity is equal to the inverse of discrepancy. This establishes a strong tie between seemingly very different notions from two distinct areas. Second, in the same way that matrix rigidity is related to rank, we introduce the notion of rigidity of margin complexity. We prove that sign matrices with small margin complexity rigidity are very rare. This leads to the question of proving lower bounds on the rigidity of margin complexity. Quite surprisingly, this question turns out to be closely related to basic open problems in communication complexity, e.g., whether PSPACE can be separated from the polynomial hierarchy in communication complexity. There are numerous known relations between learning theory and communication complexity, as one might expect, since communication is an inherent aspect of learning. The results of this paper constitute another link in this rich web of relations. This link has already proved significant, as it was used in the solution of a few open problems in communication complexity.
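
For orientation, the quantities named in the abstract can be written out as follows. This is a sketch using the standard definitions for an m x n sign matrix A. The symbols mc, disc, R_rank, and R_mc and the unspecified absolute constant C are notation assumed here, not taken from the abstract, and the paper's own normalization may differ.

```latex
% Margin complexity: the best (inverse) margin over all sign-consistent
% embeddings of the rows and columns of A as unit vectors.
\[
  \operatorname{mc}(A) \;=\; \min_{\substack{\|x_i\| = \|y_j\| = 1 \\ \operatorname{sign}\langle x_i, y_j\rangle = a_{ij}}}
  \;\max_{i,j}\; \frac{1}{\lvert \langle x_i, y_j \rangle \rvert}
\]

% Discrepancy: P ranges over probability distributions on the entries of A,
% and S x T over combinatorial rectangles.
\[
  \operatorname{disc}(A) \;=\; \min_{P}\; \max_{S \subseteq [m],\, T \subseteq [n]}
  \;\Bigl\lvert \sum_{i \in S,\, j \in T} p_{ij}\, a_{ij} \Bigr\rvert
\]

% The equivalence claimed in the abstract, with C an absolute constant
% (its value is not given in the abstract).
\[
  \frac{1}{C}\,\operatorname{mc}(A) \;\le\; \frac{1}{\operatorname{disc}(A)} \;\le\; C\,\operatorname{mc}(A)
\]

% Rigidity with respect to rank, and the analogous notion for margin
% complexity, where B ranges over sign matrices of the same dimensions.
\[
  R_{\operatorname{rank}}(A, r) \;=\; \min_{B:\ \operatorname{rank}(B) \le r} \bigl\lvert \{(i,j) : a_{ij} \ne b_{ij}\} \bigr\rvert,
  \qquad
  R_{\operatorname{mc}}(A, \ell) \;=\; \min_{B:\ \operatorname{mc}(B) \le \ell} \bigl\lvert \{(i,j) : a_{ij} \ne b_{ij}\} \bigr\rvert
\]
```

In this notation, a large margin corresponds to small mc(A), so the equivalence says that a class is learnable with large margin exactly when its sign matrix has large discrepancy, up to the constant C.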
