Dual-to-kernel learning with ideals

In this paper, we propose a theory that unifies kernel learning and symbolic algebraic methods. We show that the two worlds are inherently dual to each other, and we use this duality to combine the structure-awareness of algebraic methods with the efficiency and generality of kernels. The main idea is to relate polynomial rings to feature spaces and ideals to manifolds, and then to exploit this generative-discriminative duality on kernel matrices. We illustrate the approach by proposing two algorithms, IPCA and AVICA, for simultaneous manifold and feature learning, and test their accuracy on synthetic and real-world data.
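The core duality can be illustrated with a toy computation. This is not the paper's IPCA or AVICA, only a minimal sketch of the underlying idea: for data lying near a variety, coefficient vectors of polynomials in the vanishing ideal appear as near-null directions of the polynomial feature (Gram) matrix. Here we recover the defining equation of the unit circle from samples; the feature map, sample size, and tolerance are illustrative choices.

```python
import numpy as np

# Sample points on the unit circle, the variety cut out by x^2 + y^2 - 1 = 0.
rng = np.random.default_rng(0)
t = rng.uniform(0.0, 2.0 * np.pi, 50)
X = np.column_stack([np.cos(t), np.sin(t)])

# Degree-2 monomial feature map: [1, x, y, x^2, x*y, y^2].
def phi(p):
    x, y = p
    return np.array([1.0, x, y, x * x, x * y, y * y])

M = np.array([phi(p) for p in X])

# Right singular vectors of M with near-zero singular values are coefficient
# vectors of polynomials that (approximately) vanish on all data points,
# i.e. elements of the (approximate) vanishing ideal.
_, s, Vt = np.linalg.svd(M, full_matrices=False)
c = Vt[-1]            # direction of the smallest singular value
residuals = M @ c     # evaluations of the recovered polynomial on the data

# Up to scale, c is proportional to (-1, 0, 0, 1, 0, 1), i.e. x^2 + y^2 - 1.
print("smallest singular value:", s[-1])
print("max |p(x_i)|:", np.abs(residuals).max())
```

The same computation can be carried out purely on the kernel matrix M @ M.T, which is what makes the kernelized, feature-map-free versions of such algorithms possible.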
