On the Computational Power of Online Gradient Descent

We prove that the evolution of weight vectors in online gradient descent can encode arbitrary polynomial-space computations, even in very simple learning settings. Our results imply that, under weak complexity-theoretic assumptions, it is impossible to reason efficiently about the fine-grained behavior of online gradient descent.
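For concreteness, the "evolution of weight vectors" can be read as the sequence produced by the standard online gradient descent update w_{t+1} = w_t - η ∇ℓ_t(w_t). The sketch below is a minimal illustration of that update only, assuming squared loss on a linear predictor, a fixed step size, and no projection step. The function name `ogd_trajectory` and these choices are illustrative assumptions; this is not the construction used in the proofs.

```python
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], float]  # (feature vector x_t, label y_t)


def ogd_trajectory(
    examples: Sequence[Example], eta: float, w0: Sequence[float]
) -> List[List[float]]:
    """Return the weight vectors w_0, w_1, ..., w_T visited by online
    gradient descent on the squared loss (w . x_t - y_t)^2.

    Assumptions (illustrative only): fixed step size eta, unconstrained
    weights (no projection), one gradient step per example.
    """
    w = list(w0)
    trajectory = [list(w)]
    for x, y in examples:
        # Prediction of the current linear model on x_t.
        pred = sum(wi * xi for wi, xi in zip(w, x))
        # Gradient of (pred - y)^2 with respect to w.
        grad = [2.0 * (pred - y) * xi for xi in x]
        # Online gradient descent step.
        w = [wi - eta * gi for wi, gi in zip(w, grad)]
        trajectory.append(list(w))
    return trajectory


if __name__ == "__main__":
    stream = [((1.0, 0.0), 1.0), ((0.0, 1.0), -1.0)]
    for t, w in enumerate(ogd_trajectory(stream, eta=0.25, w0=(0.0, 0.0))):
        print(t, w)
```

Each weight vector depends on the entire prefix of examples seen so far, and it is this trajectory whose fine-grained behavior the abstract refers to.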
