On the Infeasibility of Training Neural Networks with Small Squared Errors

We demonstrate that the problem of training neural networks with small (average) squared error is computationally intractable. Consider a data set of $M$ points $(X_i, Y_i)$, $i = 1, 2, \ldots, M$, where the $X_i$ are input vectors from $\mathbb{R}^d$ and the $Y_i$ are real outputs ($Y_i \in \mathbb{R}$). For a network $f_0$ in some class $\mathcal{F}$ of neural networks,

$$\left( \frac{1}{M} \sum_{i=1}^{M} \bigl(f_0(X_i) - Y_i\bigr)^2 \right)^{1/2} - \inf_{f \in \mathcal{F}} \left( \frac{1}{M} \sum_{i=1}^{M} \bigl(f(X_i) - Y_i\bigr)^2 \right)^{1/2}$$

is the (average) relative error that occurs when one fits the data set with $f_0$. We prove, for several classes $\mathcal{F}$ of neural networks, that achieving a relative error smaller than some fixed positive threshold (independent of the size of the data set) is NP-hard.
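For concreteness, the relative error above can be evaluated numerically. The sketch below is illustrative only: the finite list `candidates` stands in for the infimum over the (generally infinite) class $\mathcal{F}$, and the names `rmse` and `relative_error` are our own, not the paper's; computing the true infimum over $\mathcal{F}$ is exactly the problem shown to be NP-hard.

```python
import numpy as np

def rmse(f, X, Y):
    """Empirical root-mean-squared error of predictor f on data (X, Y)."""
    preds = np.array([f(x) for x in X])
    return np.sqrt(np.mean((preds - np.asarray(Y)) ** 2))

def relative_error(f0, candidates, X, Y):
    """Relative error of f0: its RMSE minus the best RMSE over `candidates`.

    The paper's infimum ranges over a whole class F of networks;
    `candidates` is a finite stand-in used purely for illustration.
    """
    return rmse(f0, X, Y) - min(rmse(f, X, Y) for f in candidates)
```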