Learning Functions of Few Arbitrary Linear Parameters in High Dimensions

Abstract

Let us assume that f is a continuous function defined on the unit ball of ℝ^d, of the form f(x) = g(Ax), where A is a k × d matrix and g is a function of k variables, with k ≪ d. We are given a budget m ∈ ℕ of point evaluations f(x_i), i = 1, …, m, which we are allowed to query in order to construct a uniform approximation of f. Under certain smoothness and variation assumptions on the function g, and for an arbitrary choice of the matrix A, we present in this paper:

1. a sampling choice of the points {x_i}, drawn at random for each function approximation;
2. algorithms (Algorithm 1 and Algorithm 2) for computing the approximating function, whose complexity is at most polynomial in the dimension d and in the number m of points.

Due to the arbitrariness of A, the sampling points are chosen according to suitable random distributions, and our results hold with overwhelming probability. Our approach uses tools from the compressed sensing framework, recent Chernoff bounds for sums of positive semidefinite matrices, and classical stability bounds for invariant subspaces of singular value decompositions.
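To make the setting concrete, the following is a minimal numerical sketch of the core geometric idea: since ∇f(x) = Aᵀ∇g(Ax), every gradient of f lies in the k-dimensional row space of A, so an SVD of a matrix of estimated gradients recovers that subspace. This is only an illustration under assumed choices (a toy g, plain forward differences, and a full SVD in place of the paper's compressed-sensing recovery step); it is not Algorithm 1 or Algorithm 2 from the paper.

```python
import numpy as np

# Hedged sketch (NOT the paper's Algorithm 1/2): recover the active
# subspace span(A^T) of f(x) = g(Ax) by estimating gradients of f at
# random points via finite differences, then taking an SVD of the
# stacked gradient matrix. All choices below (g, step size, budget)
# are illustrative assumptions.

rng = np.random.default_rng(0)

d, k = 50, 2                                    # ambient dimension, number of linear parameters
A = rng.standard_normal((k, d))
A /= np.linalg.norm(A, axis=1, keepdims=True)   # unit-norm rows (illustrative normalization)

def g(y):
    # Smooth toy function of k = 2 variables.
    return np.sin(y[0]) + y[1] ** 2

def f(x):
    # The high-dimensional function; we may only query point values.
    return g(A @ x)

m, eps = 200, 1e-5                              # query budget and finite-difference step
grads = np.empty((m, d))
for i in range(m):
    x = rng.standard_normal(d)
    x /= np.linalg.norm(x)                      # sample on the unit sphere
    base = f(x)
    for j in range(d):
        # Forward difference: (f(x + eps*e_j) - f(x)) / eps ~ df/dx_j.
        e = np.zeros(d)
        e[j] = eps
        grads[i, j] = (f(x + e) - base) / eps

# Gradients of f lie in the row space of A, so the top-k right singular
# vectors of the gradient matrix approximate span(A^T).
_, _, Vt = np.linalg.svd(grads, full_matrices=False)
A_hat = Vt[:k]

# Compare recovered and true subspaces via their orthogonal projectors.
P_true = A.T @ np.linalg.pinv(A.T)
P_hat = A_hat.T @ A_hat                         # rows of Vt are orthonormal
print("subspace error:", np.linalg.norm(P_true - P_hat, 2))
```

Note the cost: the dense finite-difference loop above spends m(d + 1) point queries. The paper's algorithms instead sample directional differences f(x + εφ) - f(x) along a small number of random directions φ and exploit compressibility of the gradients, which is where the compressed sensing machinery and the matrix Chernoff bounds enter.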
