Learning the Differential Correlation Matrix of a Smooth Function From Point Samples

Consider an open set $\mathbb{D}\subseteq\mathbb{R}^n$, equipped with a probability measure $\mu$. An important characteristic of a smooth function $f:\mathbb{D}\rightarrow\mathbb{R}$ is its \emph{differential correlation matrix} $\Sigma_{\mu}:=\int \nabla f(x) (\nabla f(x))^* \mu(dx) \in\mathbb{R}^{n\times n}$, where $\nabla f(x)\in\mathbb{R}^n$ is the gradient of $f(\cdot)$ at $x\in\mathbb{D}$. For instance, the span of the leading $r$ eigenvectors of $\Sigma_{\mu}$ forms an \emph{active subspace} of $f(\cdot)$, thereby extending the concept of \emph{principal component analysis} to the problem of \emph{ridge approximation}. In this work, we propose a simple algorithm for estimating $\Sigma_{\mu}$ from point values of $f(\cdot)$ \emph{without} imposing any structural assumptions on $f(\cdot)$. Theoretical guarantees for this algorithm are provided with the aid of the same technical tools that have proved valuable in the context of covariance matrix estimation from partial measurements.
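To make the definition concrete, the following is a minimal sketch of how $\Sigma_{\mu}$ and its active subspace can be estimated numerically. The abstract does not specify the paper's algorithm, so this sketch makes an illustrative assumption: gradients are approximated by central finite differences from point values of $f(\cdot)$, and $\Sigma_{\mu}$ is formed as a Monte Carlo average of the outer products $\nabla f(x)(\nabla f(x))^*$. The function names (`estimate_sigma`) and parameters are hypothetical, not taken from the paper.

```python
import numpy as np

def estimate_sigma(f, sample_x, num_samples, h=1e-5):
    """Monte Carlo estimate of Sigma_mu = E_mu[grad f(x) grad f(x)^T].

    Gradients are approximated by central finite differences from point
    values of f -- an illustrative choice, not the paper's method.
    """
    x0 = sample_x()
    n = x0.shape[0]
    sigma = np.zeros((n, n))
    for _ in range(num_samples):
        x = sample_x()                      # draw x ~ mu
        g = np.empty(n)
        for i in range(n):                  # central difference per coordinate
            e = np.zeros(n)
            e[i] = h
            g[i] = (f(x + e) - f(x - e)) / (2 * h)
        sigma += np.outer(g, g)             # accumulate rank-one terms
    return sigma / num_samples

# Example: f(x) = sin(a^T x) is a ridge function, so its active
# subspace is span{a} and Sigma_mu has rank one.
rng = np.random.default_rng(0)
n = 10
a = rng.standard_normal(n)
a /= np.linalg.norm(a)
f = lambda x: np.sin(a @ x)
sigma_hat = estimate_sigma(f, lambda: rng.standard_normal(n),
                           num_samples=2000)

# The span of the leading r eigenvectors approximates the active subspace.
eigvals, eigvecs = np.linalg.eigh(sigma_hat)  # ascending eigenvalues
u = eigvecs[:, -1]                            # leading eigenvector (r = 1)
print(abs(u @ a))                             # close to 1: span{a} recovered
```

For a general $f(\cdot)$ one would choose $r$ by inspecting the decay of the eigenvalues of the estimate, keeping the eigenvectors above the spectral gap as the active subspace.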
