Universal ε-approximators for integrals

Let X be a space and F a family of {0,1}-valued functions on X. Vapnik and Chervonenkis showed that if F is "simple" (finite VC dimension), then for every probability measure μ on X and every ε > 0 there is a finite set S such that for all f ∈ F, (1/|S|) Σ_{x∈S} f(x) = ∫ f(x) dμ(x) ± ε. Think of S as a "universal ε-approximator" for integration over F. Such an S can in fact be obtained with high probability simply by sampling a few points from μ. This result is a mainstay of computational learning theory; it was later extended by other authors to families of bounded (e.g., [0,1]-valued) real functions. In this work we establish similar universal ε-approximators for families of unbounded nonnegative real functions, in particular for the families over which one optimizes when performing data classification. (In this case the ε-approximation must be multiplicative.) Specifically, let F be the family of "k-median functions" (or k-means, etc.) on R^d with an arbitrary norm ϱ: any set of centers u_1, ..., u_k ∈ R^d determines an f by f(x) = (min_i ϱ(x - u_i))^α, where α ≥ 0. Then for every measure μ on R^d there exist a set S of cardinality poly(k, d, 1/ε) and a measure ν supported on S such that for every f ∈ F, Σ_{x∈S} f(x) ν(x) ∈ (1 ± ε) · ∫ f(x) dμ(x).
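To make the additive Vapnik-Chervonenkis guarantee concrete, here is a minimal numerical sketch, not a construction from the paper: for the finite-VC family of threshold functions f_t(x) = 1[x ≤ t] on R, a modest uniform sample S from μ approximates every integral in the family simultaneously. The choice of a Gaussian μ, the sample sizes, and the threshold grid are illustrative assumptions.

```python
# Minimal sketch of the VC sampling phenomenon (illustrative, not the
# paper's construction): the family F = { f_t(x) = 1[x <= t] : t in R }
# has VC dimension 1, so a modest uniform sample S from mu should give
#   sup_t | (1/|S|) sum_{x in S} f_t(x) - integral f_t dmu | <= eps.
import numpy as np

rng = np.random.default_rng(0)
mu_sample = rng.normal(size=500_000)  # large Monte Carlo stand-in for mu (standard Gaussian, an assumption)
S = rng.normal(size=4_000)            # candidate universal eps-approximator: a few samples from mu

thresholds = np.linspace(-3.0, 3.0, 200)                        # a grid over the family {f_t}
emp = np.array([(S <= t).mean() for t in thresholds])           # sample averages over S
true = np.array([(mu_sample <= t).mean() for t in thresholds])  # ~ integral of f_t against mu
print("max additive error over the family:", np.abs(emp - true).max())  # roughly O(1/sqrt(|S|))
```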

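For the multiplicative k-median guarantee, the sketch below builds a weighted set (S, ν) by importance sampling with a score inspired by sensitivity-based sampling. The simple score used here (normalized distance to a crude set of centers plus a uniform term), the synthetic data, and all parameter choices are illustrative assumptions, not the paper's exact construction.

```python
# Hedged sketch of a multiplicative (1 +/- eps)-approximator for k-median
# cost via importance sampling; the weighting scheme below is a simplified
# illustration in the spirit of sensitivity sampling, not the paper's method.
import numpy as np

rng = np.random.default_rng(1)
X = np.concatenate([rng.normal(loc=c, scale=0.5, size=(3000, 2))
                    for c in [(0, 0), (5, 0), (0, 5)]])  # points standing in for mu
n, k, alpha = len(X), 3, 1.0

def cost(points, centers, weights=None):
    # k-median-style cost: sum_x w(x) * (min_i ||x - u_i||)^alpha
    d = np.min(np.linalg.norm(points[:, None] - centers[None], axis=2), axis=1) ** alpha
    return d @ (np.ones(len(points)) if weights is None else weights)

# Crude reference centers: k points chosen uniformly from the data.
C0 = X[rng.choice(n, k, replace=False)]
d0 = np.min(np.linalg.norm(X[:, None] - C0[None], axis=2), axis=1)
score = d0 / d0.sum() + 1.0 / n   # simple importance score (an assumption, not a proven sensitivity bound)
p = score / score.sum()

m = 400                                 # |S|; the paper proves poly(k, d, 1/eps) suffices
idx = rng.choice(n, m, p=p)
S, nu = X[idx], 1.0 / (m * p[idx])      # weighted measure nu on S (unbiased in expectation)

for _ in range(3):                      # test a few f in F, i.e., random center sets
    U = rng.uniform(-1.0, 6.0, size=(k, 2))
    print(cost(S, U, nu) / cost(X, U))  # should fall within 1 +/- eps of 1
```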