Data Mining with Sparse Grids

With sparse grids, only $O(h_n^{-1} n^{d-1})$ instead of $O(h_n^{-d})$ grid points and unknowns are involved. Here $d$ denotes the dimension of the feature space and $h_n = 2^{-n}$ is the mesh size. More precisely, we propose using the sparse grid combination technique [42], in which the classification problem is discretized and solved on a certain sequence of conventional grids with uniform mesh sizes in each coordinate direction. The sparse grid solution is then obtained by linearly combining the solutions on these different grids. In contrast to other sparse grid techniques, the combination method is simpler to use and can be parallelized in a natural and straightforward way. We describe the sparse grid combination technique for the classification problem in terms of the regularization network approach. We then give implementation details and discuss the complexity of the algorithm. The method scales only linearly with the number of instances, i.e. the amount of data to be classified. Finally, we report on the quality of the classifier built by our new method. Here we consider standard test problems from the UCI repository and problems with huge synthetic data sets in up to 9 dimensions. Our new method achieves correctness rates that are competitive with those of the best existing methods.
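To make the "sequence of conventional grids" and their "linear combination" concrete, the following is a minimal sketch of the index sets and coefficients in the commonly used form of the combination technique, $f_n^c = \sum_{q=0}^{d-1} (-1)^q \binom{d-1}{q} \sum_{|l|_1 = n+(d-1)-q} f_l$. It assumes that every per-direction level satisfies $l_i \ge 1$, that a grid of level $l_i$ has $2^{l_i}+1$ points per direction (boundary included), and that this indexing matches the one in [42]; the function names `combination_grids`, `grid_points` and `unknown_counts` are hypothetical and chosen for illustration only. The sketch does not solve any classification problem; it only lists the grids and counts their unknowns.

```python
from itertools import product
from math import comb, prod


def combination_grids(n, d):
    """Yield (levels, coefficient) pairs for the combination technique.

    Assumption: each level l_i >= 1 and the mesh size in direction i is 2**-l_i.
    """
    for q in range(d):
        coeff = (-1) ** q * comb(d - 1, q)
        target = n + (d - 1) - q  # required sum of the level vector
        for levels in product(range(1, target + 1), repeat=d):
            if sum(levels) == target:
                yield levels, coeff


def grid_points(levels):
    """Number of points (boundary included) on one anisotropic full grid."""
    return prod(2 ** l + 1 for l in levels)


def unknown_counts(n, d):
    """Return (unknowns summed over all component grids, full-grid unknowns)."""
    total = sum(grid_points(levels) for levels, _ in combination_grids(n, d))
    full = (2 ** n + 1) ** d
    return total, full


if __name__ == "__main__":
    for levels, coeff in combination_grids(3, 2):
        print(levels, coeff)
    print(unknown_counts(5, 3))
```

The coefficients sum to one, so a function that is represented exactly on every component grid is reproduced by the combination. The first number returned by `unknown_counts` adds up the sizes of the independent problems and therefore counts shared grid points more than once; it is an upper bound on the work per combination step, not the number of distinct sparse grid points. Because each component problem is independent of the others, the loop over `combination_grids` is also where the natural parallelism mentioned above would enter.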

[1]  G. Faber Über stetige Funktionen , 1908 .

[2]  N. Aronszajn Theory of Reproducing Kernels , 1950 .

[3]  M. Stone Cross‐Validatory Choice and Assessment of Statistical Predictions , 1976 .

[4]  A. N. Tikhonov,et al.  Solutions of ill-posed problems , 1977 .

[5]  F. Utreras Cross-validation techniques for smoothing spline functions in one or two dimensions , 1979 .

[6]  G. Golub,et al.  Good Ridge Parameter , 1979 .

[7]  G. Baszenski n-th Order Polynomial Spline Blending , 1985 .

[8]  G. Wahba A Comparison of GCV and GML for Choosing the Smoothing Parameter in the Generalized Spline Smoothing Problem , 1985 .

[9]  H. Yserentant On the multi-level splitting of finite element spaces , 1986 .

[10]  Harry Yserentant,et al.  On the multi-level splitting of finite element spaces , 1986 .

[11]  V. N. Temlyakov Approximation of functions with bounded mixed derivative , 1989 .

[12]  Christian Lebiere,et al.  The Cascade-Correlation Learning Architecture , 1989, NIPS.

[13]  Grace Wahba,et al.  Spline Models for Observational Data , 1990 .

[14]  Harry Yserentant,et al.  Hierarchical bases , 1992 .

[15]  O. Mangasarian,et al.  Robust linear programming discrimination of two linearly inseparable sets , 1992 .

[16]  Ulrich Rüde,et al.  The Combination Technique for Parallel Sparse-Grid-Preconditioning or -Solution of PDEs on Workstation Networks , 1992, Conference on Algorithms and Hardware for Parallel Processing.

[17]  Michael Griebel,et al.  A combination technique for the solution of sparse grid problems , 1990, Forschungsberichte, TU Munich.

[18]  Josef Hoschek,et al.  Grundlagen der geometrischen Datenverarbeitung (2. Aufl.) , 1992 .

[19]  Hans-Joachim Bungartz,et al.  An adaptive Poisson solver using hierarchical bases and sparse grids , 1991, Forschungsberichte, TU Munich.

[20]  Hans-Joachim Bungartz,et al.  Dünne Gitter und deren Anwendung bei der adaptiven Lösung der dreidimensionalen Poisson-Gleichung , 1992 .

[21]  Michael Griebel,et al.  The Combination Technique for the Sparse Grid Solution of PDE's on Multiprocessor Machines , 1992, Parallel Process. Lett..

[22]  F. Girosi,et al.  From regularization to radial, tensor and additive splines , 1993, Proceedings of 1993 International Conference on Neural Networks (IJCNN-93-Nagoya, Japan).

[23]  T. Störtkuhl,et al.  On the Parallel Solution of 3D PDEs on a Network of Workstations and on Vector Computers , 1993 .

[24]  T. Störtkuhl,et al.  On the Parallel Solution of 3D PDEs on a Network of Workstations and on Vector Computers , 1993, Parallel Computer Architectures.

[25]  Brian D. Ripley,et al.  Neural Networks and Related Methods for Classification , 1994 .

[26]  U. Rüde,et al.  Extrapolation, combination, and sparse grid techniques for elliptic boundary value problems , 1992, Forschungsberichte, TU Munich.

[27]  Michael Griebel,et al.  Multilevelmethoden als Iterationsverfahren über Erzeugendensystemen , 1994 .

[28]  David J. Spiegelhalter,et al.  Machine Learning, Neural and Statistical Classification , 2009 .

[29]  Michael Griebel,et al.  Tensor product type subspace splittings and multilevel iterative methods for anisotropic problems , 1995, Adv. Comput. Math..

[30]  E. Arge,et al.  Approximation of scattered data using smooth grid functions , 1995 .

[31]  Michael Griebel,et al.  The efficient solution of fluid dynamics problems by the combination technique , 1995, Forschungsberichte, TU Munich.

[32]  Tomaso A. Poggio,et al.  Regularization Theory and Neural Networks Architectures , 1995, Neural Computation.

[33]  Karin Frank,et al.  Information Complexity of Multivariate Fredholm Integral Equations in Sobolev Classes , 1996, J. Complex..

[34]  H. Bungartz,et al.  Sparse Grids: Recent Developments for Elliptic Partial Differential Equations , 1998 .

[35]  Thomas G. Dietterich Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms , 1998, Neural Computation.

[36]  Paul S. Bradley,et al.  Feature Selection via Concave Minimization and Support Vector Machines , 1998, ICML.

[37]  Bernhard Schölkopf,et al.  The connection between regularization operators and support vector kernels , 1998, Neural Networks.

[38]  Sameer Singh,et al.  2D spiral pattern recognition with possibilistic measures , 1998, Pattern Recognit. Lett..

[39]  Federico Girosi,et al.  An Equivalence Between Sparse Approximation and Support Vector Machines , 1998, Neural Computation.

[40]  Witold Pedrycz,et al.  Data Mining Methods for Knowledge Discovery , 1998, IEEE Trans. Neural Networks.

[41]  Catherine Blake,et al.  UCI Repository of machine learning databases , 1998 .

[42]  William D. Penny,et al.  Bayesian neural networks for classification: how useful is the evidence framework? , 1999, Neural Networks.

[43]  Aihui Zhou,et al.  Error analysis of the combination technique , 1999, Numerische Mathematik.

[44]  Hans-Joachim Bungartz,et al.  A Note on the Complexity of Solving Poisson's Equation for Spaces of Bounded Mixed Derivatives , 1999, J. Complex..

[45]  Tomaso Poggio,et al.  A Unified Framework for Regularization Networks and Support Vector Machines , 1999 .

[46]  W. Sickel,et al.  Interpolation on Sparse Grids and Tensor Products of Nikol'skij–Besov Spaces , 1999 .

[47]  Michael Griebel,et al.  Sparse grids for boundary integral equations , 1999, Numerische Mathematik.

[48]  Linda Kaufman,et al.  Solving the quadratic programming problem arising in support vector classification , 1999 .

[49]  David R. Musicant,et al.  Active Support Vector Machine Classification , 2000, NIPS.

[50]  Maxim A. Olshanskii,et al.  On the Convergence of a Multigrid Method for Linear Reaction-Diffusion Problems , 2000, Computing.

[51]  Glenn Fung,et al.  Data selection for support vector machine classifiers , 2000, KDD '00.

[52]  M. Griebel,et al.  On the computation of the eigenproblems of hydrogen helium in strong magnetic and electric fields with the sparse grid combination technique , 2000 .

[53]  Vladimir N. Vapnik,et al.  The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.

[54]  K. Stüben,et al.  Algebraic Multigrid (AMG): An Introduction With Applications , 2000 .

[55]  Tomaso A. Poggio,et al.  Regularization Networks and Support Vector Machines , 2000, Adv. Comput. Math..

[56]  M. Griebel,et al.  Optimized Tensor-Product Approximation Spaces , 2000 .

[57]  Eric R. Ziegel,et al.  Mastering Data Mining , 2001, Technometrics.

[58]  David R. Musicant,et al.  Lagrangian Support Vector Machines , 2001, J. Mach. Learn. Res..

[59]  Michael Griebel,et al.  Data mining with sparse grids using simplicial basis functions , 2001, KDD '01.

[60]  Yuh-Jye Lee,et al.  SSVM: A Smooth Support Vector Machine for Classification , 2001, Comput. Optim. Appl..

[61]  Michael Griebel,et al.  On the Parallelization of the Sparse Grid Approach for Data Mining , 2001, LSSC.

[62]  Robert A. Lordo,et al.  Learning from Data: Concepts, Theory, and Methods , 2001, Technometrics.