Prototypes Within Minimum Enclosing Balls

We revisit the kernel minimum enclosing ball problem and show that it can be solved using simple recurrent neural networks. Once the problem is solved, the interior of the ball can be characterized in terms of a function of a set of support vectors, and the local minima of this function can be interpreted as prototypes of the data at hand. For Gaussian kernels, these minima are naturally found via a mean shift procedure and thus via another recurrent neurocomputing process. Practical results demonstrate that prototypes found this way are descriptive, meaningful, and interpretable.
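
The two recurrent processes described above lend themselves to a compact implementation. The following minimal NumPy sketch first runs Frank-Wolfe iterations on the kernel MEB dual (a simple recurrent update of a weight vector over the probability simplex, using the standard diminishing step size 2/(t+2)) and then applies a weighted Gaussian mean shift whose fixed points act as prototypes. The function names, the two-blob toy data, and all parameter values are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    """Gaussian kernel matrix between the rows of X and Y."""
    d2 = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-d2 / (2.0 * sigma**2))

def meb_frank_wolfe(K, n_iter=500):
    """Frank-Wolfe iterations for the kernel MEB dual:
    maximize  mu^T diag(K) - mu^T K mu  over the probability simplex.
    Each iteration is a simple recurrent update of the weight vector mu."""
    n = K.shape[0]
    mu = np.full(n, 1.0 / n)
    for t in range(n_iter):
        grad = np.diag(K) - 2.0 * K @ mu   # gradient of the dual objective
        j = np.argmax(grad)                # simplex vertex maximizing the linearization
        eta = 2.0 / (t + 2.0)              # standard diminishing step size
        mu *= (1.0 - eta)
        mu[j] += eta
    return mu

def mean_shift_prototypes(X, mu, sigma=1.0, n_iter=100):
    """Weighted Gaussian mean shift started from every data point.
    Fixed points maximize sum_i mu_i k(z, x_i), i.e. they minimize the
    distance-to-center function whose local minima serve as prototypes."""
    Z = X.copy()
    for _ in range(n_iter):
        W = mu[None, :] * gaussian_kernel(Z, X, sigma)
        Z = (W @ X) / W.sum(axis=1, keepdims=True)
    return Z

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(-2, 0.5, (50, 2)),
                   rng.normal(+2, 0.5, (50, 2))])
    K = gaussian_kernel(X, X, sigma=1.0)
    mu = meb_frank_wolfe(K)
    protos = mean_shift_prototypes(X, mu, sigma=1.0)
    # near-duplicate fixed points collapse onto a few prototype locations
    print(np.unique(np.round(protos, 2), axis=0))
```

In this toy example, points drawn from two Gaussian blobs collapse onto one prototype per mode; support vectors can be read off as those data points whose weight mu_i remains clearly above zero after the Frank-Wolfe iterations.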
