Understanding Support Vector Machines with Polynomial Kernels

Interpreting the models learned by a support vector machine (SVM) is often difficult, if not impossible, because the learned decision function lives in a high-dimensional feature space. In this paper, we investigate SVMs with polynomial kernels. We show that the models learned by these machines are built from terms related to the statistical moments of the support vectors. This deepens our understanding of the internal workings of these models and, for example, lets us gauge the importance of combinations of features. We also discuss how the SVM with a quadratic kernel relates to the likelihood-ratio test for normally distributed populations.
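
To make the moment-based view concrete, here is a minimal sketch assuming an inhomogeneous quadratic kernel k(x, x_i) = (x^\top x_i + c)^2; the paper's exact notation and choice of offset may differ. Expanding the standard SVM decision function term by term gives

f(x) = \sum_i \alpha_i y_i \, (x^\top x_i + c)^2 + b
     = x^\top \Big( \sum_i \alpha_i y_i \, x_i x_i^\top \Big) x
       + 2c \Big( \sum_i \alpha_i y_i \, x_i \Big)^{\top} x
       + c^2 \sum_i \alpha_i y_i + b.

Because y_i = \pm 1, the matrix \sum_i \alpha_i y_i \, x_i x_i^\top is a weighted difference of the (uncentered) second moments of the support vectors from the two classes, and \sum_i \alpha_i y_i \, x_i is the corresponding difference of first moments. The (j, k) entry of the quadratic term then gauges the importance of the feature pair (j, k), and the resulting quadratic discriminant has the same functional form as the likelihood-ratio test between two Gaussian populations.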
