Binary Classification in Unstructured Space With Hypergraph Case-Based Reasoning

Binary classification is one of the most common problems in machine learning: it consists of predicting whether a given element belongs to a particular class. In this paper, a new algorithm for binary classification based on a hypergraph representation is proposed. The method is agnostic to the data representation, can work with multiple data sources or in non-metric spaces, and accommodates missing values. As a result, it drastically reduces the need for data preprocessing and feature engineering. Each element to be classified is partitioned according to its interactions with the training set. For each class, a seminorm over the training-set partition is learnt to represent the distribution of evidence supporting that class. Empirical validation demonstrates the method's high potential on a wide range of well-known datasets, and the results are compared to the state of the art. The time complexity is given and empirically validated. The robustness with regard to hyperparameters is studied and compared to that of standard classification methods. Finally, the limitations of the model space are discussed, and some potential solutions are proposed.
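To make the two-step idea in the abstract concrete (partition a new element by its intersections with the training cases, then accumulate per-class evidence over that partition), here is a minimal sketch. The class name, its methods, and the simple frequency-based weights are illustrative assumptions and only a crude stand-in for the learned seminorm; this is not the paper's reference implementation.

```python
# Toy illustration of partition-and-evidence scoring for binary classification.
# Cases are represented as sets of features (hyperedges); weights are assumed
# to be plain per-class frequencies, a simplification of the learned seminorm.
from collections import defaultdict


class HypergraphClassifier:
    def fit(self, cases, labels):
        # For every feature, count how much evidence each class (0 or 1) provides.
        self.evidence = defaultdict(lambda: [0.0, 0.0])
        for case, y in zip(cases, labels):
            for f in case:
                self.evidence[f][y] += 1.0
        return self

    def predict(self, case):
        # Partition the new case by its intersections with known features;
        # features never seen in training simply contribute no evidence.
        score = [0.0, 0.0]
        for f in case:
            if f in self.evidence:
                neg, pos = self.evidence[f]
                total = neg + pos
                score[0] += neg / total
                score[1] += pos / total
        return int(score[1] >= score[0])


# Usage with hypothetical feature sets:
clf = HypergraphClassifier().fit(
    cases=[{"fever", "cough"}, {"cough"}, {"fatigue"}],
    labels=[1, 1, 0],
)
print(clf.predict({"cough", "headache"}))  # unseen "headache" is ignored
```

In this simplified reading, a feature absent from the training set contributes no evidence rather than requiring imputation, which is one way a representation-agnostic, partition-based method can tolerate missing values.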
