Nonlinear classification, linear clustering, evolutionary semi-supervised three-way decisions: A comparison

This paper compares semantically meaningful machine learning algorithms with black-box models. The models are applied to a real-world wearable dataset for biometric identification of individuals. The semantically meaningful decision tree is compared with more accurate black-box models such as neural networks, random forests, and support vector machines. The paper further explores unsupervised learning that uses linear distances to separate the categories. Because the distance from the center delineates the clusters, the centroids of the unsupervised clusters provide a semantic profile of the categories. Crisp K-means clustering is enhanced with an evolutionary algorithm that uses the distance from the center as its primary criterion but nudges the clustering toward the known classification through a semi-supervised penalty. Finally, rough sets are shown to provide notable semantic information with the help of the three-way decision principle.
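The two ideas in the abstract that are easiest to illustrate are the semi-supervised penalty and the three-way assignment. The following is a minimal sketch, not the paper's implementation: the abstract gives neither the objective nor the thresholds, so the function names, the additive mismatch penalty, `penalty_weight`, and the distance-ratio `threshold` are all assumptions chosen for illustration. An evolutionary search would minimize `fitness` over candidate centroids; `three_way_assign` then places each point either in one cluster's lower approximation or in the boundary region between its two nearest clusters.

```python
import math


def nearest_two(point, centroids):
    """Return ((d1, k1), (d2, k2)) for the two closest centroids."""
    ranked = sorted((math.dist(point, c), i) for i, c in enumerate(centroids))
    return ranked[0], ranked[1]


def fitness(points, labels, centroids, centroid_labels, penalty_weight=1.0):
    """Hypothetical objective (lower is better).

    Sums the distance of each point to its nearest centroid, then adds
    penalty_weight for every labeled point whose nearest centroid carries
    a different class label. labels[i] is None for unlabeled points.
    """
    total_distance = 0.0
    mismatches = 0
    for point, label in zip(points, labels):
        (d1, k1), _ = nearest_two(point, centroids)
        total_distance += d1
        if label is not None and centroid_labels[k1] != label:
            mismatches += 1
    return total_distance + penalty_weight * mismatches


def three_way_assign(point, centroids, threshold=0.7):
    """Assumed three-way rule based on the ratio of the two nearest distances.

    Returns (lower, upper): lower is the cluster index when the point is
    clearly closest to one centroid, otherwise None; upper is the set of
    clusters the point may belong to (the boundary region when lower is None).
    """
    (d1, k1), (d2, k2) = nearest_two(point, centroids)
    if d1 < threshold * d2:
        return k1, {k1}
    return None, {k1, k2}


centroids = [(0.0, 0.0), (10.0, 0.0)]
centroid_labels = ["a", "b"]
points = [(1.0, 0.0), (9.0, 0.0), (4.0, 0.0)]
labels = ["a", "b", "b"]  # the third point is labeled "b" but sits nearer "a"

score = fitness(points, labels, centroids, centroid_labels)
clear = three_way_assign((1.0, 0.0), centroids)
ambiguous = three_way_assign((5.0, 0.0), centroids)
```

Under these assumptions, a point that is nearly equidistant from two centroids is deferred to the boundary region rather than forced into one cluster, which is the non-commitment option that distinguishes a three-way decision from a crisp assignment.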
