A Transfer k-NN Classifier with the Bagging Method

The k-nearest neighbors (k-NN) algorithm is a simple but effective method in pattern recognition. The classic k-NN classifier assumes that the training and testing datasets are drawn from the same distribution. Transfer learning relaxes this assumption and transfers knowledge from a related but different source domain to the target domain. In this paper, a transfer k-NN classifier suited to this setting is proposed. First, the distribution discrepancy between the source domain and the target domain is reduced by reweighting the samples of the source domain. Then, a bagging method built on the k-NN algorithm is designed to reduce the undue influence of the source domain, whose sample size is much larger than that of the target domain. Experimental results on a real-world dataset demonstrate that the proposed algorithm is more effective than the classical k-NN and SVM algorithms.
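The two-stage scheme described above can be sketched in code. The snippet below is an illustrative approximation, not the authors' exact method: source-sample weights are estimated here with a simple Gaussian-kernel density ratio (target density over source density), and the bagging stage draws bootstrap subsamples of the source domain in proportion to those weights before taking a majority vote over k-NN predictions. The function names, the bandwidth, and the subsampling scheme are all assumptions for illustration.

```python
import numpy as np

def gaussian_kde_at(points, data, bandwidth):
    """Mean Gaussian kernel value of each point in `points` w.r.t. `data`."""
    d2 = ((points[:, None, :] - data[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * bandwidth ** 2)).mean(axis=1)

def importance_weights(X_src, X_tgt, bandwidth=1.0, eps=1e-12):
    """Weight each source sample by an estimated target/source density ratio."""
    p_tgt = gaussian_kde_at(X_src, X_tgt, bandwidth)
    p_src = gaussian_kde_at(X_src, X_src, bandwidth)
    w = p_tgt / (p_src + eps)
    return w / w.sum()          # normalize so weights form a distribution

def knn_predict(X_train, y_train, X_test, k=3):
    """Plain k-NN with Euclidean distance and majority vote."""
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
    nearest = np.argsort(d2, axis=1)[:, :k]
    return np.array([np.bincount(y_train[row]).argmax() for row in nearest])

def bagged_transfer_knn(X_src, y_src, X_tgt, k=3, n_estimators=15,
                        subsample=0.5, bandwidth=1.0, seed=0):
    """Bagged k-NN over importance-weighted bootstrap samples of the source."""
    rng = np.random.default_rng(seed)
    w = importance_weights(X_src, X_tgt, bandwidth)
    m = max(k, int(subsample * len(X_src)))   # size of each bootstrap sample
    votes = []
    for _ in range(n_estimators):
        idx = rng.choice(len(X_src), size=m, replace=True, p=w)
        votes.append(knn_predict(X_src[idx], y_src[idx], X_tgt, k))
    votes = np.stack(votes)                   # (n_estimators, n_target)
    return np.array([np.bincount(col).argmax() for col in votes.T])
```

Drawing the bootstrap samples in proportion to the importance weights plays the role of the reweighting step, while averaging over many small subsamples keeps any single large batch of source points from dominating the target-domain predictions.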
