Nature-inspired framework of ensemble learning for collaborative classification in granular computing context

Due to the vast and rapid growth in the size of data, machine learning has become an increasingly popular approach to data classification, which can be done by training a single classifier or a group of classifiers. A single classifier is typically learned using a standard algorithm, such as C4.5. Because each standard learning algorithm has its own advantages and disadvantages, ensemble learning, such as Bagging, has been increasingly used to learn a group of classifiers for collaborative classification, thus compensating for the disadvantages of individual classifiers. In particular, a group of base classifiers is learned in the training stage, and then some or all of the base classifiers are employed to classify unseen instances in the testing stage. In this paper, we address two critical points that can affect classification accuracy, in order to overcome the limitations of the Bagging approach. Firstly, it is important to judge effectively which base classifiers qualify to be employed for classifying test instances. Secondly, the final classification is made by combining the outputs of the base classifiers, i.e. by voting, so the voting strategy can greatly affect whether a test instance is classified correctly. To address these points, we propose a nature-inspired approach to ensemble learning that aims to improve overall accuracy in the setting of granular computing. The proposed approach is validated through experimental studies using real-life data sets. The results show that the proposed approach effectively overcomes the limitations of the Bagging approach.
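
The two points raised in the abstract, selecting which base classifiers take part and choosing how their votes are combined, can be illustrated with a minimal sketch. This is not the method proposed in the paper: the accuracy threshold `min_accuracy`, the use of validation accuracy as a vote weight, and both function names are assumptions introduced purely for illustration.

```python
from collections import Counter, defaultdict


def majority_vote(predictions):
    """Plain Bagging-style vote: every base classifier counts equally."""
    return Counter(predictions).most_common(1)[0][0]


def select_and_weighted_vote(predictions, accuracies, min_accuracy=0.5):
    """Hypothetical alternative: drop weak classifiers, weight the rest.

    predictions  -- label predicted by each base classifier for one instance
    accuracies   -- assumed validation accuracy of each base classifier
    min_accuracy -- assumed qualification threshold (not from the paper)
    """
    scores = defaultdict(float)
    for label, accuracy in zip(predictions, accuracies):
        if accuracy >= min_accuracy:      # selection step
            scores[label] += accuracy     # weighted voting step
    if not scores:                        # nobody qualified: fall back
        return majority_vote(predictions)
    return max(scores, key=scores.get)


# Five base classifiers label one test instance (made-up values).
predictions = ["spam", "spam", "spam", "ham", "ham"]
accuracies = [0.45, 0.40, 0.55, 0.90, 0.85]

print(majority_vote(predictions))
print(select_and_weighted_vote(predictions, accuracies))
```

With these made-up values the equal-weight vote follows the three weaker classifiers, while the selective weighted vote follows the two stronger ones, which shows how both the qualification rule and the voting strategy can change the final label.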
