Ant-TD: Ant colony optimization plus temporal difference reinforcement learning for multi-label feature selection

Abstract In recent years, multi-label learning has become a trending topic in machine learning and data mining. This type of learning deals with data in which each instance is associated with more than one label. Feature selection is a pre-processing method that can significantly improve the performance of multi-label classification. In this paper, we propose a new multi-label feature selection method based on Ant Colony Optimization (ACO). The proposed method differs from existing ACO-based feature selection methods in that it learns its heuristic instead of using a static heuristic function. Because the heuristic function influences the decisions ACO makes during the search, learning it can help the algorithm explore the search space more effectively. We propose a heuristic learning approach for the ACO heuristic function that learns directly from experience using the Temporal Difference (TD) reinforcement learning algorithm. Heuristic learning requires casting the ACO problem as a reinforcement learning problem. We therefore model the feature selection search space as a Markov Decision Process (MDP) in which features represent the states (S), and the selection of an unvisited feature by an ant represents an action from the action set (A). The reward signal (R) that an ant receives when it takes an action is composed of two criteria. The ACO state transition rule, which combines probabilistic and greedy rules, forms the transition function (T) of the MDP. In addition to the pheromone, which is updated by the "global updating rule" of ACO, the state-value function (V) is updated directly by the temporal difference formulation to form a learned heuristic function. We conduct experiments on nine benchmark datasets and compare classification performance on three multi-label evaluation measures against nine state-of-the-art multi-label feature selection methods. The results show that the proposed method significantly outperforms the competing methods.
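As a rough illustration of the two mechanisms named in the abstract, the sketch below pairs a standard TD(0) state-value update with an Ant Colony System style transition rule that is greedy with some probability and probabilistic otherwise. It is not the authors' implementation: the function names, the per-feature pheromone vector `tau`, the parameters `alpha`, `gamma`, `q0`, and `beta`, and the scalar `reward` (the abstract says only that the reward combines two criteria) are all assumptions made for the example, and the values in `V` are assumed to be non-negative.

```python
import random


def td0_update(V, s, s_next, reward, alpha=0.1, gamma=0.9):
    """Standard TD(0) step: V[s] += alpha * (reward + gamma * V[s_next] - V[s]).

    V is a list of state values, one per feature (features are the MDP states).
    `reward` is a hypothetical scalar; the abstract does not specify its form.
    """
    td_error = reward + gamma * V[s_next] - V[s]
    V[s] += alpha * td_error
    return td_error


def choose_next_feature(unvisited, tau, V, q0=0.7, beta=1.0, rng=random):
    """Pick the next feature from `unvisited` (assumed non-empty).

    With probability q0 take the greedy choice; otherwise sample in
    proportion to tau[f] * V[f] ** beta (roulette-wheel selection).
    Here V plays the role of the learned heuristic (an assumption).
    """
    scores = {f: tau[f] * (V[f] ** beta) for f in unvisited}
    if rng.random() < q0:
        return max(scores, key=scores.get)  # greedy (exploitation) branch
    threshold = rng.random() * sum(scores.values())
    cumulative = 0.0
    for f, score in scores.items():
        cumulative += score
        if cumulative >= threshold:
            return f
    return f  # fallback for floating-point round-off


if __name__ == "__main__":
    V = [0.5, 0.5, 0.5]    # initial state values for three features
    tau = [1.0, 1.0, 1.0]  # uniform initial pheromone (assumed per feature)
    nxt = choose_next_feature([1, 2], tau, V)
    td0_update(V, 0, nxt, reward=1.0)
    print(nxt, V)
```

In this sketch, each step an ant takes from feature `s` to feature `s_next` would be followed by one `td0_update` call, so the values that guide later ants reflect the rewards observed by earlier ones.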
