Selective ensemble based on extreme learning machine and improved discrete artificial fish swarm algorithm for haze forecast

Urban haze pollution is becoming increasingly serious and is considered very harmful to humans by the World Health Organization (WHO). Haze forecasts can help protect human health. In this paper, a Selective ENsemble based on an Extreme Learning Machine (ELM) and an Improved Discrete Artificial Fish swarm algorithm (IDAFSEN) is proposed to overcome the instability of a single ELM in classification. First, an initial pool of base ELMs is generated by bootstrap sampling and then pre-pruned using the pair-wise diversity measure of each base ELM. Second, a subset of the base ELMs remaining after pre-pruning, those with higher precision and greater diversity, is selected by an Improved Discrete Artificial Fish Swarm Algorithm (IDAFSA). Finally, the selected base ELMs are integrated through majority voting. Experimental results on 16 datasets from the UCI Machine Learning Repository demonstrate that IDAFSEN can achieve better classification accuracy than other previously reported methods. After this performance evaluation, the paper examines how the proposed approach can be applied to haze forecasting in China to protect human health.
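The three stages described in the abstract (bootstrap pool with diversity-based pre-pruning, subset selection, majority voting) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions: the disagreement measure stands in for the unspecified pair-wise diversity measure, keeping the most diverse models is an assumed pre-pruning rule, and a plain random search over binary masks replaces IDAFSA, whose details the abstract does not give. All function names, the synthetic data, and the parameter values are hypothetical.

```python
import numpy as np

def train_elm(X, y, n_hidden, n_classes, rng):
    """Basic ELM: random hidden layer, output weights by pseudoinverse."""
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)
    T = np.eye(n_classes)[y]            # one-hot targets
    beta = np.linalg.pinv(H) @ T
    return W, b, beta

def predict_elm(model, X):
    W, b, beta = model
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)

def majority_vote(preds, n_classes):
    """preds has shape (n_models, n_samples); ties go to the lower class index."""
    counts = np.stack([(preds == c).sum(axis=0) for c in range(n_classes)])
    return counts.argmax(axis=0)

def disagreement_matrix(preds):
    """Fraction of samples on which each pair of models disagrees (assumed measure)."""
    return (preds[:, None, :] != preds[None, :, :]).mean(axis=2)

def select_subset(preds, y_val, n_classes, rng, n_iter=200):
    """Random search over binary masks: a simple stand-in for IDAFSA, not IDAFSA itself."""
    n = preds.shape[0]
    best_mask = np.ones(n, dtype=bool)
    best_acc = (majority_vote(preds, n_classes) == y_val).mean()
    for _ in range(n_iter):
        mask = rng.random(n) < 0.5
        if not mask.any():
            continue
        acc = (majority_vote(preds[mask], n_classes) == y_val).mean()
        if acc > best_acc:
            best_mask, best_acc = mask, acc
    return best_mask, best_acc

def run_demo(seed=0, pool_size=20, n_keep=12):
    rng = np.random.default_rng(seed)
    # Synthetic two-class data (hypothetical, not one of the paper's datasets).
    X = np.vstack([rng.normal(-1, 1, (150, 4)), rng.normal(1, 1, (150, 4))])
    y = np.repeat([0, 1], 150)
    idx = rng.permutation(len(y))
    train, val = idx[:200], idx[200:]
    # Stage 1a: pool of base ELMs trained on bootstrap samples.
    models = []
    for _ in range(pool_size):
        boot = rng.choice(train, size=len(train), replace=True)
        models.append(train_elm(X[boot], y[boot], 10, 2, rng))
    preds = np.array([predict_elm(m, X[val]) for m in models])
    # Stage 1b: pre-prune, keeping the models with the highest mean disagreement.
    D = disagreement_matrix(preds)
    keep = np.argsort(-D.mean(axis=1))[:n_keep]
    # Stage 2: choose a subset; stage 3 (majority voting) is used as the fitness.
    mask, acc = select_subset(preds[keep], y[val], 2, rng)
    return keep[mask], acc

if __name__ == "__main__":
    selected, acc = run_demo()
    print("selected base ELMs:", selected, "validation accuracy:", acc)
```

Because the subset is chosen on the same validation split that is used to score it, the reported accuracy is optimistic; a separate test split would be needed for a fair estimate.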
