Isolation Forest Based Multi-Source Unsupervised Transfer Learning for Missing GDP Prediction

Rapid growth in industrialization has a proportional effect on a nation's carbon emissions as well as its economic growth. For many nations, however, gross domestic product (GDP) figures are unavailable. This paper therefore addresses the problem of predicting the missing GDP of such nations from their carbon emission data. The data available for these countries are insufficient to train a predictive machine learning model, so we turn to the emerging yet under-explored area of multi-source unsupervised transfer learning to enlarge the training domain, and we detect and remove anomalies from it to build a robust prediction framework. The framework is evaluated empirically on carbon emission and per capita GDP data, collected from the World Bank repository, for a set of developing countries and for a mixed set of developed and developing countries. Five different domains generated by the multi-source unsupervised transfer learning framework are evaluated using three different machine learning models. The best among them is then used to predict the missing per capita GDP of a nation.
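The idea of enlarging a training domain with pooled source data and then removing anomalies can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each row is a hypothetical (CO2 emission, per capita GDP) pair, simplifies Isolation Forest by building every tree on all rows instead of a subsample, and uses an ordinary least-squares line as a stand-in for the paper's machine learning models. All function names and the `contamination` parameter are illustrative.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Recursively split rows on a random feature at a random threshold."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    dim = rng.randrange(len(rows[0]))
    lo = min(r[dim] for r in rows)
    hi = max(r[dim] for r in rows)
    if lo == hi:  # cannot split on a constant feature; stop here
        return {"size": len(rows)}
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[dim] < split]
    right = [r for r in rows if r[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(row, node):
    """Depth at which row reaches a leaf, adjusted for the unsplit leaf size."""
    depth = 0
    while "size" not in node:
        node = node["left"] if row[node["dim"]] < node["split"] else node["right"]
        depth += 1
    return depth + c_factor(node["size"])


def anomaly_scores(rows, n_trees=100, seed=0):
    """Score in (0, 1]; rows isolated after few splits score closer to 1."""
    rng = random.Random(seed)
    max_depth = math.ceil(math.log2(max(len(rows), 2)))
    # Simplification: every tree sees all rows (no subsampling).
    trees = [build_tree(rows, 0, max_depth, rng) for _ in range(n_trees)]
    norm = c_factor(len(rows)) or 1.0
    return [
        2.0 ** (-sum(path_length(r, t) for t in trees) / (n_trees * norm))
        for r in rows
    ]


def remove_anomalies(rows, contamination=0.05, seed=0):
    """Drop the highest-scoring fraction of rows (illustrative cutoff rule)."""
    scores = anomaly_scores(rows, seed=seed)
    n_drop = int(contamination * len(rows))
    ranked = sorted(range(len(rows)), key=lambda i: scores[i], reverse=True)
    dropped = set(ranked[:n_drop])
    return [r for i, r in enumerate(rows) if i not in dropped]


def fit_line(rows):
    """Least-squares fit of GDP on CO2; a stand-in for the paper's models."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    # Hypothetical pooled source-domain rows plus one grossly inconsistent record.
    pooled = [(1 + 0.25 * i, 2 * (1 + 0.25 * i) + 1) for i in range(40)]
    pooled.append((50.0, 1000.0))
    cleaned = remove_anomalies(pooled)
    slope, intercept = fit_line(cleaned)
    print(len(pooled), len(cleaned), slope, intercept)
```

Fitting on the cleaned rows rather than the raw pooled rows keeps a single inconsistent record from dominating the regression, which is the motivation the abstract gives for anomaly removal. In practice a library implementation such as scikit-learn's `IsolationForest` would replace the hand-written trees.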
