Predictive Hybridization Model integrating Modified Genetic Algorithm (MGA) and C4.5

Improving prediction models through hybridization, that is, by combining several machine learning techniques, remains an active research interest in data mining. This study modifies the genetic algorithm (GA) with a new crossover mating structure, the Flip Multi-Sliced Average Crossover (FMSAX) operator, paired with a rank-based selection function, and integrates it with the C4.5 algorithm. To measure the accuracy of C4.5, the dataset was split 70:30 into training and testing sets, respectively. The prediction model that combines the modified GA with C4.5 outperformed both the standalone C4.5 prediction model and the GA using the AX operator with a roulette wheel selection function. It obtained an accuracy of 98.6207%, with MAE, RMSE, Precision, Recall, and F-Measure values of 0.0066, 0.0738, 0.987, 0.986, and 0.986, respectively.
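As a rough illustration of the selection step only, the following sketch contrasts linear rank-based selection with roulette wheel (fitness-proportionate) selection. It is a minimal example under stated assumptions, not the study's implementation: the function names, the linear ranking scheme, and the sample fitness values are all hypothetical, and the FMSAX operator is not shown because the abstract does not specify it.

```python
import random


def rank_selection_probs(fitnesses):
    """Linear rank-based probabilities (assumed scheme): worst gets rank 1, best gets rank n."""
    n = len(fitnesses)
    # Indices of individuals ordered from lowest to highest fitness.
    order = sorted(range(n), key=lambda i: fitnesses[i])
    total_rank = n * (n + 1) / 2
    probs = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        probs[idx] = rank / total_rank
    return probs


def roulette_probs(fitnesses):
    """Roulette wheel probabilities: proportional to raw (positive) fitness."""
    total = sum(fitnesses)
    return [f / total for f in fitnesses]


def select_parents(population, fitnesses, k, rng=random):
    """Draw k parents (with replacement) using rank-based probabilities."""
    return rng.choices(population, weights=rank_selection_probs(fitnesses), k=k)


if __name__ == "__main__":
    # Hypothetical fitness values, e.g. classifier accuracy per chromosome.
    population = ["c0", "c1", "c2"]
    fitnesses = [0.9, 0.1, 0.5]
    print("rank-based:", rank_selection_probs(fitnesses))
    print("roulette:  ", roulette_probs(fitnesses))
    print("parents:   ", select_parents(population, fitnesses, k=4, rng=random.Random(0)))
```

Because rank-based probabilities depend only on the ordering of fitness values, a single very fit individual cannot dominate the mating pool the way it can under roulette wheel selection. That difference is a commonly cited motivation for rank-based selection, though the abstract does not state the authors' own rationale.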
