Optimizing the early glaucoma detection from visual fields by combining preprocessing techniques and ensemble classifier with selection strategies

Abstract Artificial Intelligence is booming, and many research directions are being explored to improve the technical performance of health systems while also making them suitable for targeted medical practices. Their cost must also be justified by real added value for medical practitioners and patients. Extracting accurate information from data sets is usually constrained by the amount of data and its distribution, both of which greatly affect classifier performance. Unbalanced classes and uninformative features give classifiers little to learn from. Medical data such as visual field (VF) data often suffer from these problems, which limit the performance of individual classifiers. Ensemble methods such as the bagging classifier (BC), however, can overcome these limitations and achieve good performance. BC is simple to apply and combines well with dynamic/static selection strategies (BC-DS/SS), which considerably improve its performance. Because this combination remains sensitive to data distribution, it must be paired with pre-processing techniques such as feature selection and data rebalancing to be effective. Combining pre-processing techniques with the BC-DS/SS ensemble classifiers would therefore make it possible to extract more accurate information from VF datasets. We name this classifier, which combines pre-processing techniques and ensemble methods with selection strategies, C2PEMS2 (C2 stands for Classifier Combining, PEM for Pre-processing and Ensemble Methods, and S2 for Selection Strategies). Its aims are: (1) to optimize performance while reducing over-fitting, (2) to save processing time and, more importantly, (3) to predict more effectively the targeted class, which is often the minority class in unbalanced data sets. In our experiments on VF datasets, the approach predicted early glaucoma more effectively than the state of the art.
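The pipeline outlined in the abstract (feature selection, data rebalancing, bagging, then dynamic selection) can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation: the mean-difference feature filter, random oversampling, decision stumps as base learners, the local-accuracy selection rule, and the synthetic data are all assumptions chosen for brevity, and every function name below is hypothetical.

```python
import random
from collections import Counter

def select_features(X, y, k):
    """Filter-style selection: keep the k features whose class means differ most."""
    scores = []
    for f in range(len(X[0])):
        v0 = [x[f] for x, t in zip(X, y) if t == 0]
        v1 = [x[f] for x, t in zip(X, y) if t == 1]
        scores.append((abs(sum(v1) / len(v1) - sum(v0) / len(v0)), f))
    return sorted(f for _, f in sorted(scores, reverse=True)[:k])

def oversample(X, y, rng):
    """Random oversampling: duplicate minority samples until classes are balanced."""
    counts = Counter(y)
    target = max(counts.values())
    Xb, yb = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, t in zip(X, y) if t == label]
        for _ in range(target - n):
            Xb.append(rng.choice(pool))
            yb.append(label)
    return Xb, yb

def predict_stump(stump, x):
    f, thr, pol = stump
    return pol if x[f] > thr else 1 - pol

def fit_stump(X, y):
    """Exhaustively pick the (feature, threshold, polarity) with lowest training error."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({x[f] for x in X}):
            for pol in (0, 1):
                err = sum(predict_stump((f, thr, pol), x) != t for x, t in zip(X, y))
                if best is None or err < best[0]:
                    best = (err, (f, thr, pol))
    return best[1]

def fit_bagging(X, y, n_estimators, rng):
    """Bagging: train each base learner on a bootstrap resample of the data."""
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models

def predict_dynamic(models, X_ref, y_ref, x, k=5):
    """Dynamic selection by local accuracy: use the model that is most
    accurate on the k reference samples nearest to the query x."""
    def dist(i):
        return sum((a - b) ** 2 for a, b in zip(X_ref[i], x))
    nearest = sorted(range(len(X_ref)), key=dist)[:k]
    best = max(models, key=lambda m: sum(predict_stump(m, X_ref[i]) == y_ref[i]
                                         for i in nearest))
    return predict_stump(best, x)

# Synthetic, illustrative data only (not VF data): feature 0 is informative,
# feature 1 is noise; class 1 is the 10% minority.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(90)] + \
    [[rng.gauss(3, 1), rng.gauss(0, 1)] for _ in range(10)]
y = [0] * 90 + [1] * 10

keep = select_features(X, y, k=1)             # 1) feature selection
Xs = [[x[f] for f in keep] for x in X]
Xb, yb = oversample(Xs, y, rng)               # 2) class rebalancing
models = fit_bagging(Xb, yb, 10, rng)         # 3) bagging ensemble
# 4) dynamic selection; for simplicity the balanced training set doubles as
#    the reference set, whereas a held-out set would normally be preferable.
print(predict_dynamic(models, Xb, yb, [2.5]))
```

In practice one would more likely reach for established libraries; the sketch only makes the order of the four stages concrete.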
