Artificial Neural Network Classification of High Dimensional Data with Novel Optimization Approach of Dimension Reduction

Classifying high-dimensional data is a crucial task in bioinformatics. Cancer classification from microarray data is a typical machine learning application because of the large number of genes involved. Feature (gene) selection and classification with computational intelligence techniques play an important role in the diagnosis and prediction of disease from microarray data. An artificial neural network (ANN) is an artificial intelligence technique for classification, image processing, and prediction. This paper evaluates the performance of an ANN classifier with six different hybrid feature selection techniques for gene selection in microarray data. These hybrid techniques combine independent component analysis (ICA) as an extraction technique, popular filter techniques, and bio-inspired algorithms that optimize the ICA feature vector. Five binary gene expression microarray datasets are used to compare the techniques and to determine how they improve the performance of the ANN classifier. These techniques can be extremely useful for feature selection because they achieve the highest classification accuracy with the lowest average number of selected genes. Furthermore, a statistical hypothesis test at a given confidence level was used to check whether the differences between the algorithms are significant. The experimental results show that the combination of ICA with the genetic bee colony algorithm performs best, as it heuristically removes non-contributing features to improve classifier performance.
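The wrapper stage described above, in which a bio-inspired search prunes non-contributing components while keeping accuracy high, can be illustrated with a minimal sketch. It is not the method evaluated in the paper, and every name in it is hypothetical. It assumes a synthetic matrix in place of real ICA components, a nearest-centroid classifier in place of the ANN, a plain genetic algorithm in place of the genetic bee colony algorithm, and an arbitrary penalty of 0.01 per selected feature.

```python
import random

def make_data(n_samples=40, n_features=12, n_informative=3, seed=0):
    """Synthetic stand-in for an ICA feature matrix with binary labels (assumption)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in range(n_informative):  # only the first few features carry signal
            row[j] += 3.0 * label
        X.append(row)
        y.append(label)
    return X, y

def loo_accuracy(X, y, mask):
    """Leave-one-out accuracy of a nearest-centroid classifier (stand-in for the ANN)."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    counts = [y.count(0), y.count(1)]
    sums = [[sum(X[k][j] for k in range(len(X)) if y[k] == c) for j in cols]
            for c in (0, 1)]
    correct = 0
    for xi, yi in zip(X, y):
        dists = []
        for c in (0, 1):
            held = 1 if yi == c else 0  # exclude the held-out sample from its own class
            n_c = counts[c] - held
            dists.append(sum((xi[j] - (s - held * xi[j]) / n_c) ** 2
                             for j, s in zip(cols, sums[c])))
        correct += int(dists.index(min(dists)) == yi)
    return correct / len(X)

def fitness(X, y, mask, penalty=0.01):
    """Reward accuracy and lightly penalize each selected feature (assumed weighting)."""
    return loo_accuracy(X, y, mask) - penalty * sum(mask)

def select_features(X, y, pop_size=20, generations=30, seed=1):
    """Plain genetic algorithm over binary feature masks (stand-in optimizer)."""
    rng = random.Random(seed)
    n = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(X, y, m), reverse=True)
        parents = pop[: pop_size // 2]  # keep the better half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)   # one-point crossover
            child = a[:cut] + b[cut:]
            flip = rng.randrange(n)     # single bit-flip mutation
            child[flip] = 1 - child[flip]
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda m: fitness(X, y, m))

X, y = make_data()
best = select_features(X, y)
print("selected features:", [j for j, keep in enumerate(best) if keep])
print("LOO accuracy:", loo_accuracy(X, y, best))
```

The penalty term is what pushes the search toward small feature subsets. Setting it to zero would optimize accuracy alone, which typically leaves more features selected.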
