Classification of a DNA Microarray for Diagnosing Cancer Using a Complex Network Based Method

Applications that classify DNA microarray expression data are helpful for diagnosing cancer. Many attempts have been made to analyze these data; however, new methods are needed to obtain better results. In this study, a Complex Network (CN) classifier was used for the classification task. An initialization algorithm built the network structure, allowing input variables to be selected across layered connections and different activation functions to be assigned to different nodes. A hybrid method integrating Genetic Programming and Particle Swarm Optimization was then used to identify an optimal structure together with the parameters encoded in the classifier. The single CN classifier and an ensemble of CN classifiers were tested on four benchmark data sets. To ensure diversity among the ensemble members, we constructed the base classifiers using different feature sets, i.e., those selected with Pearson's correlation, Spearman's correlation, Euclidean distance, the cosine coefficient, and the Fisher ratio. The experimental results suggest that a single CN classifier can obtain state-of-the-art results and that the ensemble performs better still.
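The idea of building diverse feature sets from different ranking criteria can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract gives neither the exact scoring formulas nor the rule used to combine the base classifiers, so the two-class Fisher ratio, the label-correlation form of Pearson's score, the majority vote, and all function names below are assumptions.

```python
import numpy as np

def fisher_ratio(X, y):
    """Per-gene two-class Fisher ratio: (mu0 - mu1)^2 / (var0 + var1)."""
    X0, X1 = X[y == 0], X[y == 1]
    num = (X0.mean(axis=0) - X1.mean(axis=0)) ** 2
    den = X0.var(axis=0) + X1.var(axis=0) + 1e-12  # guard against zero variance
    return num / den

def pearson_score(X, y):
    """Absolute Pearson correlation between each gene and the class label."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    den = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum()) + 1e-12
    return np.abs(Xc.T @ yc) / den

def top_k(scores, k):
    """Indices of the k highest-scoring genes, best first."""
    return np.argsort(scores)[::-1][:k]

def majority_vote(predictions):
    """Combine 0/1 predictions of shape (n_classifiers, n_samples).

    Assumed combination rule; ties (possible with an even number of
    classifiers) resolve to class 0.
    """
    votes = np.asarray(predictions)
    return (votes.mean(axis=0) > 0.5).astype(int)
```

Here `X` is a samples-by-genes expression matrix and `y` a vector of 0/1 class labels. Each criterion yields its own gene subset via `top_k`, one base classifier would be trained per subset, and their predictions would be merged with `majority_vote`.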
