A Simplified Swarm Optimization For Discovering The Classification Rule Using Microarray Data Of Breast Cancer

Microarray data analysis is a major line of research in bioinformatics. A significant trend in bioinformatics is identifying genes or gene groups that differentiate diseased tissues. Classification is necessary to make microarray data useful for application in medicine and in related research such as disease diagnosis. Classification models have been developed using statistical methods such as logistic and multi-normal regression for data mining. However, real-world classification problems, such as those in the medical domain, are complex and high-dimensional, and general statistical methods are inadequate for them. This study proposes simplified swarm optimization (SSO), an efficient methodology for discovering breast cancer classification rules. The data set was derived from the Stanford microarray database. The proposed approach enables simultaneous feature selection and pattern recognition. Experimental results indicate that SSO outperforms general data mining methods such as decision trees, neural networks, and support vector machines. The proposed approach has potential applications in hospital decision-making and in research such as predictive medicine.
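The abstract does not spell out the SSO update step, so the following is only a minimal sketch of how an SSO-style search over binary feature masks might look, assuming the commonly described rule in which each variable is kept, copied from the particle's personal best, copied from the global best, or resampled at random according to three cumulative thresholds. The function names (`sso_update`, `toy_fitness`, `run_sso`), the threshold values `cw`, `cp`, `cg`, and the toy fitness function are all illustrative assumptions, not the authors' implementation or settings.

```python
import random


def sso_update(x, pbest, gbest, cw, cp, cg, rng):
    """One SSO-style update of a binary feature mask (illustrative sketch).

    Assumes cumulative thresholds with 0 <= cw <= cp <= cg <= 1.
    """
    new_x = []
    for xi, pi, gi in zip(x, pbest, gbest):
        rho = rng.random()  # uniform draw in [0, 1)
        if rho < cw:
            new_x.append(xi)  # keep the current value
        elif rho < cp:
            new_x.append(pi)  # copy from the particle's personal best
        elif rho < cg:
            new_x.append(gi)  # copy from the global best
        else:
            new_x.append(rng.randint(0, 1))  # resample at random
    return new_x


def toy_fitness(mask, target):
    """Hypothetical stand-in for classification accuracy:
    counts positions where the mask agrees with a hidden target mask."""
    return sum(m == t for m, t in zip(mask, target))


def run_sso(target, n_particles=10, n_iters=30,
            cw=0.1, cp=0.4, cg=0.9, seed=0):
    """Run the sketch on the toy fitness; all parameter values are assumptions."""
    rng = random.Random(seed)
    n = len(target)
    swarm = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n_particles)]
    pbest = [p[:] for p in swarm]
    gbest = max(pbest, key=lambda m: toy_fitness(m, target))[:]
    for _ in range(n_iters):
        for i in range(n_particles):
            swarm[i] = sso_update(swarm[i], pbest[i], gbest, cw, cp, cg, rng)
            if toy_fitness(swarm[i], target) > toy_fitness(pbest[i], target):
                pbest[i] = swarm[i][:]
                if toy_fitness(pbest[i], target) > toy_fitness(gbest, target):
                    gbest = pbest[i][:]
    return gbest, toy_fitness(gbest, target)
```

In a real setting the toy fitness would be replaced by the accuracy of a classification rule evaluated on the genes selected by the mask, which is how feature selection and pattern recognition could be searched at the same time.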
