Partition-conditional ICA for Bayesian classification of microarray data

Accurate classification of microarray data is important for medical decision making. Past studies have shown that class-conditional independent component analysis (CC-ICA) can improve the performance of the naive Bayes classifier in microarray data analysis. However, when a microarray dataset has only a small number of samples for some classes, applying CC-ICA may become infeasible. This paper extends CC-ICA and proposes a partition-conditional independent component analysis (PC-ICA) method for naive Bayes classification of microarray data. As a feature extraction approach, PC-ICA lies between ICA and CC-ICA. Our experimental results on two microarray datasets show that PC-ICA is more effective than ICA in improving the performance of naive Bayes classification of microarray data.
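To illustrate the in-between idea, the following minimal sketch groups classes into partitions so that every partition holds enough samples to estimate its own ICA transform. It assumes that a partition is a group of classes sharing one transform. The helper `partition_classes`, its `min_samples` threshold, and the greedy grouping rule are hypothetical choices made for illustration only; the abstract does not specify how partitions are formed.

```python
from collections import Counter


def partition_classes(labels, min_samples):
    """Group classes so that each partition has at least min_samples samples.

    Hypothetical helper: the greedy rule below is an illustrative assumption,
    not the partitioning criterion of the paper.
    """
    counts = Counter(labels)
    partitions = []            # each partition is a list of class labels
    pool, pool_size = [], 0    # small classes waiting to be grouped

    # Visit classes from the largest to the smallest sample count.
    for cls, n in sorted(counts.items(), key=lambda kv: (-kv[1], str(kv[0]))):
        if n >= min_samples:
            # Large enough to get its own transform, as in CC-ICA.
            partitions.append([cls])
        else:
            pool.append(cls)
            pool_size += n
            if pool_size >= min_samples:
                partitions.append(pool)
                pool, pool_size = [], 0

    # Leftover small classes join the smallest existing partition,
    # or form a single partition if none exists yet.
    if pool:
        if partitions:
            smallest = min(partitions, key=lambda p: sum(counts[c] for c in p))
            smallest.extend(pool)
        else:
            partitions.append(pool)
    return partitions


# Toy labels: two well-sampled classes and three sparsely sampled ones.
labels = ["A"] * 20 + ["B"] * 15 + ["C"] * 4 + ["D"] * 3 + ["E"] * 2
print(partition_classes(labels, min_samples=8))
```

Under this sketch the two extremes recover the existing methods: a threshold of 1 places every class in its own partition (the CC-ICA setting), while a threshold larger than the dataset pools all classes into one partition (a single ICA transform for all data).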
