Microarray Data Biclustering with Multi-objective Immune Optimization Algorithm

High-throughput technologies yield large-scale datasets on genomic variation in diverse populations, allowing the study of these variations and their association with disease and complex traits. Systematic functional characterization of the genes identified in genome sequencing projects is urgently needed in the post-genomic era. Biclustering, which searches for subsets of individuals that behave coherently across a subset of the features, is a very useful data mining technique in microarray data analysis and has demonstrated its advantages in many applications. This paper proposes a novel multi-objective immune biclustering (MOIB) algorithm, based on the immune response principle of the immune system, to mine biclusters from microarray data. In the algorithm, we extend ε-dominance and apply a crowding computation mechanism to obtain many Pareto-optimal solutions distributed along the Pareto front. Experimental results on real datasets show that our approach can effectively find more significant biclusters than other biclustering algorithms.
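
As an illustration of the two selection mechanisms named above, the following minimal sketch shows a standard additive ε-dominance test and an NSGA-II-style crowding distance for minimization problems. It is not the MOIB algorithm itself: the extension of ε-dominance proposed in the paper is not reproduced here, and the function names `eps_dominates` and `crowding_distance`, as well as the sample objective vectors, are assumptions chosen for illustration only.

```python
from math import inf


def eps_dominates(a, b, eps):
    """Return True if objective vector a additively eps-dominates b.

    Assumes every objective is minimized: a eps-dominates b when
    a[i] - eps <= b[i] holds for all objectives i.
    """
    return all(x - eps <= y for x, y in zip(a, b))


def crowding_distance(points):
    """Return an NSGA-II-style crowding distance for each objective vector.

    Boundary solutions of each objective get an infinite distance so that
    they are always preferred; interior solutions accumulate the normalized
    gap between their two neighbours along each objective.
    """
    n = len(points)
    dist = [0.0] * n
    if n == 0:
        return dist
    for m in range(len(points[0])):
        # Indices of the solutions sorted by objective m.
        order = sorted(range(n), key=lambda i: points[i][m])
        lo, hi = points[order[0]][m], points[order[-1]][m]
        dist[order[0]] = dist[order[-1]] = inf
        if hi == lo:
            continue  # all values equal: this objective adds no spread
        for k in range(1, n - 1):
            i = order[k]
            gap = points[order[k + 1]][m] - points[order[k - 1]][m]
            dist[i] += gap / (hi - lo)
    return dist


# Hypothetical two-objective front (both objectives minimized).
front = [(0.0, 4.0), (1.0, 2.0), (4.0, 0.0)]
distances = crowding_distance(front)
```

In a scheme of this kind, ε-dominance coarsens the archive so that near-duplicate solutions are discarded, while the crowding distance favours solutions in sparsely populated regions of the front; together they encourage a well-spread set of Pareto-optimal biclusters.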
