Comparative Study on Swarm Intelligence Techniques for Biclustering of Microarray Gene Expression Data

Microarray gene expression data play a vital role in understanding biological processes, gene regulation and disease mechanisms. A bicluster in gene expression data is a subset of genes that exhibits consistent patterns under a subset of conditions. Finding biclusters is an optimization problem. In recent years, swarm intelligence techniques have become popular because many real-world problems are increasingly large, complex and dynamic. Given the size and complexity of such problems, an optimization technique is needed that can find a near-optimal solution within a reasonable amount of time. In this paper, the algorithmic concepts of the Particle Swarm Optimization (PSO), Shuffled Frog Leaping (SFL) and Cuckoo Search (CS) algorithms are analyzed on four benchmark gene expression datasets. The experimental results show that CS outperforms PSO and SFL on three datasets, while SFL gives the best performance on one dataset. This work also determines the biological relevance of the biclusters using Gene Ontology in terms of function, process and component.

Keywords: Particle swarm optimization, Shuffled frog leaping, Cuckoo search, biclustering, gene expression data.
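To make the notion of "consistent patterns" concrete, the following is a minimal sketch of one way a candidate bicluster could be scored inside an optimizer. It assumes the mean squared residue, a coherence measure commonly used in the biclustering literature; the abstract does not state which fitness function the paper uses, and the function name, argument names and toy matrix are hypothetical.

```python
def mean_squared_residue(data, rows, cols):
    """Score a candidate bicluster (lower means more coherent).

    Assumption: mean squared residue is used as the coherence measure;
    the paper's actual fitness function is not given in the abstract.
    `data` is a list of gene rows, `rows` and `cols` are index lists.
    """
    sub = [[data[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(rows), len(cols)

    # Means over each selected gene, each selected condition, and the whole submatrix.
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows

    # Residue of each cell after removing the additive row and column effects.
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue * residue
    return total / (n_rows * n_cols)


# Toy expression matrix (hypothetical values): the top-left 3x3 block
# follows a purely additive pattern, the remaining cells do not.
expr = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 1.0],
    [4.0, 5.0, 6.0, 7.0],
    [0.0, 8.0, 1.0, 5.0],
]

coherent_score = mean_squared_residue(expr, [0, 1, 2], [0, 1, 2])
full_score = mean_squared_residue(expr, [0, 1, 2, 3], [0, 1, 2, 3])
```

Under this assumption, a swarm-based search such as PSO, SFL or CS would treat each candidate (a choice of rows and columns) as a position and prefer candidates with a low score, usually combined with a reward for bicluster size so that trivially small biclusters are not favored.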
