A COMPARATIVE STUDY OF CLUSTERING AND BICLUSTERING OF MICROARRAY DATA

Some subsets of genes behave similarly under certain subsets of conditions, in which case they are said to be coexpressed, yet behave independently under other conditions. Discovering such coexpression can help uncover genomic knowledge such as gene networks or gene interactions. It is therefore important to cluster genes and conditions simultaneously, in order to identify groups of genes that are coexpressed under groups of conditions. This type of clustering is called biclustering. Biclustering is an NP-hard problem; consequently, heuristic algorithms are typically used to find approximate, suboptimal solutions. In this paper, we present a survey of clustering and biclustering of gene expression data, also called microarray data.
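
As a purely illustrative sketch of why clustering over all conditions can miss coexpression that holds only under a subset of conditions, consider the toy example below. The expression values, the names `gene_a`, `gene_b` and `pearson`, and the choice of Pearson correlation as the similarity measure are all assumptions made for illustration; they are not taken from any algorithm surveyed in this paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical expression levels of two genes under eight conditions.
# Under the first five conditions gene_b is gene_a shifted by one unit
# (coexpression); under the last three the genes behave independently.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0, 1.0, 4.0, 2.0]
gene_b = [2.0, 3.0, 4.0, 5.0, 6.0, 4.0, 1.0, 3.0]

# Indices of the conditions that form the (assumed) bicluster.
subset = [0, 1, 2, 3, 4]

# Similarity measured over every condition, as one-way clustering would do.
corr_all = pearson(gene_a, gene_b)

# Similarity measured only over the selected subset of conditions.
corr_subset = pearson([gene_a[i] for i in subset],
                      [gene_b[i] for i in subset])

print("correlation over all conditions:", round(corr_all, 3))
print("correlation over the subset:   ", round(corr_subset, 3))
```

In this toy setting the two genes look only weakly related when every condition is considered, while they are almost perfectly correlated on the selected conditions. A biclustering method has to search for such row and column subsets jointly, which is what makes the problem combinatorially hard.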
