A differential biclustering algorithm for comparative analysis of gene expression

Convergences and divergences among related organisms (for example, S. cerevisiae and C. albicans) or within the same organism (for example, healthy and diseased tissues) can often be traced to the differential expression of specific groups of genes. Yet algorithms that characterize such differences and similarities from gene expression data are not well developed. Given two related organisms A and B, we introduce and develop a differential biclustering algorithm that aims to find convergent biclusters, divergent biclusters, partially conserved biclusters, and split conserved biclusters. A convergent bicluster is a group of genes with similar functions that is conserved in A and B. A divergent bicluster is a group of genes that have similar functions in A (or B) but play different roles in B (or A). Partially conserved biclusters and split conserved biclusters capture more complicated relationships between the behavior and functions of the genes in A and B. Uncovering such patterns can yield new insights into how related organisms have evolved or into the role played by particular groups of genes during the development of certain diseases. Our differential biclustering algorithm consists of two steps. The first step uses a parallel biclustering algorithm to uncover all valid biclusters with coherent evolutions in each data set. The second step performs a differential analysis on the biclusters identified in step one, yielding sets of convergent, divergent, partially conserved, and split conserved biclusters.
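The second step can be illustrated with a minimal sketch. The abstract does not specify how biclusters from A and B are compared, so everything below is an assumption made for illustration: biclusters are reduced to sets of gene identifiers (conditions are ignored), genes in A and B are assumed to share identifiers through some orthology mapping, and the four categories are assigned with hypothetical gene-overlap thresholds `hi` and `lo`. The function names `overlap`, `classify`, and `differential_analysis` are likewise hypothetical, and step one (the parallel biclustering) is assumed to have already produced the input lists.

```python
from typing import Dict, FrozenSet, List

# Assumption: a bicluster is represented only by its gene set.
Bicluster = FrozenSet[str]


def overlap(a: Bicluster, b: Bicluster) -> float:
    """Fraction of the genes in `a` that also appear in `b`."""
    return len(a & b) / len(a) if a else 0.0


def classify(bic: Bicluster, others: List[Bicluster],
             hi: float = 0.8, lo: float = 0.2) -> str:
    """Label one bicluster of A against all biclusters of B.

    The thresholds and the rules are illustrative assumptions,
    not the criteria used in the paper.
    """
    # Keep only the overlaps that cover at least a fraction `lo` of `bic`.
    parts = [bic & o for o in others if overlap(bic, o) >= lo]
    best = max((len(p) / len(bic) for p in parts), default=0.0)
    if best >= hi:
        # A single bicluster of B contains most of the genes.
        return "convergent"
    covered = set().union(*parts)
    if len(parts) >= 2 and len(covered) / len(bic) >= hi:
        # The genes are spread over several biclusters of B.
        return "split conserved"
    if parts:
        # Only a subset of the genes stays together in B.
        return "partially conserved"
    return "divergent"


def differential_analysis(bics_a: List[Bicluster],
                          bics_b: List[Bicluster]) -> Dict[str, List[Bicluster]]:
    """Group the biclusters of A by their relationship to those of B."""
    result: Dict[str, List[Bicluster]] = {
        "convergent": [], "divergent": [],
        "partially conserved": [], "split conserved": [],
    }
    for bic in bics_a:
        result[classify(bic, bics_b)].append(bic)
    return result


# Toy data with made-up gene names.
A = [
    frozenset({"g1", "g2", "g3", "g4", "g5"}),
    frozenset({"g6", "g7", "g8", "g9"}),
    frozenset({"g10", "g11", "g12", "g13"}),
    frozenset({"g14", "g15", "g16"}),
]
B = [
    frozenset({"g1", "g2", "g3", "g4", "g5"}),
    frozenset({"g6", "g7"}),
    frozenset({"g8", "g9"}),
    frozenset({"g10", "g11", "g20"}),
]
groups = differential_analysis(A, B)
```

With this toy input, each of the four biclusters in `A` falls into a different category. The analysis is asymmetric: swapping the arguments labels the biclusters of B against those of A instead.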
