Hierarchical Generative Biclustering for MicroRNA Expression Analysis

Clustering methods are a useful and common first step in gene expression studies, but the results may be hard to interpret. We explicitly introduce an indicator of which genes tie each cluster together, changing the setup to biclustering. Furthermore, we make the indicators hierarchical, resulting in a hierarchy of progressively more specific biclusters. A non-parametric Bayesian formulation makes the model rigorous yet flexible, and keeps computations feasible. The formulation additionally offers a natural information retrieval relevance measure that allows relating samples in a principled manner. We show that the model outperforms four other biclustering procedures on a large miRNA data set. We also demonstrate the model's added interpretability and information retrieval capability in a case study that highlights the potential and novel role of miR-224 in the association between melanoma and non-Hodgkin lymphoma. Software is publicly available.
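To make the idea of hierarchical indicators concrete, the following is a minimal illustrative sketch, not the model or the inference procedure of the paper. It assumes a fixed toy tree in which each node switches on a set of features (standing in for miRNAs) and samples are attached to nodes; the class `Node`, the function `collect`, and the names `f1`..`f4` and `s1`..`s4` are all hypothetical. A bicluster at a node is then read off as the samples in its subtree together with the features accumulated along the path from the root, so deeper nodes give smaller, more specific biclusters.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    features: frozenset                            # features this node switches on
    samples: list = field(default_factory=list)    # samples attached directly here
    children: list = field(default_factory=list)


def collect(node, inherited=frozenset()):
    """Return (name, samples, features) triples in pre-order.

    samples:  every sample in the subtree rooted at `node`
    features: every feature switched on along the path root -> `node`
    """
    active = inherited | node.features
    members = list(node.samples)
    out = []
    for child in node.children:
        sub = collect(child, active)
        out.extend(sub)
        # sub[0] is the child's own entry and already holds its whole subtree
        members.extend(sub[0][1])
    return [(node.name, sorted(members), sorted(active))] + out


# Hypothetical toy hierarchy (assumption: not taken from the paper's data).
root = Node("root", frozenset({"f1"}), children=[
    Node("left", frozenset({"f2"}), samples=["s1", "s2"]),
    Node("right", frozenset({"f3"}), samples=["s3"], children=[
        Node("right-deep", frozenset({"f4"}), samples=["s4"]),
    ]),
])

result = collect(root)
for name, samples, features in result:
    print(name, samples, features)
```

In this toy setting the root bicluster covers all four samples but only one feature, while `right-deep` covers a single sample with three features. In the model described above the tree and the indicators would be inferred from data rather than fixed by hand.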
