Probabilistic Methods in Cancer Biology

Recent advances in experimental techniques have made it possible to generate an enormous amount of ‘raw’ biological data, with cancer biology being no exception. The main challenge faced by cancer biologists now is the generation of plausible hypotheses that can be evaluated against available data and/or validated through further experimentation. For persons trained in control theory, there is now a significant opportunity to work with biologists to create a virtuous cycle of hypothesis generation and experimental validation. Given the large number of uncertain factors in any biological experiment, probabilistic methods are natural in this setting. In this paper, we discuss four specific problems in cancer biology that are amenable to study using probabilistic methods, namely: reverse engineering gene regulatory networks, constructing context-specific gene regulatory networks, analyzing the significance of expression levels for collections of genes, and discriminating between drivers (mutations that cause cancer) and passengers (mutations that are caused by cancer or have no impact). Some research problems that merit the attention of the controls community are also suggested.
