Interpretation of an individual functional genomics experiment guided by massive public data

A key unmet challenge in interpreting omics experiments is inferring biological meaning in the context of public functional genomics data. We developed a computational framework, Your Evidence Tailored Integration (YETI; http://yeti.princeton.edu/), which creates specialized functional interaction maps from large public datasets relevant to an individual omics experiment. Using this tailored integration, we predicted and experimentally confirmed an unexpected divergence in viral replication after seasonal or pandemic human influenza virus infection. YETI puts individual omics experiments in the context of public genomics data by creating an integrated, dataset-specific functional network, thus allowing more thorough interpretation of the data.
