L1000CDS2: LINCS L1000 characteristic direction signatures search engine

The Library of Integrated Network-based Cellular Signatures (LINCS) L1000 data set currently comprises over a million gene expression profiles of chemically perturbed human cell lines. Through several unique intrinsic and extrinsic benchmarking schemes, we demonstrate that processing the L1000 data with the characteristic direction (CD) method significantly improves the signal-to-noise ratio compared with the MODZ method currently used to compute L1000 signatures. The CD-processed L1000 signatures are served through a state-of-the-art web-based search engine application called L1000CDS2. Using two methods, the L1000CDS2 search engine prioritizes thousands of small-molecule signatures, and their pairwise combinations, that are predicted to either mimic or reverse an input gene expression signature. The L1000CDS2 search engine also predicts drug targets for all the small molecules profiled by the L1000 assay that we processed. Targets are predicted by computing the cosine similarity between the L1000 small-molecule signatures and a large collection of signatures extracted from the Gene Expression Omnibus (GEO) for single-gene perturbations in mammalian cells. We applied L1000CDS2 to prioritize small molecules that are predicted to reverse expression in 670 disease signatures also extracted from GEO, and to prioritize small molecules that can mimic expression of 22 endogenous ligand signatures profiled by the L1000 assay. As a case study to further demonstrate the utility of L1000CDS2, we collected expression signatures from human cells infected with Ebola virus at 30, 60 and 120 min. Querying these signatures with L1000CDS2, we identified kenpaullone, a GSK3B/CDK2 inhibitor that, as we show in subsequent experiments, has dose-dependent efficacy in inhibiting Ebola infection in vitro without causing cellular toxicity in human cell lines.
In summary, the L1000CDS2 tool can be applied in many biological and biomedical settings, while improving the extraction of knowledge from the LINCS L1000 resource.
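
The cosine-similarity comparison mentioned above can be illustrated with a minimal sketch. This is not the L1000CDS2 implementation: the function names, the toy signatures and the `mode` parameter are hypothetical, and it simply assumes that each signature is a vector of values over the same ordered set of genes. A similarity near 1 suggests that a library signature mimics the query, while a similarity near -1 suggests that it reverses it.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length expression vectors."""
    if len(a) != len(b):
        raise ValueError("signatures must cover the same genes")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here: a zero vector has no direction
    return dot / (norm_a * norm_b)


def rank_signatures(query, library, mode="reverse"):
    """Rank library signatures against a query signature.

    mode="mimic" puts the most similar signatures first;
    mode="reverse" puts the most anti-correlated signatures first.
    """
    if mode not in ("mimic", "reverse"):
        raise ValueError("mode must be 'mimic' or 'reverse'")
    scored = [(name, cosine_similarity(query, sig)) for name, sig in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=(mode == "mimic"))


# Toy data, invented for illustration only (four genes, three signatures).
query = [1.0, -1.0, 0.5, 0.0]
library = {
    "drug_A": [1.0, -1.0, 0.5, 0.0],    # same direction as the query
    "drug_B": [-1.0, 1.0, -0.5, 0.0],   # opposite direction
    "drug_C": [0.0, 0.0, 0.0, 1.0],     # orthogonal to the query
}

reversers = rank_signatures(query, library, mode="reverse")
mimickers = rank_signatures(query, library, mode="mimic")
```

With these toy vectors, `drug_B` ranks first among candidate reversers and `drug_A` ranks first among candidate mimickers. The same similarity function could in principle be applied between a small-molecule signature and single-gene perturbation signatures, which is the comparison the abstract describes for target prediction.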