Inferring gene targets of drugs and chemical compounds from gene expression profiles

Motivation: Finding genes that are directly perturbed or targeted by drugs is of great interest and importance in drug discovery. Several network filtering methods have been created to predict the gene targets of drugs from gene expression data, based on an ordinary differential equation model of the gene regulatory network (GRN). A critical step in these methods is inferring the GRN from the expression data, which is a very challenging problem in its own right. In addition, existing network filtering methods require computationally intensive parameter tuning, expression data from experiments with known genetic perturbations, or both.

Results: We developed a method called DeltaNet for the identification of drug targets from gene expression data. Here, the gene target predictions are inferred directly from the data, without a separate GRN inference step. The DeltaNet formulation leads to an underdetermined linear regression problem, which we solved using least angle regression (DeltaNet-LAR) or LASSO regularization (DeltaNet-LASSO). The predictions using DeltaNet for expression data of Escherichia coli, yeast, fruit fly and human were significantly more accurate than those using network filtering methods, namely mode of action by network identification (MNI) and sparse simultaneous equation model (SSEM). Furthermore, DeltaNet using LAR did not require any parameter tuning and could provide computational speed-up over existing methods.

Conclusion: DeltaNet is a robust and numerically efficient tool for identifying gene perturbations from gene expression data. Importantly, the method requires little to no expert supervision, while providing accurate gene target predictions.

Availability and implementation: DeltaNet is available at http://www.cabsel.ethz.ch/tools/DeltaNet.

Contact: rudi.gunawan@chem.ethz.ch

Supplementary information: Supplementary data are available at Bioinformatics online.
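The abstract describes reducing target prediction to an underdetermined linear regression problem solved with LAR or LASSO. The sketch below illustrates only that generic ingredient: LASSO regression with more unknowns than samples, solved by plain coordinate descent. It is not the DeltaNet formulation, which the abstract does not spell out. The function names, the toy data, the objective scaling (1/(2n)) and the penalty value are all assumptions made for illustration.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the LASSO proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of n rows with p entries each; y is a list of n values.
    Works when p > n (underdetermined), which is the case of interest here.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # residual = y - X @ beta, with beta starting at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the residual that excludes beta[j].
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


def make_toy_problem(n_samples=10, n_features=30, seed=0):
    """Hypothetical toy data: 10 samples, 30 unknowns, two nonzero coefficients."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
    true_beta = [0.0] * n_features
    true_beta[3] = 2.0
    true_beta[17] = -1.5
    y = [sum(X[i][j] * true_beta[j] for j in range(n_features)) for i in range(n_samples)]
    return X, y, true_beta


if __name__ == "__main__":
    X, y, true_beta = make_toy_problem()
    beta = lasso_coordinate_descent(X, y, lam=0.1)
    selected = [j for j, b in enumerate(beta) if b != 0.0]
    print("indices with nonzero coefficients:", selected)
```

The L1 penalty is what makes the problem tractable when unknowns outnumber samples: it drives most coefficients to exactly zero, so the nonzero entries can be read as a ranked list of candidates. Choosing `lam` is the tuning step that, according to the abstract, the LAR variant avoids.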
