Use of big data in drug development for precision medicine

ABSTRACT Drug development has been a costly and lengthy process with an extremely low success rate and little consideration of inter-individual diversity in drug response and toxicity. Over the past decade, an alternative “big data” approach has expanded at an unprecedented pace, built on electronic databases of chemical substances, disease gene and protein targets, functional readouts, and clinical information covering inter-individual genetic variation and toxicities. This paradigm shift has enabled systematic, high-throughput, and accelerated identification of novel drugs, or of repurposed indications for existing drugs, that target the pathogenic molecular aberrations present in each individual patient. Surging interest from the information technology and direct-to-consumer genetic testing industries is further facilitating the use of big data to achieve personalized precision medicine. Here we review currently available resources and discuss future prospects.
