Bioinformatics challenges for personalized medicine

Motivation: Widespread availability of low-cost, full genome sequencing will introduce new challenges for bioinformatics.

Results: This review outlines recent developments in sequencing technologies and genome analysis methods for application in personalized medicine. New methods are needed in four areas to realize the potential of personalized medicine: (i) processing large-scale robust genomic data; (ii) interpreting the functional effect and the impact of genomic variation; (iii) integrating systems data to relate complex genetic interactions with phenotypes; and (iv) translating these discoveries into medical practice.

Contact: russ.altman@stanford.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
