Translational Bioinformatics Embraces Big Data

We review the latest trends and major developments in translational bioinformatics in 2011-2012. Our emphasis is on highlighting the key events in the field and pointing to promising research areas for the future. The key take-home points are:

• Translational informatics is ready to revolutionize human health and healthcare using large-scale measurements on individuals.
• Data-centric approaches that compute on massive amounts of data (often called "Big Data") to discover patterns and make clinically relevant predictions will gain adoption.
• Research that bridges the latest multimodal measurement technologies with large amounts of electronic healthcare data is increasing, and is where new breakthroughs will occur.
