Big Data: Challenge and Opportunity for Translational and Industrial Research in Healthcare

Research and innovation are constant imperatives for the healthcare sector: medicine, biology and biotechnology support it, and more recently computational and data-driven disciplines have gained relevance for handling the massive amount of data this sector is generating and will continue to generate. To be effective in translational and industrial healthcare research, big data in the life science domain need to be organized, well annotated, catalogued, correlated and integrated: the bigger the data silos at hand, the stronger the need for organization and tidiness. The degree of such organization marks the transition from data to knowledge for strategic decision making. Thus the challenge for the use of big data in industrial research is to achieve effective and coherent data annotation, aimed at the integration of heterogeneous domains such as different OMICs and non-OMICs (traditional) data sources. Holistic approaches enabling an informed management of big data, often driven by machine learning methods, can thus trigger a change in industrial research, accelerating the process from discovery to product delivery. For instance, the main pillars of industrial R&D processes for vaccine or drug development include initial discovery, early and late preclinical stages, pre-industrialization, clinical phases and finally registration and commercialization. The passage from one step to the next is regulated by stringent pass/fail criteria. Bottlenecks of the R&D process are often represented by animal and human studies, which could be rationalized by surrogate in vitro assays as well as by predictive molecular and cellular signatures and models. The impact of big data in healthcare industrial research is to address such bottlenecks by providing actionable information and new knowledge so as to accelerate the development process in a cost-effective way.
Case studies will be discussed on the effective use of electronic health records, the application of network analysis methods to drug repurposing, and the development of vaccines against human pathologies.
