9 A Scalable Architecture for Smart Genomic Data Analysis in Medical Laboratories

Genomic data is an important building block for the era of personalized medicine. However, processing this data efficiently in diagnostic laboratories poses challenges in several distinct areas: big data, artificial intelligence, the regulatory environment, medical and diagnostic standards (including evolving guidelines), and software requirements engineering. An analysis of the state of the art in these areas reveals promising approaches and suitable reference models, but no direct solutions. Existing technical solutions for genomic data analysis tend to be specialized for research projects and do not account for the requirements of routine medical diagnostics, including the regulatory constraints that apply there. This chapter introduces a technical architecture for the GenDAI (Genomic applications for laboratory Diagnostics supported by Artificial Intelligence) project, which aims to create a platform for genomic data analysis that is specifically tailored to the needs and requirements of laboratory diagnostics. This includes automating processes by means of data analysis pipelines and artificial intelligence.
