mlf-core: a Framework for Deterministic Machine Learning

Machine learning has shown extensive growth in recent years¹. However, previous studies have highlighted a reproducibility crisis in machine learning². The reasons for irreproducibility are manifold. Major machine learning libraries default to non-deterministic algorithms based on atomic operations, in which the order of floating-point accumulation can vary between runs. Fixing all random seeds is therefore not sufficient on its own for deterministic machine learning. To overcome this shortcoming, various machine learning libraries have released deterministic counterparts to the non-deterministic algorithms. We evaluated the effect of these algorithms on determinism and runtime. Based on these results, we formulated a set of requirements for reproducible machine learning and developed a new software solution, the mlf-core ecosystem, which helps machine learning projects meet and maintain these requirements. We applied mlf-core to develop fully reproducible models in various biomedical fields, including a single-cell autoencoder with TensorFlow, a PyTorch-based U-Net model for liver-tumor segmentation in CT scans, and a liver cancer classifier based on gene expression profiles with XGBoost.
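
The point that fixed seeds alone do not give determinism can be illustrated with a minimal sketch. It is not taken from mlf-core; the helper name `accumulate` and the input values are illustrative assumptions. The sketch mimics a reduction whose accumulation order is not fixed, as with atomic additions, and shows that identical inputs can yield different floating-point sums, whereas an order-independent summation (here Python's `math.fsum`) does not depend on the order.

```python
import math


def accumulate(values, order):
    """Sum `values` in the given index order using plain float addition.

    Hypothetical stand-in for a parallel reduction in which the order
    of additions is decided at run time (e.g. by atomic operations).
    """
    total = 0.0
    for i in order:
        total += values[i]
    return total


# Identical inputs in both cases; no randomness is involved at all.
values = [1e16, 1.0, -1e16, 1.0]

# Order A: the first 1.0 is absorbed by 1e16 (float spacing there is 2.0).
forward = accumulate(values, [0, 1, 2, 3])

# Order B: the large terms cancel first, so both 1.0 terms survive.
reordered = accumulate(values, [0, 2, 1, 3])

# Order-independent reference: fsum tracks the exact partial sums.
exact = math.fsum(values)
exact_permuted = math.fsum([values[i] for i in [0, 2, 1, 3]])

print(forward, reordered, exact, exact_permuted)
```

Because floating-point addition is not associative, the two accumulation orders disagree even though the data are the same, which is why a deterministic algorithm must fix the reduction order or use a summation scheme that is insensitive to it.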

[1] Yann LeCun, et al. The MNIST database of handwritten digits, 2005.

[2] S. Dwivedi, et al. Obesity May Be Bad: Compressed Convolutional Networks for Biomedical Image Segmentation, 2020.

[3] G. Kempermann. Faculty Opinions recommendation of Human genomics. The Genotype-Tissue Expression (GTEx) pilot analysis: multitissue gene regulation in humans, 2015.

[4] Trevor Hastie, et al. Transparency and reproducibility in artificial intelligence, 2020, Nature.

[5] Martin Styner, et al. Comparison and Evaluation of Methods for Liver Segmentation From CT Datasets, 2009, IEEE Transactions on Medical Imaging.

[6] Gregory P. Way, et al. Machine Learning Detects Pan-cancer Ras Pathway Activation in The Cancer Genome Atlas, 2018, Cell reports.

[7] Kovila P. L. Coopamootoo, et al. Technologies for Trustworthy Machine Learning: A Survey in a Socio-Technical Context, 2020, ArXiv.

[8] Christian S. Collberg, et al. Repeatability in computer systems research, 2016, Commun. ACM.

[9] Fabian J Theis, et al. SCANPY: large-scale single-cell gene expression data analysis, 2018, Genome Biology.

[10] Hiroyuki Ogata, et al. KEGG: Kyoto Encyclopedia of Genes and Genomes, 1999, Nucleic Acids Res.

[11] Erik Schultes, et al. The FAIR Guiding Principles for scientific data management and stewardship, 2016, Scientific Data.

[12] M. Blachier, et al. The burden of liver disease in Europe: a review of available epidemiological data, 2013.

[13] Peter Ahrens, et al. Efficient Reproducible Floating Point Summation and BLAS, 2015.

[14] Ross B. Girshick, et al. Focal Loss for Dense Object Detection, 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15] Hong Diep Nguyen, et al. Algorithms for Efficient Reproducible Floating Point Summation, 2020, ACM Trans. Math. Softw.

[16] Leland McInnes, et al. UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction, 2018, ArXiv.

[17] Andreas Ziegler, et al. Consumer credit risk: Individual probability estimates using machine learning, 2013, Expert Syst. Appl.

[18] Tianqi Chen, et al. XGBoost: A Scalable Tree Boosting System, 2016, KDD.

[19] Peter Stone, et al. Deterministic Implementations for Reproducibility in Deep Reinforcement Learning, 2018, ArXiv.

[20] Sven Nahnsen, et al. The nf-core framework for community-curated bioinformatics pipelines, 2020, Nature Biotechnology.

[21] Karen Kafadar, et al. Letter-Value Plots: Boxplots for Large Data, 2017.

[22] H. El‐Serag, et al. Hepatocellular carcinoma: epidemiology and molecular carcinogenesis, 2007, Gastroenterology.

[23] Sagar, et al. FateID infers cell fate bias in multipotent progenitors from single-cell RNA-seq data, 2017, Nature Methods.

[24] Thomas Brox, et al. 3D U-Net: Learning Dense Volumetric Segmentation from Sparse Annotation, 2016, MICCAI.

[25] Hao Chen, et al. The Liver Tumor Segmentation Benchmark (LiTS), 2019, Medical Image Anal.

[26] Jeffrey T Leek, et al. Reproducible RNA-seq analysis using recount2, 2017, Nature Biotechnology.

[27] M. Hutson. Artificial intelligence faces reproducibility crisis, 2018, Science.

[28] I. Kohane, et al. Big Data and Machine Learning in Health Care, 2018, JAMA.

[29] Hans Meine, et al. Automatic liver tumor segmentation in CT with fully convolutional neural networks and object-based postprocessing, 2018, Scientific Reports.

[30] Hong-Dong Li, et al. GTFtools: a Python package for analyzing various modes of gene models, 2018, bioRxiv.

[31] Loic A. Royer, et al. Applications, Promises, and Pitfalls of Deep Learning for Fluorescence Image Reconstruction, 2018.

[32] Jun S. Liu, et al. The Genotype-Tissue Expression (GTEx) pilot analysis: Multitissue gene regulation in humans, 2015, Science.

[33] Marieke L. Kuijjer, et al. Machine learning analysis of gene expression data reveals novel diagnostic and prognostic biomarkers and identifies therapeutic targets for soft tissue sarcomas, 2019, PLoS Comput. Biol.

[34] Weilai Chi, et al. Sparsity-Penalized Stacked Denoising Autoencoders for Imputing Single-Cell RNA-seq Data, 2020, Genes.

[35] Dirk Merkel, et al. Docker: lightweight Linux containers for consistent development and deployment, 2014.

[36] Feng Bai-ming. Research on error accumulative sum of single precision floating point, 2013.

[37] Mohammad Lotfollahi, et al. scGen predicts single-cell perturbation responses, 2019, Nature Methods.

[38] Fabian J Theis, et al. Single-cell RNA-seq denoising using a deep count autoencoder, 2019, Nature Communications.

[39] Odd Erik Gundersen, et al. State of the Art: Reproducibility in Artificial Intelligence, 2018, AAAI.

[40] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.

[41] Changming Sun, et al. RA-UNet: A Hybrid Deep Attention-Aware Network to Extract Liver and Tumor in CT Scans, 2018, Frontiers in Bioengineering and Biotechnology.

[42] Pearl Brereton, et al. Reproducibility of studies on text mining for citation screening in systematic reviews: Evaluation and checklist, 2017, J. Biomed. Informatics.

[43] Vincent A. Traag, et al. From Louvain to Leiden: guaranteeing well-connected communities, 2018, Scientific Reports.

[44] Tushar Gupta, et al. Crime detection and criminal identification in India using data mining techniques, 2014, AI & SOCIETY.

[45] Natalia Gimelshein, et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library, 2019, NeurIPS.

[46] Thomas Brox, et al. U-Net: Convolutional Networks for Biomedical Image Segmentation, 2015, MICCAI.

[47] L. Schwartz, et al. New response evaluation criteria in solid tumours: revised RECIST guideline (version 1.1), 2009, European journal of cancer.

[48] Rui Li, et al. Imputation of single-cell gene expression with an autoencoder neural network, 2018, bioRxiv.

[49] Michael I. Jordan, et al. Machine learning: Trends, perspectives, and prospects, 2015, Science.