Knowledge transfer to enhance the performance of deep learning models for automated classification of B cell neoplasms

Summary: Multi-parameter flow cytometry (MFC) is a cornerstone of clinical decision-making for leukemia and lymphoma. MFC data analysis requires manual gating of cell populations, which is time-consuming, subjective, and often limited to a two-dimensional space. In recent years, deep learning models have been used successfully to analyze data in high-dimensional space with high accuracy. However, AI models used for disease classification with MFC data are limited to the panel they were trained on. A key challenge in deploying AI in routine diagnostics is therefore the robustness and adaptability of such models. This study demonstrates how transfer learning can be applied to boost the performance of models trained on smaller datasets acquired with different MFC panels. We trained models for four additional datasets by transferring the features learned by our base model. Our workflow increased the models' overall performance and, more prominently, improved the learning rate for small training sizes.
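The feature-transfer step described in the summary can be illustrated with a minimal sketch. It is not the authors' architecture or code: the layer sizes, marker counts, class counts, and the helper names `build_model` and `transfer` are all hypothetical, chosen only to show the general idea. Layers learned on a base panel are copied into a model for a new panel, while the panel-specific input layer and the class-specific output layer are re-initialized and trained on the smaller dataset.

```python
import random


def init_layer(n_in, n_out, rng):
    """Dense layer as a dict: small random weights, zero biases, trainable."""
    return {
        "w": [[rng.gauss(0.0, 0.1) for _ in range(n_out)] for _ in range(n_in)],
        "b": [0.0] * n_out,
        "trainable": True,
    }


def shape(layer):
    """Return (inputs, outputs) of a dense layer."""
    return (len(layer["w"]), len(layer["w"][0]))


def build_model(n_markers, hidden_sizes, n_classes, seed):
    """Stack dense layers: markers -> hidden layers -> classes (hypothetical)."""
    rng = random.Random(seed)
    sizes = [n_markers, *hidden_sizes, n_classes]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def transfer(base, target, freeze=True):
    """Copy hidden layers with matching shapes from base into target.

    The first layer (depends on the panel's markers) and the last layer
    (depends on the class set) are skipped, so they keep their fresh
    initialization. Returns the indices of the copied layers.
    """
    copied = []
    for i in range(1, len(target) - 1):
        if i < len(base) - 1 and shape(base[i]) == shape(target[i]):
            target[i]["w"] = [row[:] for row in base[i]["w"]]
            target[i]["b"] = base[i]["b"][:]
            target[i]["trainable"] = not freeze
            copied.append(i)
    return copied


# Assumed numbers: base panel with 11 markers and 9 classes,
# new panel with 8 markers and 5 classes.
base = build_model(n_markers=11, hidden_sizes=[32, 16], n_classes=9, seed=0)
target = build_model(n_markers=8, hidden_sizes=[32, 16], n_classes=5, seed=1)
copied = transfer(base, target)
print("copied layer indices:", copied)
```

In this sketch the copied layers are frozen by default. Passing `freeze=False` would instead fine-tune them together with the new input and output layers, and which option works better for a small dataset is an empirical question.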
