Comparison and optimization of machine learning methods for automated classification of circulating tumor cells

Advances in rare cell capture technology have made it possible to interrogate circulating tumor cells (CTCs) captured from whole patient blood. However, locating captured cells in the device by manual counting is a bottleneck in data processing: it is tedious (hours per sample), and it compromises the results because it is inconsistent and prone to user bias. Recent work has sought to automate cell location and classification, employing image processing and machine learning (ML) algorithms to locate and classify cells in fluorescent microscope images. However, the choice of ML method is a part of the design space that has not been thoroughly explored. We therefore trained four ML algorithms on three different datasets. The trained ML algorithms locate and classify thousands of possible cells in a few minutes rather than a few hours, representing an order of magnitude increase in processing speed. Furthermore, some algorithms have a significantly (P < 0.05) higher area under the receiver operating characteristic curve than others. Additionally, significant (P < 0.05) losses in performance occur when training on cell lines and testing on CTCs (and vice versa), indicating the need to train on a system that is representative of future unlabeled data. Optimal algorithm selection depends on the peculiarities of the individual dataset, indicating the need for careful comparison and optimization of algorithms for each image classification task. © 2016 International Society for Advancement of Cytometry
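
The comparison of classifiers by area under the receiver operating characteristic curve (AUC) can be illustrated with a minimal sketch. The abstract does not say which statistical test produced the reported P values, so the paired bootstrap below is an assumption chosen for illustration, not the authors' procedure. The function names `auc` and `bootstrap_auc_difference` are hypothetical, as are the parameters `n_boot` and `seed`. Labels are assumed to be 1 for a CTC and 0 for anything else, and the two score lists are assumed to come from two classifiers evaluated on the same candidate cells.

```python
import random


def auc(labels, scores):
    """AUC via the Mann-Whitney statistic: the fraction of (positive, negative)
    pairs in which the positive scores higher; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_difference(labels, scores_a, scores_b, n_boot=2000, seed=0):
    """Paired bootstrap of AUC(A) - AUC(B) on the same test cells.

    Returns the observed difference and a 95% percentile interval.
    An interval that excludes 0 suggests the two classifiers differ.
    """
    rng = random.Random(seed)
    n = len(labels)
    observed = auc(labels, scores_a) - auc(labels, scores_b)
    diffs = []
    while len(diffs) < n_boot:
        # Resample cells with replacement, keeping both scores paired per cell.
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if 0 < sum(y) < n:  # skip resamples that contain only one class
            a = [scores_a[i] for i in idx]
            b = [scores_b[i] for i in idx]
            diffs.append(auc(y, a) - auc(y, b))
    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[int(0.975 * n_boot) - 1]
    return observed, (lower, upper)
```

Resampling cells in pairs keeps the correlation between the two classifiers' scores on the same cell, which is why the interval is computed on the difference rather than on each AUC separately. The same function could, under the same assumptions, be applied to one classifier trained on cell lines and another trained on CTCs to probe the train/test mismatch the abstract describes.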
