A novel machine learning based approach for iPS progenitor cell identification

Identifying induced pluripotent stem (iPS) progenitor cells, the cells that go on to form iPS cells in the early stage of reprogramming, could provide valuable information for studying the origin and underlying mechanism of iPS cells. However, these cells are very difficult to identify experimentally: no biomarkers are known for early progenitor cells, and iPS cells can be identified experimentally with fluorescent probes only about 6 days after reprogramming is initiated. Moreover, progenitor cells make up less than 5% of cells during the early reprogramming period, which is too low a proportion to capture experimentally at that stage. In this paper, we propose a novel computational approach for identifying iPS progenitor cells based on machine learning and microscopic image analysis. First, we record the reprogramming process with a live-cell imaging system, beginning 48 hours after infection with retroviruses expressing Oct4, Sox2 and Klf4. iPS progenitor cells and normal murine embryonic fibroblasts (MEFs) within 3 to 5 days after infection are then labeled by retrospectively tracing the time-lapse microscopic images. Next, we calculate 11 types of cell morphological and motion features, such as area and speed, select the best time windows for modeling, and perform feature selection. Finally, we build a prediction model using XGBoost based on the six selected types of features and the best time windows. Our model tolerates several missing values or frames in the sample datasets, so it is applicable to a wide range of scenarios. Cross-validation, holdout validation and independent test experiments show that the minimum precision is above 52%; that is, more than 52% of the cells predicted to be progenitors within 3 to 5 days after viral infection are true progenitors. The results also confirm that the morphology and motion patterns of iPS progenitor cells differ from those of normal MEFs, which is what allows machine learning methods to identify iPS progenitor cells.
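
The feature-extraction and evaluation steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the track format, the function names `window_features` and `precision`, the frame interval, and the choice of mean area and mean speed as summary features are all assumptions made for illustration. Missing frames are represented as `None` and skipped, and a feature with no valid frames is returned as NaN, a value that gradient-boosted tree libraries such as XGBoost can accept as a missing entry.

```python
import math
from statistics import mean

# Hypothetical track format (an assumption, not from the paper):
# one entry per frame, each (x, y, area), or None if the cell was not detected.


def window_features(track, start, end, frame_interval_h=1.0):
    """Summarise area and speed over frames [start, end), skipping gaps."""
    frames = track[start:end]
    areas = [f[2] for f in frames if f is not None]
    speeds = []
    for prev, curr in zip(frames, frames[1:]):
        if prev is None or curr is None:
            continue  # speed is undefined across a missing frame
        dist = math.hypot(curr[0] - prev[0], curr[1] - prev[1])
        speeds.append(dist / frame_interval_h)
    nan = float("nan")
    return {
        "mean_area": mean(areas) if areas else nan,
        "mean_speed": mean(speeds) if speeds else nan,
    }


def precision(y_true, y_pred):
    """Fraction of predicted progenitors (label 1) that are true progenitors."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else float("nan")


# Toy track with one missing frame (illustrative values only).
track = [(0.0, 0.0, 100.0), (3.0, 4.0, 110.0), None, (6.0, 8.0, 120.0)]
features = window_features(track, 0, 4)

# Toy labels: 1 = progenitor, 0 = normal MEF.
toy_precision = precision([1, 0, 1, 0], [1, 1, 1, 0])
```

In this sketch a gap removes only the speed measurements that span it, while the area measurements from the surrounding frames are still used, which is one simple way for a window to tolerate a few missing frames.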
