An artificial intelligent platform for live cell identification and the detection of cross-contamination.

Background: About 30% of cell lines are cross-contaminated or misidentified, which can invalidate experimental results and render therapeutic products unusable. Cell morphology is routinely observed under the microscope, and DNA sequencing analysis is performed periodically to verify cell line identity, but sequencing is costly, time-consuming, and labor-intensive. The purpose of this study was to construct a novel artificial intelligence (AI) technology for "cell face" recognition that can predict DNA-level identification labels using cell images alone.

Methods: Seven commonly used cell lines were cultured and co-cultured in pairs (8 categories in total) to simulate pure and cross-contaminated cells. Microscopy images were obtained and labeled with cell types according to the results of short tandem repeat (STR) profiling. About 2 million patch images were used for model training and testing. AlexNet was used to demonstrate the effectiveness of a convolutional neural network (CNN) in cell classification. To further improve the feasibility of detecting cross-contamination, a bilinear network for fine-grained identification was constructed. The specificity, sensitivity, and accuracy of the model were each tested by external validation. Finally, semantic segmentation of cells was conducted with DilatedNet.

Results: Cell texture and density were the influencing factors, and the bilinear convolutional neural network (BCNN) recognized them better than AlexNet. The BCNN achieved 99.5% accuracy in identifying the seven pure cell lines and 86.3% accuracy in detecting cross-contamination (mixtures of two of the seven cell lines). DilatedNet was applied to semantic segmentation for single-cell-level analysis and achieved an accuracy of 98.2%.

Conclusions: The deep CNN model proposed in this study can recognize small differences in cell morphology and achieved high classification accuracy.
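As an illustration of the evaluation metrics named in the Methods, the following minimal sketch shows one common way to derive per-class sensitivity and specificity (one-vs-rest) and overall accuracy from a multi-class confusion matrix. It is not taken from the study: the function names `one_vs_rest_metrics` and `overall_accuracy` are hypothetical, the one-vs-rest convention is an assumption, and the counts in `toy_confusion` are invented for demonstration only.

```python
def one_vs_rest_metrics(confusion, class_index):
    """Return (sensitivity, specificity) for one class, treating all others as negative.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    tp = confusion[class_index][class_index]
    fn = sum(confusion[class_index]) - tp  # true class missed
    fp = sum(confusion[i][class_index] for i in range(n_classes)) - tp  # wrongly assigned
    tn = total - tp - fn - fp
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


def overall_accuracy(confusion):
    """Fraction of all samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total if total else float("nan")


# Invented 3-class counts for illustration only (not data from the study).
toy_confusion = [
    [48, 1, 1],
    [2, 45, 3],
    [0, 4, 46],
]

if __name__ == "__main__":
    for k in range(len(toy_confusion)):
        sens, spec = one_vs_rest_metrics(toy_confusion, k)
        print(f"class {k}: sensitivity={sens:.3f}, specificity={spec:.3f}")
    print(f"overall accuracy={overall_accuracy(toy_confusion):.3f}")
```

Reporting sensitivity and specificity per class alongside overall accuracy helps when one category (for example, a mixed co-culture) is harder to recognize than the pure lines, since a single accuracy figure can hide that difference.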
