Deep learning enables robust assessment and selection of human blastocysts after in vitro fertilization

Visual morphology assessment is routinely used to evaluate embryo quality and to select human blastocysts for transfer after in vitro fertilization (IVF). However, assessments vary between embryologists, and as a result the success rate of IVF remains low. To compensate for uncertainty in embryo quality, multiple embryos are often implanted, resulting in undesired multiple pregnancies and complications. Unlike other imaging fields, human embryology and IVF have not yet leveraged artificial intelligence (AI) for unbiased, automated embryo assessment. We postulated that an AI approach trained on thousands of embryos can reliably predict embryo quality without human intervention. We implemented an AI approach based on deep neural networks (DNNs) to select the highest-quality embryos, using a large collection of human embryo time-lapse images (about 50,000 images) from a high-volume fertility center in the United States. We developed a framework (STORK) based on Google's Inception model. STORK predicts blastocyst quality with an AUC of >0.98, generalizes well to images from other clinics outside the US, and outperforms individual embryologists. Using clinical data for 2182 embryos, we created a decision tree that integrates embryo quality and patient age to identify scenarios associated with pregnancy likelihood. Our analysis shows that the chance of pregnancy for an individual embryo varies from 13.8% (age ≥41 and poor quality) to 66.3% (age <37 and good quality), depending on automated blastocyst quality assessment and patient age. In conclusion, our AI-driven approach provides a reproducible way to assess embryo quality and uncovers new, potentially personalized strategies to select embryos.
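As an illustration of the AUC metric reported above, the following is a minimal sketch of the area under the ROC curve computed as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study; this is not the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """Return the AUC for binary labels (1 = positive, 0 = negative).

    Uses the pairwise-comparison definition: the fraction of
    (positive, negative) pairs in which the positive example has the
    higher score, counting ties as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = good-quality, 0 = poor-quality (illustrative only).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The quadratic pairwise loop is chosen for readability; for large image sets a rank-based computation or an established library routine would be the more practical choice.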