A New Classification of Benign, Premalignant, and Malignant Endometrial Tissues Using Machine Learning Applied to 1413 Candidate Variables

Supplemental Digital Content is available in the text. Benign normal (NL), premalignant (endometrial intraepithelial neoplasia, EIN), and malignant (cancer, EMCA) endometria must be precisely distinguished for optimal management. EIN was previously defined objectively by a regression model that incorporates manually traced histologic variables to predict clonal growth and cancer outcomes. Results from that early computational study were used to revise the subjective endometrial precancer diagnostic criteria currently in use. Here we use automated feature segmentation and updated machine learning algorithms to develop a new classification algorithm. Endometrial tissue from 148 patients was randomly separated into a 72-patient training cohort and a 76-patient validation cohort, each encompassing all 3 diagnostic classes. We applied image analysis software to keratin-stained endometrial tissues to automatically segment whole-slide digital images into epithelium, cells, and nuclei and to extract the corresponding variables. A total of 1413 variables were culled to 75 based on random forest classification performance in a 3-group (NL, EIN, EMCA) model. The resulting algorithm classifies cases with 3-class error rates of 0.04 (training set) and 0.058 (validation set), and 2-class (NL vs. EIN+EMCA) error rates of 0.016 (training set) and 0 (validation set). The 4 most heavily weighted variables are surrogates of those previously identified in manual-segmentation machine learning studies (stromal and epithelial area percentages, and normalized epithelial surface lengths). Lesser-weighted predictors include gland and lumen axis lengths and ratios, and individual cell measures. Automated image analysis and random forest classification algorithms can classify normal, premalignant, and malignant endometrial tissues. The most predictive variables overlap with those discovered independently in early models based on manual segmentation.
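The relationship between the 3-class and 2-class error rates reported above can be illustrated with a minimal sketch. It assumes that the 2-class rate is obtained by collapsing EIN and EMCA into a single group before comparing predicted and true labels, which the abstract does not state explicitly. The function names and the toy labels are hypothetical and are not the study's data.

```python
from typing import Sequence


def error_rate(truth: Sequence[str], predicted: Sequence[str]) -> float:
    """Fraction of cases whose predicted label differs from the true label."""
    if len(truth) != len(predicted) or len(truth) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")
    wrong = sum(t != p for t, p in zip(truth, predicted))
    return wrong / len(truth)


def collapse(label: str) -> str:
    """Map the 3 diagnostic classes onto 2 groups: NL vs. EIN+EMCA (assumed)."""
    return "NL" if label == "NL" else "EIN+EMCA"


def two_class_error_rate(truth: Sequence[str], predicted: Sequence[str]) -> float:
    """Error rate after merging EIN and EMCA into one group."""
    return error_rate([collapse(t) for t in truth],
                      [collapse(p) for p in predicted])


# Toy labels (hypothetical): two cases are confused between EIN and EMCA.
truth = ["NL", "NL", "EIN", "EIN", "EMCA", "EMCA"]
predicted = ["NL", "NL", "EIN", "EMCA", "EMCA", "EIN"]

three_class = error_rate(truth, predicted)
two_class = two_class_error_rate(truth, predicted)
print(f"3-class error: {three_class:.3f}, 2-class error: {two_class:.3f}")
```

In this toy case every misclassification is a confusion between EIN and EMCA, so the errors disappear once the two classes are merged. Under the stated assumption, this is the kind of pattern that would let a 2-class error rate fall below the corresponding 3-class rate.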
