Polytomous diagnosis of ovarian tumors as benign, borderline, primary invasive or metastatic: development and validation of standard and kernel-based risk prediction models

Background: Hitherto, risk prediction models for preoperative ultrasound-based diagnosis of ovarian tumors were dichotomous (benign versus malignant). We develop and validate polytomous models (models that predict more than two events) to diagnose ovarian tumors as benign, borderline, primary invasive or metastatic invasive. The main focus is on how different types of models perform and compare.

Methods: A multi-center dataset containing 1066 women was used for model development and internal validation, whilst another multi-center dataset of 1938 women was used for temporal and external validation. Models were based on standard logistic regression and on penalized kernel-based algorithms (least squares support vector machines and kernel logistic regression). We used true polytomous models as well as combinations of dichotomous models based on the 'pairwise coupling' technique to produce polytomous risk estimates. Careful variable selection was performed, based largely on cross-validated c-index estimates. Model performance was assessed with the dichotomous c-index (i.e. the area under the ROC curve) and a polytomous extension, and with calibration graphs.

Results: For all models, between 9 and 11 predictors were selected. Internal validation was successful with polytomous c-indexes between 0.64 and 0.69. For the best model, dichotomous c-indexes were between 0.73 (primary invasive vs metastatic) and 0.96 (borderline vs metastatic). On temporal and external validation, overall discrimination performance was good with polytomous c-indexes between 0.57 and 0.64. However, discrimination between primary and metastatic invasive tumors decreased to near random levels. Standard logistic regression performed well in comparison with advanced algorithms, and combining dichotomous models performed well in comparison with true polytomous models. The best model was a combination of dichotomous logistic regression models. This model is available online.

Conclusions: We have developed models that successfully discriminate between benign, borderline, and invasive ovarian tumors. Methodologically, the combination of dichotomous models was an interesting approach to tackle the polytomous problem. Standard logistic regression models were not outperformed by regularized kernel-based alternatives, a finding to which the careful variable selection procedure will have contributed. The random discrimination between primary and metastatic invasive tumors on temporal/external validation demonstrated once more the necessity of validation studies.
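To illustrate how a combination of dichotomous models can yield polytomous risk estimates, the following is a minimal Python sketch of one pairwise coupling rule. The abstract does not state which coupling rule was used, so the closed form shown here (usually attributed to Price et al.) is an assumption chosen for its simplicity. The function name `couple_pairwise`, the class labels, and the six pairwise probabilities are hypothetical and are not taken from the study.

```python
from itertools import combinations

CLASSES = ["benign", "borderline", "primary_invasive", "metastatic"]


def couple_pairwise(r, classes, eps=1e-9):
    """Combine pairwise probabilities into one risk estimate per class.

    r[(a, b)] is the probability of class a given that the tumor is
    either a or b, as produced by a dichotomous model for that pair.
    Only one orientation per pair is needed; the reverse is 1 - r[(a, b)].

    Assumed coupling rule (not necessarily the one used in the study):
        p_a = 1 / (sum over b != a of 1 / r_ab  -  (k - 2)),
    followed by renormalization so the k risks sum to 1.
    """
    k = len(classes)

    def pair_prob(a, b):
        value = r[(a, b)] if (a, b) in r else 1.0 - r[(b, a)]
        # Clip away from 0 and 1 to avoid division by zero.
        return min(max(value, eps), 1.0 - eps)

    raw = {}
    for a in classes:
        inverse_sum = sum(1.0 / pair_prob(a, b) for b in classes if b != a)
        raw[a] = 1.0 / (inverse_sum - (k - 2))

    total = sum(raw.values())
    return {a: raw[a] / total for a in classes}


# Hypothetical outputs of the six dichotomous models for one patient.
pairwise = {
    ("benign", "borderline"): 0.80,
    ("benign", "primary_invasive"): 0.85,
    ("benign", "metastatic"): 0.95,
    ("borderline", "primary_invasive"): 0.55,
    ("borderline", "metastatic"): 0.75,
    ("primary_invasive", "metastatic"): 0.70,
}

risks = couple_pairwise(pairwise, CLASSES)
for name in CLASSES:
    print(f"{name}: {risks[name]:.3f}")
```

When the pairwise probabilities are mutually consistent, that is, when each equals p_a / (p_a + p_b) for some underlying risks p, this rule recovers those risks exactly. With inconsistent inputs, as is typical when the dichotomous models are fitted separately, the result is only an approximation, which is why iterative coupling schemes are often preferred in practice.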
