Adaptive risk prediction system with incremental and transfer learning

Currently, popular methods for prenatal risk assessment of fetal aneuploidies are based on multivariate probabilistic models built on decades of scientific research and large-scale multi-center clinical studies. Once deployed to screening labs, these static models are rarely updated or adapted to local population characteristics. In this article, we propose an adaptive risk prediction system (ARPS) that accounts for these changing characteristics and automatically deploys updated risk models. Eight years of real-life Down syndrome screening data were used to develop, first, a distribution shift detection method that captures significant changes in the patient population and, second, a probabilistic risk modelling system that adapts to new data when such changes are detected. We tested several candidate systems that use transfer and incremental learning with different levels of plasticity. Windowed distribution shift detection provides a computationally cheaper alternative to fitting models at every data block step without sacrificing performance, provided that transfer learning is used. Deploying an ARPS to a lab requires careful consideration of the parameters for distribution shift detection and model updating, as they are affected by lab throughput and the incidence of the screened rare disorder. With these parameters set appropriately, an ARPS could also be used for other population screening problems. We demonstrate with a large real-life dataset that our best-performing novel Incremental-Learning-Population-to-Population-Transfer-Learning design achieves, without human intervention, prediction performance on par with a deployed risk screening algorithm that has been manually updated over several years.
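
The windowed shift detection described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the test statistic, so the two-sample Kolmogorov-Smirnov statistic, the significance level `alpha`, and the function names `ks_statistic`, `shift_detected` and `blocks_needing_update` are all assumptions chosen for illustration. The idea shown is only that a model update is triggered when a new data block differs significantly from a reference window, rather than at every block.

```python
import math
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def shift_detected(reference, window, alpha=0.05):
    """Compare the statistic with the asymptotic KS critical value (assumed test)."""
    n, m = len(reference), len(window)
    critical = math.sqrt(-0.5 * math.log(alpha / 2)) * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, window) > critical


def blocks_needing_update(blocks, alpha=0.05):
    """Return indices of data blocks where a shift would trigger a model update.

    The first block serves as the reference window; after a detected shift,
    the shifted block becomes the new reference (a hypothetical policy).
    """
    reference = blocks[0]
    updates = []
    for i, block in enumerate(blocks[1:], start=1):
        if shift_detected(reference, block, alpha):
            updates.append(i)
            reference = block
    return updates
```

In this sketch each block holds values of a single marker. A deployed system would need to choose the block size and `alpha` with lab throughput and disorder incidence in mind, as the abstract notes.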
