Learning from Subjective Ratings Using Auto-Decoded Deep Latent Embeddings

Depending on the application, radiological diagnoses can be associated with high inter- and intra-rater variabilities. Most computer-aided diagnosis (CAD) solutions treat such data as incontrovertible, exposing learning algorithms to considerable and possibly contradictory label noise and biases. Thus, managing subjectivity in labels is a fundamental problem in medical imaging analysis. To address this challenge, we introduce auto-decoded deep latent embeddings (ADDLE), which explicitly models the tendencies of each rater using an auto-decoder framework. After a simple linear transformation, the latent variables can be injected into any backbone at one or more points, allowing the model to account for rater-specific effects on the diagnosis. Importantly, ADDLE does not expect multiple raters per image in training, meaning it can readily learn from data mined from hospital archives. Moreover, the complexity of training ADDLE does not increase as more raters are added. During inference, each rater can be simulated and a “mean” or “greedy” virtual rating can be produced. We test ADDLE on the problem of liver steatosis diagnosis from 2D ultrasound (US) by collecting 36 602 studies along with clinical US diagnoses originating from 65 different raters. We evaluate diagnostic performance using a separate dataset with gold-standard biopsy diagnoses. ADDLE can improve the partial areas under the curve (AUCs) for diagnosing severe steatosis by 10.5% over standard classifiers while outperforming other annotator-noise approaches, including those requiring 65 times the parameters.
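
The rater-conditioning idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the additive injection, the averaging of class probabilities for the "mean" virtual rating, the random (untrained) weights, and all names and sizes other than the 65 raters are assumptions made purely for illustration.

```python
import math
import random

random.seed(0)

NUM_RATERS = 65    # number of raters reported in the abstract
LATENT_DIM = 4     # hypothetical latent size
FEATURE_DIM = 8    # hypothetical backbone feature size
NUM_CLASSES = 4    # hypothetical number of diagnosis grades


def rand_matrix(rows, cols, scale=0.1):
    """Random stand-in for weights that would normally be learned."""
    return [[random.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Auto-decoder idea: one free latent vector per rater, optimized jointly
# with the network weights; no encoder infers the latent from the input.
rater_latents = rand_matrix(NUM_RATERS, LATENT_DIM)

# "Simple linear transformation" mapping a latent to the feature space.
inject = rand_matrix(FEATURE_DIM, LATENT_DIM)

# Stand-in for the layers of the backbone after the injection point.
classifier = rand_matrix(NUM_CLASSES, FEATURE_DIM)


def predict_as_rater(features, rater_id):
    """Class probabilities as if the given rater had read the image."""
    offset = matvec(inject, rater_latents[rater_id])
    # Assumption: the transformed latent is added to the backbone features.
    conditioned = [f + o for f, o in zip(features, offset)]
    return softmax(matvec(classifier, conditioned))


def mean_virtual_rating(features):
    """Assumed 'mean' rating: average the simulated raters' probabilities."""
    per_rater = [predict_as_rater(features, r) for r in range(NUM_RATERS)]
    return [sum(p[c] for p in per_rater) / NUM_RATERS for c in range(NUM_CLASSES)]


features = [0.5] * FEATURE_DIM  # placeholder for backbone features of one study
print(predict_as_rater(features, rater_id=0))
print(mean_virtual_rating(features))
```

In this sketch, adding a rater adds only one row to `rater_latents`, while the shared weights stay the same size. Each training example would need only the identifier of the single rater who labeled it, which is consistent with the claim that multiple raters per image are not required.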
