User-friendly, cloud-based, whole slide image segmentation

Convolutional neural networks, the state of the art for image segmentation, have been successfully applied to histology images by many computational researchers. However, the translatability of this technology to clinicians and biological researchers is limited by the complex and underdeveloped user interfaces of the code, as well as the extensive computer setup required. We have developed a tool for segmentation of whole slide images (WSIs) with an easy-to-use graphical user interface. Our tool runs a state-of-the-art convolutional neural network for segmentation of WSIs in the cloud. Our plugin is built on the open-source tool HistomicsTK by Kitware Inc. (Clifton Park, NY), which provides remote data management and viewing abilities for WSI datasets. The ability to access this tool over the internet will facilitate widespread use by computational non-experts. Users can easily upload slides to a server where our plugin is installed and perform human-in-the-loop segmentation analysis remotely. This tool is open source and can be adapted to segment any pathological structure. As a proof of concept, we have trained it to segment glomeruli from renal tissue images, achieving an F-score > 0.97 on holdout tissue slides.

INTRODUCTION

Recent advancements in machine learning techniques have attained previously unachievable accuracy for image analysis tasks. In particular, convolutional neural networks (CNNs)1, a form of deep learning2, have great potential for impactful applications in the segmentation of image structures. In the field of pathology, CNNs have been successfully utilized by many research groups for the segmentation of whole slide images (WSIs)3-6. However, thus far tools to segment WSIs have been complex to deploy and use, requiring use of the command line interface and computational expertise7-9.
Going beyond development, the target demographic for these tools is the pathologist or biological scientist, whose clinical workflow or research questions could leverage fast and accurate segmentation of relevant structures. To address this gap, we have developed a powerful tool for the segmentation of WSIs and deployed it as an easy-to-use plugin in a cloud-based WSI viewer. Upon completion, our code will be open-sourced and easily deployable on a remote server for use by the community over the web. Additionally, we will host a publicly available instance of our tool for the community; for the sake of security, this instance will only be accessible after approval of a user's account. This tool is an extension of our previous work H-AI-L6, where we showed that iterative annotation of WSIs significantly reduced the annotation burden. Like most works in computational digital pathology, H-AI-L found limited use in the community due to the complexities of installation. To address this, our new segmentation tool does not require the installation of any software on the user's local computer; all processing is handled on the remote server that hosts the web client. It produces computational annotations which are automatically displayed on top of the slide using the HistomicsTK interface10. The user can pan and zoom the slide as well as interact with the annotations by removing, correcting, or adding regions. See Figs. 1 and 2 for details about this graphical user interface. While our segmentation plugin is agnostic to tissue type or structure of interest, we have validated it by training a CNN model for the segmentation of glomeruli from renal tissue images. To make this tool more useful to the digital pathology community, we have also created a plugin for training new segmentation models.
Using our simple cloud-based interface, users can upload and annotate WSIs and train a segmentation network using their annotations (see Fig. 2). As in H-AI-L6, users can iteratively use the training and prediction plugins of our segmentation tool in an active learning framework to build up powerful segmentation models with minimal effort.

RESULTS & DISCUSSION

To assess the potential of our segmentation tool, we used a network model for glomeruli segmentation (trained using 768 WSIs) to segment 100 holdout WSIs of diverse stain, institution, scanner, and species. The 100 holdout slides included 3816 glomeruli, 37.8 GB of compressed image data, and a combined total of more than 0.24 trillion image pixels. We compared the computationally generated annotations with hand annotations for glomeruli and observed the following performance (Fig. 3B):

F-score=0.97 | MCC=0.97 | Kappa=0.97 | IOU=0.941 | Sensitivity=0.953 | Specificity=1.0 | Precision=0.988 | Accuracy=1.0

To our knowledge, our study of glomeruli segmentation not only uses the largest and most diverse cohort of slides, but also reports the best performance of any study in the literature. In our previous work H-AI-L6 we trained Deeplabv211 using a dataset of 13 PAS and H&E stained mouse WSIs containing 913 glomeruli, and achieved an F-score=0.92. Kannan et al.12 (who used Inception-V313 for the sliding-window classification of glomeruli with a training set of 885 patches from 275 trichrome stained biopsies) report MCC = 0.628. Bueno et al.14 trained U-net5 with 47 PAS stained WSIs and reported accuracy=0.98. Gadermayr et al.15 used 24 PAS stained mouse WSIs to train U-net5, reporting precision=0.97 and sensitivity=0.86. Of all the works in the literature, Jayapandian et al.16 present the most comprehensive results on glomerular segmentation. They trained U-net5 on a dataset containing 1196 glomeruli from 459 human WSIs stained with H&E, PAS, Silver and Trichrome, reporting an F-score of 0.94.
However, their analysis is limited to data with minimal change disease17, which, as the name describes, presents pathologically as normal glomeruli. In contrast, our training dataset contained 768 WSIs, stained with H&E, PAS, Silver, Trichrome, Toluidine Blue, CD-68, Verhoeff's Van Gieson, Jones, and Congo red. In total this dataset contains 61734 glomeruli, from nearly 50 disease pathologies with both human and mouse data. Our holdout dataset was split at the slide level, and contains mouse and human data from different institutions, scanners, and stains, with multiple disease pathologies present. Examples of holdout glomeruli are shown in Fig. 4.

Our segmentation tool works natively on WSIs without the need for patch extraction prior to training or prediction. It produces segmentations which contain the contours of detected regions for the whole WSI. When developing new segmentation models, the slide-viewing environment of our tool enables rapid qualitative evaluation of algorithm progress by displaying network predictions directly on the slides as a series of annotation contours (Fig. 1). With the ability to correct computationally generated annotations on holdout slides, it is easy to add corrected slides to the training dataset (Fig. 2).

Our tool uses hardware acceleration on the host server to speed up processing, and is capable of segmenting large histology slides in as little as 1 min. The segmentation time depends (roughly linearly) on the size of the tissue section in the slide; Fig. 3A quantifies the detection speed as a function of image pixels on a large cohort of 1591 WSIs. Our algorithm performs a fast thresholding of the tissue region contained within the slide to reduce the computational burden on slides with large non-tissue areas. There is a slight programmatic overhead when opening and caching larger slides, seen as a gentle upslope of points of the same color in Fig. 3A.
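The tissue thresholding step described above can be illustrated with a minimal sketch. The text does not specify the thresholding method, so everything here is an assumption made for illustration: a simple brightness cutoff on a low-resolution RGB thumbnail, the hypothetical function names `tissue_mask` and `tissue_tiles`, and the parameter values. It is not the tool's actual implementation.

```python
import numpy as np


def tissue_mask(thumbnail_rgb, background_cutoff=220):
    """Return a boolean mask that is True where the thumbnail likely shows tissue.

    Assumption: slide background is near white, so pixels whose mean RGB
    intensity falls below `background_cutoff` are treated as tissue.
    """
    gray = thumbnail_rgb.astype(np.float32).mean(axis=2)
    return gray < background_cutoff


def tissue_tiles(mask, tile=4, min_fraction=0.05):
    """List (row, col) origins of tile x tile blocks that contain enough tissue.

    Only these blocks would be passed on to the segmentation network;
    blocks that are almost entirely background are skipped.
    """
    height, width = mask.shape
    keep = []
    for r in range(0, height, tile):
        for c in range(0, width, tile):
            block = mask[r:r + tile, c:c + tile]
            if block.mean() >= min_fraction:
                keep.append((r, c))
    return keep


# Toy 8x8 "thumbnail": white background with a stained region in one corner.
thumb = np.full((8, 8, 3), 255, dtype=np.uint8)
thumb[0:4, 0:4] = (180, 90, 160)

mask = tissue_mask(thumb)
tiles = tissue_tiles(mask)
```

In this toy example only one of the four blocks contains tissue, so three quarters of the image would be skipped. This kind of saving is why such a step helps on slides with large non-tissue areas.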
We have found that using this tool to alternate iteratively between training and prediction greatly reduces the annotation burden, allowing experts to correct the network predictions on holdout WSIs before incorporating them into the training set6. When selecting new data to add to the training set, we found that the ability to view predictions interactively on the WSI is extremely helpful for identifying slides where the current model struggles. Practically, we have found that the performance characteristics of our tool (a very high specificity in glomerular segmentation, Fig. 3) are very favorable when deployed in a human-in-the-loop setting. This enables the user to add missing annotations without needing to remove many false positive predictions. Indeed, when building up our glomerular training set, we found that the network quickly began to identify whole glomeruli, missing those with severe disease pathology and abnormal staining.

Annotation done directly on the WSI in an interactive viewing environment fits easily into the pathologist workflow, and the cloud-based nature of our tool abstracts any computational overhead away from the end user. Annotation can be done on any internet-connected device, including a mobile phone, without any software installation. If the user prefers to annotate locally, we have added options to ingest and export annotations in an XML18 format readable by the commonly used WSI viewer Aperio Imagescope19.

The authors note a complementary work: Quick Annotator20, a recently published tool which speeds the annotation of histology slides using the locally installed QuPath slide viewer21. This tool uses superpixels22 and deep learning to speed the segmentation of local regions of interest within a WSI. In the future we would like to utilize a similar approach combined with edge detection and snapping23 to speed the initial segmentation by human annotators.
We also note that, at the time of writing this preprint, we are working to incorporate our core segmentation code into QuPath. A video overview of a beta version of the tool is available here: https://buffalo.box.com/s/w9ao5p1qs9o3lgyqk8ioih6grklb9p9i.

METHODS

With the goal of developing a tool with class-leading WSI segmentation accuracy as well as easy accessibility to computational non-experts, we have integrated the popular semantic segmentation network Deeplab V3+24 with Digital Slide Archive10, the open-source cloud-based his

[1]  Sanghoon Lee,et al.  The Digital Slide Archive: A Software Platform for Management, Integration, and Analysis of Histology for Cancer Research. , 2017, Cancer research.

[2]  Harry Shum,et al.  Lazy snapping , 2004, ACM Trans. Graph..

[3]  Vijaya B. Kolachalama,et al.  Segmentation of Glomeruli Within Trichrome Images Using Deep Learning , 2018, bioRxiv.

[4]  Rabi Yacoub,et al.  Unsupervised labeling of glomerular boundaries using Gabor filters and statistical testing in renal histology , 2017, Journal of medical imaging.

[5]  V. D’Agati,et al.  Adult minimal-change disease: clinical characteristics, treatment, and outcomes. , 2007, Clinical journal of the American Society of Nephrology : CJASN.

[6]  Rabi Yacoub,et al.  Computational Segmentation and Classification of Diabetic Glomerulosclerosis. , 2019, Journal of the American Society of Nephrology : JASN.

[7]  Yu Zhou,et al.  Quick Annotator: an open‐source digital pathology based rapid image annotation tool , 2021, The journal of pathology. Clinical research.

[8]  Allen H. Olson.  Image Analysis Using the Aperio ScanScope , 2005 .

[9]  Orcun Goksel,et al.  Reducing annotation effort in digital pathology: A Co-Representation learning framework for classification tasks , 2020, Medical Image Anal..

[10]  IEEE conference on computer vision and pattern recognition , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[11]  Christine Decaestecker,et al.  Strategies to Reduce the Expert Supervision Required for Deep Learning-Based Segmentation of Histopathological Images , 2019, Front. Med..

[12]  Peter Bankhead,et al.  QuPath: Open source software for digital pathology image analysis , 2017 .

[13]  Madhu S. Nair,et al.  Batch Mode Active Learning on the Riemannian Manifold for Automated Scoring of Nuclear Pleomorphism in Breast Cancer , 2020, Artif. Intell. Medicine.

[14]  Oscar Déniz-Suárez,et al.  Glomerulosclerosis identification in whole slide images using semantic segmentation , 2019, Comput. Methods Programs Biomed..

[15]  Pinaki Sarder,et al.  Artificial intelligence driven next-generation renal histomorphometry. , 2020, Current opinion in nephrology and hypertension.

[16]  Iasonas Kokkinos,et al.  DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[17]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[18]  John E. Tomaszewski,et al.  An integrated iterative annotation technique for easing neural network training in medical image analysis , 2019, Nat. Mach. Intell..

[19]  Guigang Zhang,et al.  Deep Learning , 2016, Int. J. Semantic Comput..