A multisite validation of whole slide imaging for primary diagnosis using standardized data collection and analysis

Context: Text-based reporting and manual arbitration for whole slide imaging (WSI) validation studies are labor intensive and do not allow for consistent, scalable, and repeatable data collection or analysis. Objective: The objective of this study was to establish a method of data capture and analysis using standardized codified checklists and predetermined synoptic discordance tables and to use these methods in a pilot multisite validation study. Methods and Study Design: Fifteen case report form checklists were generated from the College of American Pathology cancer protocols. Prior to data collection, all hypothetical pairwise comparisons were generated, and a level of harm was determined for each possible discordance. Four sites with four pathologists each generated 264 independent reads of 33 cases. Preestablished discordance tables were applied to determine site by site and pooled accuracy, intrareader/intramodality, and interreader intramodality error rates. Results: Over 10,000 hypothetical pairwise comparisons were evaluated and assigned harm in discordance tables. The average difference in error rates between WSI and glass, as compared to ground truth, was 0.75% with a lower bound of 3.23% (95% confidence interval). Major discordances occurred on challenging cases, regardless of modality. The average inter-reader agreement across sites for glass was 76.5% (weighted kappa of 0.68) and for digital it was 79.1% (weighted kappa of 0.72). Conclusion: These results demonstrate the feasibility and utility of employing standardized synoptic checklists and predetermined discordance tables to gather consistent, comprehensive diagnostic data for WSI validation studies. This method of data capture and analysis can be applied in large-scale multisite WSI validations.