Automated mutual exclusion rules discovery for structured observational codes in echocardiography reporting

Structured reporting in medicine has been argued to support and enhance machine-assisted processing and communication of pertinent information. Retrospective studies have shown that structured echocardiography reports, constructed through point-and-click selection of finding codes (FCs), contain pairwise contradictory FCs (e.g., "No tricuspid regurgitation" and "Severe regurgitation"), which degrade report quality and reliability. In a prospective study, contradictions were detected automatically using an extensive rule set that encodes mutual exclusion patterns between FCs. Rule creation is a labor- and knowledge-intensive task that could benefit from automation. We propose a machine-learning approach that discovers mutual exclusion rules in a corpus of 101,211 structured echocardiography reports through semantic and statistical analysis. Ground truth is derived from the extensive, prospectively evaluated rule set. On the unseen test set, the F-measure (0.439) and an above-chance AUC (0.885) indicate that our approach can potentially support the manual rule creation process. On expert review, our methods also discovered previously unknown rules.
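
As an illustration of the statistical side of such an analysis, the following minimal sketch flags FC pairs that co-occur far less often than their individual frequencies would predict under independence. It is not the method evaluated in this work: the function name `candidate_exclusions`, the thresholds `min_count` and `max_ratio`, and the toy finding codes are all hypothetical and chosen only for the example.

```python
from collections import Counter
from itertools import combinations


def candidate_exclusions(reports, min_count=2, max_ratio=0.1):
    """Return FC pairs whose observed co-occurrence is far below the
    count expected if the two codes were selected independently.

    Hypothetical helper for illustration; thresholds are assumptions.
    """
    n = len(reports)
    code_counts = Counter()
    pair_counts = Counter()
    for report in reports:
        codes = sorted(set(report))  # de-duplicate, fix pair ordering
        code_counts.update(codes)
        pair_counts.update(combinations(codes, 2))

    # Only consider codes seen often enough to give usable evidence.
    frequent = sorted(c for c, k in code_counts.items() if k >= min_count)
    candidates = []
    for a, b in combinations(frequent, 2):
        expected = code_counts[a] * code_counts[b] / n
        observed = pair_counts[(a, b)]
        if observed / expected <= max_ratio:
            candidates.append((a, b, observed, expected))
    return candidates


# Toy corpus: each report is a list of (made-up) finding codes.
reports = [
    ["no_TR", "normal_LV"],
    ["severe_TR", "dilated_RV"],
    ["no_TR", "normal_LV"],
    ["severe_TR", "normal_LV"],
    ["no_TR", "dilated_RV"],
    ["severe_TR", "dilated_RV"],
]
pairs = candidate_exclusions(reports)
for a, b, observed, expected in pairs:
    print(f"{a} / {b}: observed={observed}, expected={expected:.2f}")
```

On this toy corpus the sketch flags the genuinely contradictory pair (`no_TR`, `severe_TR`) but also the pair (`dilated_RV`, `normal_LV`), which never co-occurs here only by chance. Co-occurrence statistics alone therefore yield candidates rather than rules, which is consistent with combining statistical with semantic analysis and with expert review.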