Using a Natural Language Processing System to Extract and Code Family History Data from Admission Reports

We developed a rule-based natural language processing (NLP) system for extracting and coding clinical data from free-text reports. We studied the system's ability to accurately extract and code family history data from hospital admission notes. The system searches the family history section for 12 diseases and for the degree of relation of the affected relative. It achieved a sensitivity of 0.96 and a positive predictive value (PPV) of 0.97 for disease extraction, and 0.96 and 0.93, respectively, for relative categorization.
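
As an illustration of the kind of pipeline described above, the following is a minimal sketch of rule-based extraction followed by the two reported metrics. It is not the authors' system: the disease and relative lexicons, the sentence-level matching rule, and the names `DISEASES`, `RELATIVES`, `extract_family_history`, and `sensitivity_and_ppv` are all assumptions made for the example, and the sketch ignores negation ("no family history of ..."), which a real system would have to handle. Sensitivity is computed as TP / (TP + FN) and PPV as TP / (TP + FP).

```python
import re

# Hypothetical lexicons for illustration only; the 12 diseases and the
# actual rules used by the system are not given in the abstract.
DISEASES = {
    "diabetes": r"\bdiabetes\b",
    "stroke": r"\bstroke\b",
    "breast cancer": r"\bbreast cancer\b",
}
RELATIVES = {
    "first": r"\b(mother|father|sister|brother|son|daughter)\b",
    "second": r"\b(grandmother|grandfather|aunt|uncle|niece|nephew)\b",
}


def extract_family_history(text):
    """Return a set of (disease, relative_degree) pairs found in the text.

    Assumed rule: a disease is attributed to every relative degree
    mentioned in the same sentence, or to "unknown" if none is mentioned.
    """
    findings = set()
    for sentence in re.split(r"[.;]\s*", text.lower()):
        degrees = [d for d, pat in RELATIVES.items() if re.search(pat, sentence)]
        for disease, pat in DISEASES.items():
            if re.search(pat, sentence):
                for degree in degrees or ["unknown"]:
                    findings.add((disease, degree))
    return findings


def sensitivity_and_ppv(predicted, gold):
    """Compute sensitivity (recall) and PPV (precision) over two sets."""
    tp = len(predicted & gold)   # found by the system and in the reference
    fp = len(predicted - gold)   # found by the system but not in the reference
    fn = len(gold - predicted)   # in the reference but missed by the system
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    return sensitivity, ppv


if __name__ == "__main__":
    note = "Mother had diabetes. Uncle died of a stroke."
    # Hypothetical manually coded reference standard for this note.
    gold = {("diabetes", "first"), ("stroke", "second"), ("breast cancer", "first")}
    predicted = extract_family_history(note)
    print(sorted(predicted))
    print(sensitivity_and_ppv(predicted, gold))
```

In this toy run the extractor finds two of the three reference findings and produces no spurious ones, so PPV is perfect while sensitivity is penalized for the missed finding. The same comparison against a manually coded reference standard is what the figures in the abstract summarize.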