A finite-state automata based negation detection algorithm for Chinese clinical documents

In this paper we describe an algorithm called NegDetector that locates clinical terms of interest in narrative electronic clinical documents and determines whether each occurrence of a term, wherever it appears, is negated or affirmed. The algorithm infers the status of a condition (negated or affirmed) from simple lexical cues in the context of the condition, which may lie more than a few words away from the term. Because negative structures are diverse, we select typical, common, and recognizable usage patterns of negatives as the criteria of judgment. Within one complete pass, the judgments are driven by many different types of symbols, and the response to a particular symbol depends on the sequence of previous judgments. A finite-state automaton is well suited to this situation, in which many symbols trigger one another. Evaluating NegDetector on test case histories, we measured a recall of 0.9985, a precision of 0.9498, and a fallout of 0.5147.
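To illustrate how a finite-state automaton can make the response to one symbol depend on the symbols seen before it, the following is a minimal sketch in Python and not the NegDetector implementation itself. It assumes the text has already been segmented into tokens; the two states, the symbol classes, the cue list (`NEGATION_CUES`), the scope terminators (`SCOPE_ENDERS`), and the function names are all hypothetical choices made for this example.

```python
from enum import Enum, auto


class State(Enum):
    AFFIRMED = auto()  # no negation cue is currently in scope
    NEGATED = auto()   # a negation cue has been seen and is still in scope


# Illustrative lexicons only (assumptions, not the paper's actual patterns).
NEGATION_CUES = {"无", "否认", "未见"}    # e.g. "no", "denies", "not seen"
SCOPE_ENDERS = {"，", "。", "；", "但"}    # punctuation or "but" closes the scope


def classify(token, terms):
    """Map a token to one of four symbol classes read by the automaton."""
    if token in NEGATION_CUES:
        return "NEG"
    if token in SCOPE_ENDERS:
        return "END"
    if token in terms:
        return "TERM"
    return "OTHER"


# Transition table; any (state, symbol) pair not listed keeps the current state.
TRANSITIONS = {
    (State.AFFIRMED, "NEG"): State.NEGATED,
    (State.NEGATED, "END"): State.AFFIRMED,
}


def detect(tokens, terms):
    """Return (term, is_negated) for every term of interest, in text order."""
    state = State.AFFIRMED
    results = []
    for token in tokens:
        symbol = classify(token, terms)
        if symbol == "TERM":
            # The judgment for a term depends on the state reached so far.
            results.append((token, state is State.NEGATED))
        state = TRANSITIONS.get((state, symbol), state)
    return results


if __name__ == "__main__":
    # "The patient denies fever and cough, has chest pain."
    tokens = ["患者", "否认", "发热", "、", "咳嗽", "，", "有", "胸痛", "。"]
    for term, negated in detect(tokens, {"发热", "咳嗽", "胸痛"}):
        print(term, "negated" if negated else "affirmed")
```

In this sketch the cue "否认" negates both "发热" and "咳嗽" even though the second term is several tokens away, and the comma returns the automaton to the affirmed state before "胸痛" is read. A realistic pattern set would need more states and symbol classes, for example for cues that follow the term they negate.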