Relation extraction from Traditional Chinese Medicine journal publication

Today, the volume of digital text documents is enormous and covers almost every field and industry. Natural Language Processing (NLP) is concerned with systematically deriving information from text written in natural language. One NLP task, Relation Extraction (RE), focuses on identifying relations expressed in natural-language text. It has found significant application in biomedical publications, where it has been used to identify protein-to-protein interactions and gene-to-disease relationships. Such techniques can also be applied effectively to Traditional Chinese Medicine (TCM) publications. This research identifies two forms of relations in TCM publications: Effect Relation and Conditional Effect Relation. It introduces and compares two extraction approaches, both of which also address some Chinese-specific NLP problems, such as word segmentation and flexible syntactic structure.