This research aims to collect HerbalMedicinalProperty relations extracted from downloaded herbal-plant documents in order to create a herbal-medicinal-property-network based representation. A HerbalMedicinalProperty relation is a semantic relation, expressed in text, between one herbal-plant-component concept and several herbal-medicinal-property concepts, and vice versa. A herbal-plant-component occurrence is a noun-phrase expression, and each herbal-medicinal-property-concept occurrence is an event expressed by the verb phrase of an EDU (an Elementary Discourse Unit, or a simple sentence). The herbal-medicinal-property-network based representation benefits recommendation systems that address health problems posted on web boards. The research involves two main problems: 1) how to extract HerbalMedicinalProperty relations from the documents, and 2) how to collect the extracted relations to create the herbal-medicinal-property-network based representation. We therefore propose applying a co-occurrence of N words (N-Word-Co), including learning the N-Word-Co size, to the verb phrase in order to identify the medicinal-property-concept EDU occurrences across the documents, after linguistic phenomena have been applied to resolve the herbal-plant-component concepts. The extracted HerbalMedicinalProperty relations are then collected as a matrix of herbal-plant names, herbal-plant components, and herbal-medicinal properties, from which the herbal-medicinal-property-network based representation is created. The results show that HerbalMedicinalProperty relations are extracted from the documents with high precision.
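The collection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that an N-Word-Co is simply the first N words of an EDU's verb phrase, and that the matrix can be held as a nested mapping from plant name to component to a set of properties. The function names (`n_word_co`, `build_property_matrix`, `components_with_property`) and all sample data (`plant_A`, `plant_B`, and their properties) are hypothetical placeholders.

```python
from collections import defaultdict


def n_word_co(verb_phrase_tokens, n):
    """Return an N-Word-Co as a tuple.

    Assumption: an N-Word-Co is the first n words of the EDU's verb phrase.
    """
    return tuple(verb_phrase_tokens[:n])


def build_property_matrix(relations):
    """Collect (plant, component, property) triples into a nested mapping:
    plant name -> plant component -> set of medicinal properties."""
    matrix = defaultdict(lambda: defaultdict(set))
    for plant, component, prop in relations:
        matrix[plant][component].add(prop)
    return matrix


def components_with_property(matrix, prop):
    """List every (plant, component) pair linked to the given property."""
    return sorted(
        (plant, component)
        for plant, components in matrix.items()
        for component, props in components.items()
        if prop in props
    )


# Hypothetical extracted relations; the names and properties are placeholders.
extracted = [
    ("plant_A", "leaf", n_word_co(["reduce", "fever", "quickly"], 2)),
    ("plant_A", "leaf", n_word_co(["relieve", "cough"], 2)),
    ("plant_A", "root", n_word_co(["reduce", "fever"], 2)),
    ("plant_B", "flower", n_word_co(["relieve", "cough", "at", "night"], 2)),
]
matrix = build_property_matrix(extracted)
```

A lookup such as `components_with_property(matrix, ("reduce", "fever"))` then returns the plant components that share a property, which is the kind of query a recommendation system built on the network would issue.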