Semantic information extraction from Tamil documents

Semantic information extraction is the process of extracting concepts, entities, relations and entailment rules from a document. We propose an approach to extract concepts, entities and relations from a corpus of domain-specific Tamil text documents. Concepts are extracted by identifying semantic relationships between unique terms in the documents using the lexical database WordNet. The entity extraction process identifies proper names and assigns them to predefined named-entity categories. Relations between two entities, between two concepts, or between a concept and an entity are extracted from each sentence. The extracted semantic knowledge then serves as a resource for extracting semantic information from domain-related documents.
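The three extraction steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `HYPERNYMS` table is a hypothetical stand-in for a WordNet lookup, `GAZETTEER` is a hypothetical stand-in for the named-entity categories, the sample sentences are invented English placeholders rather than Tamil text, and relations are approximated by simple co-occurrence within a sentence, which the abstract does not specify.

```python
from itertools import combinations

# Hypothetical stand-in for a WordNet-style lexical database: term -> hypernym.
HYPERNYMS = {
    "rice": "crop",
    "wheat": "crop",
    "crop": "plant",
}

# Hypothetical gazetteer: proper name -> predefined named-entity category.
GAZETTEER = {
    "Chennai": "LOCATION",
    "Geetha": "PERSON",
}


def extract_concepts(terms):
    """Group unique terms by shared hypernym; keep hypernyms covering two or more terms."""
    groups = {}
    for term in sorted(set(terms)):
        parent = HYPERNYMS.get(term)
        if parent is not None:
            groups.setdefault(parent, []).append(term)
    return {parent: members for parent, members in groups.items() if len(members) >= 2}


def extract_entities(tokens):
    """Return (name, category) pairs for tokens found in the gazetteer."""
    return [(tok, GAZETTEER[tok]) for tok in tokens if tok in GAZETTEER]


def extract_relations(tokens, concept_terms):
    """Assumption: any two concept terms or entities in one sentence are related."""
    mentions = [tok for tok in tokens if tok in GAZETTEER or tok in concept_terms]
    return list(combinations(mentions, 2))


# Invented, pre-tokenized placeholder sentences.
sentences = [
    ["Geetha", "grows", "rice", "near", "Chennai"],
    ["wheat", "and", "rice", "are", "crops"],
]

all_terms = [tok for sent in sentences for tok in sent]
concepts = extract_concepts(all_terms)
concept_terms = {term for members in concepts.values() for term in members}
entities = [extract_entities(sent) for sent in sentences]
relations = [extract_relations(sent, concept_terms) for sent in sentences]
```

A real system for Tamil would also need morphological analysis before the lookup step, since inflected word forms would not match dictionary entries directly; that stage is omitted here.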

T. V. Geetha | S. Lakshmana Pandian | J. Devakumar
