Paper information - Identification and classification of relations for Indian languages using machine learning approaches for developing a domain specific ontology

Information extraction and classification with layered Natural Language Processing techniques, such as pre-processing and semantic analysis, make it possible to apply deeper evaluation techniques to the accuracy of natural-language-based electronic databases. This paper explores the extraction of relational information from the multilingual IndoWordNet database by matching it against domain-specific terms. The extracted information is then processed with conventional statistical methods, the Normalized Web Distance (NWD) similarity method, and two machine learning techniques, Support Vector Machine (SVM) and Neural Network (NN), and their accuracies are compared. The machine-learning-based techniques outperform the conventional methods with significantly improved accuracy. The objective of using these techniques together with semantic web technology is to provide a proof of concept for ontology generation through the identification and classification of relational information extracted from IndoWordNet. The paper also highlights domain-specific challenges and issues in developing a relational model of the ontology.
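The abstract names the Normalized Web Distance but does not state its formula. As a rough illustration, the sketch below uses the standard definition from the normalized information distance literature, NWD(x, y) = (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))). The function name, the argument names, and the counts in the usage lines are assumptions made for this example and are not taken from the paper.

```python
import math


def normalized_web_distance(fx, fy, fxy, n):
    """Normalized Web Distance computed from occurrence counts.

    fx, fy : number of documents containing term x / term y
    fxy    : number of documents containing both terms
    n      : total number of documents (normalising constant)

    All names are hypothetical; the paper does not specify an interface.
    """
    if fx <= 0 or fy <= 0:
        raise ValueError("each term must occur at least once")
    if n <= min(fx, fy):
        raise ValueError("n must exceed the smaller term count")
    if fxy <= 0:
        # Terms that never co-occur are treated as infinitely distant.
        return math.inf
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))


# Hypothetical counts for two domain terms in a 100000-document collection.
distance = normalized_web_distance(fx=1000, fy=100, fxy=10, n=100000)
print(f"NWD = {distance:.3f}")
```

Smaller values indicate terms that co-occur more often relative to their individual frequencies, so a threshold on this distance is one simple way to decide whether two terms are related.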

Megha Garg | Bhaskar Sinha | Somnath Chandra
