UNITN: Part-Of-Speech Counting in Relation Extraction

This report describes the UNITN system, a Part-Of-Speech Context Counter, which participated in SemEval-2010 Task 8: Multi-Way Classification of Semantic Relations Between Pairs of Nominals. Given a text annotated with Part-Of-Speech tags, the system outputs a vector representation of each sentence containing 20 features in total. The system's pipeline has three steps: first it estimates the entities' position in the relation, then it estimates the semantic relation type by means of decision trees, and finally it predicts the semantic relation together with the entities' position. The system obtained good results in the estimation of the entities' position (F1=98.3%) but critically poor performance in relation classification (F1=26.6%), indicating that lexical and semantic information is essential in relation extraction. The system can be integrated into other systems or used for purposes other than relation extraction.
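To make the idea of a Part-Of-Speech context counter concrete, the following is a minimal sketch of how a sentence could be turned into a fixed-length vector of tag counts around two entities. It is an illustration only: the tag inventory `TAGS`, the three context regions (before, between, after), and the function name `pos_context_counts` are assumptions made for this example, and the resulting 15 features are not the 20 features used by the UNITN system, which this abstract does not list.

```python
from collections import Counter

# Hypothetical coarse tag inventory, chosen for illustration only.
TAGS = ["NOUN", "VERB", "ADP", "ADJ", "DET"]


def pos_context_counts(tagged, e1_index, e2_index):
    """Count POS tags before, between, and after two entity tokens.

    tagged: list of (token, tag) pairs for one sentence.
    e1_index, e2_index: positions of the two entity tokens.
    Returns a flat list of length 3 * len(TAGS).
    """
    lo, hi = sorted((e1_index, e2_index))
    # The entity tokens themselves are excluded from every region.
    regions = [tagged[:lo], tagged[lo + 1:hi], tagged[hi + 1:]]
    vector = []
    for region in regions:
        counts = Counter(tag for _, tag in region)
        vector.extend(counts[tag] for tag in TAGS)
    return vector


sentence = [
    ("The", "DET"), ("cup", "NOUN"), ("contains", "VERB"),
    ("hot", "ADJ"), ("tea", "NOUN"), (".", "PUNCT"),
]
features = pos_context_counts(sentence, 1, 4)
print(features)
```

A vector of this kind could then be passed to a decision-tree learner, as the pipeline describes. Note that it carries no lexical information, which is consistent with the weak relation-classification result reported above.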

Fabio Celli
