This report describes the UNITN system, a Part-Of-Speech Context Counter, which participated in SemEval-2010 Task 8: Multi-Way Classification of Semantic Relations Between Pairs of Nominals. Given a text annotated with part-of-speech tags, the system outputs a 20-feature vector representation of each sentence. The system's pipeline has three steps: first it estimates the position of the entities in the relation, then it estimates the semantic relation type by means of decision trees, and finally it predicts the semantic relation together with the entities' position. The system obtained good results in estimating the entities' position (F1=98.3%) but performed very poorly in relation classification (F1=26.6%), indicating that lexical and semantic information is essential in relation extraction. The system can be integrated into other systems or used for purposes other than relation extraction.
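As an illustration of what counting part-of-speech context can look like, the following minimal Python sketch counts coarse POS tags before, between, and after the two entities of a sentence. The tag groups, the three regions, the function names, and the example sentence are all assumptions made for illustration; they yield 15 features and are not the 20 features the system actually uses, which are not listed here.

```python
from collections import Counter

# Hypothetical coarse tag groups, chosen only for this sketch.
TAG_GROUPS = ("NOUN", "VERB", "ADP", "ADJ", "OTHER")


def coarse(tag):
    """Map any tag outside the assumed groups to OTHER."""
    return tag if tag in TAG_GROUPS else "OTHER"


def pos_context_counts(tagged, e1, e2):
    """Count coarse POS tags around two entities.

    tagged: list of (token, pos) pairs for one sentence.
    e1, e2: token indices of the two entities, with e1 < e2.
    Returns one count per tag group for each of three regions
    (before e1, between e1 and e2, after e2).
    """
    regions = (
        tagged[:e1],          # tokens before the first entity
        tagged[e1 + 1:e2],    # tokens between the two entities
        tagged[e2 + 1:],      # tokens after the second entity
    )
    vector = []
    for region in regions:
        counts = Counter(coarse(pos) for _, pos in region)
        vector.extend(counts[group] for group in TAG_GROUPS)
    return vector


# Made-up example sentence with entities "fire" (index 1) and "explosion" (index 6).
sentence = [
    ("The", "DET"), ("fire", "NOUN"), ("was", "VERB"), ("caused", "VERB"),
    ("by", "ADP"), ("the", "DET"), ("explosion", "NOUN"),
]
features = pos_context_counts(sentence, 1, 6)
```

A vector of this kind contains no lexical information at all, which is consistent with the observation above that POS context alone is not enough to classify the relation type.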