Feature engineering using shallow parsing in argument classification of Persian verbs

Identifying a verb's dependents and assigning each of them a semantic role is a natural pre-processing step in applications such as machine translation (MT) and question answering (QA). In this paper, we present a feature set for classifying argument instances into thematic role classes such as “Agent” and “Patient”. The feature set consists mainly of language-specific features of the syntactic segments (chunks) of Persian sentences, which fall into three types: verb properties, chunk content, and the relation between the argument and the verb of the sentence. We train an instance-based classifier on our manually annotated dataset to select the appropriate semantic role for each chunk. The classifier chooses the best semantic role for each chunk independently, without considering interactions between chunks in a sentence. The results show that our feature set discriminates the thematic roles of arguments with an accuracy of about 81.9%, an improvement of about 18.8% over the baseline. Our dataset is freely available to researchers.
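
As a rough illustration of instance-based classification over chunk features, the sketch below stores labeled chunks in memory and assigns a new chunk the majority role of its nearest stored neighbors under a simple overlap distance. The feature names (`verb_lemma`, `chunk_head_pos`, `chunk_marker`, `position_to_verb`), their values, the role label `Recipient`, and the toy instances are all hypothetical and invented for illustration; they are not the paper's feature set or dataset, and the distance function is an assumption rather than the configuration used in the experiments.

```python
from collections import Counter

# Hypothetical feature names, loosely mirroring the three feature types:
# verb properties, chunk content, and the argument-verb relation.
FEATURES = ("verb_lemma", "chunk_head_pos", "chunk_marker", "position_to_verb")


def overlap_distance(a, b):
    """Count the features on which two chunk instances disagree."""
    return sum(1 for f in FEATURES if a[f] != b[f])


def classify(instance, memory, k=1):
    """Return the majority role among the k stored instances nearest to `instance`."""
    ranked = sorted(memory, key=lambda ex: overlap_distance(instance, ex[0]))
    votes = Counter(role for _, role in ranked[:k])
    return votes.most_common(1)[0][0]


def label_sentence(chunks, memory, k=1):
    """Label each chunk on its own, ignoring the roles given to other chunks."""
    return [classify(chunk, memory, k) for chunk in chunks]


# Toy memory of labeled chunks (invented values, not from the paper's dataset).
MEMORY = [
    ({"verb_lemma": "give", "chunk_head_pos": "N", "chunk_marker": "none",
      "position_to_verb": "before_far"}, "Agent"),
    ({"verb_lemma": "give", "chunk_head_pos": "N", "chunk_marker": "obj_marker",
      "position_to_verb": "before_near"}, "Patient"),
    ({"verb_lemma": "give", "chunk_head_pos": "N", "chunk_marker": "dative_prep",
      "position_to_verb": "before_near"}, "Recipient"),
    ({"verb_lemma": "see", "chunk_head_pos": "N", "chunk_marker": "none",
      "position_to_verb": "before_far"}, "Agent"),
]

# An unseen chunk: same marker and position as the stored "Patient" instance,
# but attached to a different verb.
QUERY = {"verb_lemma": "see", "chunk_head_pos": "N", "chunk_marker": "obj_marker",
         "position_to_verb": "before_near"}

if __name__ == "__main__":
    print(classify(QUERY, MEMORY))
```

Because `label_sentence` classifies every chunk separately, nothing prevents two chunks of the same sentence from receiving the same role, which is the independence assumption stated in the abstract.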
