Semantic role induction in Persian: An unsupervised approach by using probabilistic models

Semantic roles describe the relation between a predicate (typically a verb) and its arguments. Semantic role labeling is a natural language processing task that extracts these relations from sentences. Applications such as machine translation and question answering benefit from this level of semantic analysis. Because creating semantic role-annotated data is an obstacle to developing supervised learning systems, we present a novel unsupervised approach to the semantic role induction task. In our approach, which is formulated as a clustering method, the argument instances of a verb are clustered into semantic role classes specific to that verb. We present a Bayesian model for learning argument structure from unannotated text and estimate the model parameters using the expectation-maximization method. Clustering the argument instances of a verb that share semantic and syntactic similarities is a promising approach to unsupervised learning of their semantic roles. The only linguistic knowledge used to link argument instances to semantic clusters is extracted from a verb valency lexicon. Our evaluation on Persian shows that, on both small and large training datasets, our system outperforms a strong baseline proposed by ([Lang and Lapata 2010][1]), whose idea we implemented for Persian. We use purity and inverse purity to assess the quality of the proposed semantic role clustering method. The results show improvements of about 9.73% and 1.65% on the small dataset and 2.85% and 0.67% on the large dataset in purity and inverse purity, respectively.

[1]: #ref-20
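As a rough illustration of the two evaluation measures named in the abstract, the following Python sketch computes purity and inverse purity for one set of argument instances. It uses the common cluster-evaluation definitions (each induced cluster is credited with its most frequent gold role, and each gold role with its most frequent induced cluster); the abstract does not spell out the exact formulation or how scores are aggregated across verbs, so the function names, the label strings, and the toy data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def purity(induced, gold):
    """Share of instances covered by the majority gold role of their induced cluster.

    `induced` and `gold` are parallel sequences: induced[i] is the cluster
    assigned to argument instance i, gold[i] is its gold-standard role.
    """
    if len(induced) != len(gold):
        raise ValueError("induced and gold must have the same length")
    if not induced:
        return 0.0
    # For every induced cluster, count how often each gold role occurs in it.
    roles_per_cluster = defaultdict(Counter)
    for cluster, role in zip(induced, gold):
        roles_per_cluster[cluster][role] += 1
    majority_total = sum(max(c.values()) for c in roles_per_cluster.values())
    return majority_total / len(induced)


def inverse_purity(induced, gold):
    """Same computation with the roles of induced clusters and gold roles swapped."""
    return purity(gold, induced)


# Hypothetical toy data: six argument instances of a single verb.
induced = ["c1", "c1", "c1", "c2", "c2", "c3"]
gold = ["A0", "A0", "A1", "A1", "A1", "A0"]

print(purity(induced, gold))          # 5 of 6 instances match their cluster's majority role
print(inverse_purity(induced, gold))  # 4 of 6 instances fall in their role's dominant cluster
```

The two measures pull in opposite directions: placing every instance in its own cluster maximizes purity, while placing all instances in one cluster maximizes inverse purity, which is why they are reported together.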

[1]  Mirella Lapata,et al.  Using Semantic Roles to Improve Question Answering , 2007, EMNLP.

[2]  John B. Lowe,et al.  The Berkeley FrameNet Project , 1998, ACL.

[3]  Daniel Jurafsky,et al.  Automatic Labeling of Semantic Roles , 2002, CL.

[4]  Mirella Lapata,et al.  Cross-lingual Annotation Projection for Semantic Roles , 2009, J. Artif. Intell. Res.

[5]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .

[6]  Xavier Carreras,et al.  Semantic Role Labeling: An Introduction to the Special Issue , 2008, Computational Linguistics.

[7]  Christopher D. Manning,et al.  Unsupervised Discovery of a Statistical Verb Lexicon , 2006, EMNLP.

[8]  Fernando Pereira,et al.  Online Learning of Approximate Dependency Parsing Algorithms , 2006, EACL.

[9]  Ari Rappoport,et al.  Unsupervised Argument Identification for Semantic Role Labeling , 2009, ACL.

[10]  Masood Ghayoomi,et al.  Challenges in Developing Persian Corpora from Online Resources , 2009, 2009 International Conference on Asian Language Processing.

[11]  Mohammad Sadegh Rasooli,et al.  Development of a Persian Syntactic Dependency Treebank , 2013, NAACL 2013.

[12]  Brendan T. O'Connor,et al.  Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics , 2011 .

[13]  Mirella Lapata,et al.  Unsupervised Semantic Role Induction via Split-Merge Clustering , 2011, ACL.

[14]  Suzanne Stevenson,et al.  Unsupervised Semantic Role Labelling , 2004, EMNLP.

[15]  Timothy Baldwin,et al.  Multiword Expressions: A Pain in the Neck for NLP , 2002, CICLing.

[16]  Mirella Lapata,et al.  Unsupervised Semantic Role Induction with Graph Partitioning , 2011, EMNLP.

[17]  Ann Bies,et al.  A Pilot Arabic Propbank , 2008, LREC.

[18]  Mirella Lapata,et al.  Unsupervised Induction of Semantic Roles , 2010, HLT-NAACL.

[19]  Gholamhossein Karimi-Doostan,et al.  Separability of light verb constructions in Persian , 2011 .

[20]  Ivan Titov,et al.  Semantic Role Labeling , 2010, HLT-NAACL.

[21]  Reid Swanson,et al.  Generalizing semantic role annotations across syntactically similar verbs , 2007, ACL.

[22]  Zahra Abolhassani Chime.  An Account for Compound Prepositions in Farsi , 2006, ACL.

[23]  Mojgan Seraji,et al.  Bootstrapping a Persian Dependency Treebank , 2012 .

[24]  Ivan Titov,et al.  A Bayesian Approach to Unsupervised Semantic Role Induction , 2012, EACL.

[25]  Wayne H. Ward,et al.  Towards Robust Semantic Role Labeling , 2007, CL.

[26]  Nikhil Garg,et al.  Unsupervised Semantic Role Induction with Global Role Ordering , 2012, ACL.

[27]  Dan Roth,et al.  The Importance of Syntactic Parsing and Inference in Semantic Role Labeling , 2008, CL.

[28]  Timothy Baldwin,et al.  Prepositions in Applications: A Survey and Introduction to the Special Issue , 2009, CL.

[29]  Lonneke van der Plas,et al.  Scaling up Automatic Cross-Lingual Semantic Role Annotation , 2011, ACL.

[30]  Christophe Cerisara,et al.  Unsupervised frame based Semantic Role Induction: application to French and English , 2012, SPMRL@ACL 2012.

[31]  Mehrnoush Shamsfard,et al.  Thematic Role Extraction Using Shallow Parsing , 2008 .

[32]  Martha Palmer,et al.  Class-Based Construction of a Verb Lexicon , 2000, AAAI/IAAI.

[33]  H. Faili,et al.  Feature engineering using shallow parsing in argument classification of Persian verbs , 2012, The 16th CSI International Symposium on Artificial Intelligence and Signal Processing (AISP 2012).

[34]  Azadeh Kamel Ghalibaf,et al.  Shallow Semantic Parsing of Persian Sentences , 2009, PACLIC.

[35]  Daniel Gildea,et al.  The Proposition Bank: An Annotated Corpus of Semantic Roles , 2005, CL.

[36]  Richard Johansson,et al.  The CoNLL-2009 Shared Task: Syntactic and Semantic Dependencies in Multiple Languages , 2009, CoNLL Shared Task.

[37]  Mirella Lapata,et al.  Semi-Supervised Semantic Role Labeling via Structural Alignment , 2012, CL.

[38]  Kadri Hacioglu,et al.  Semantic Role Labeling Using Dependency Trees , 2004, COLING.

[39]  Behrouz Minaei-Bidgoli,et al.  An Empirical Study on the Effect of Morphological and Lexical Features in Persian Dependency Parsing , 2013, SPMRL@EMNLP.

[40]  Heshaam Faili,et al.  Unsupervised Identification of Persian Compound Verbs , 2011, MICAI.

[41]  Martha Palmer,et al.  Semantic Mapping Using Automatic Word Alignment and Semantic Role Labeling , 2011, SSST@ACL.

[42]  Joakim Nivre,et al.  MaltParser: A Language-Independent System for Data-Driven Dependency Parsing , 2007, Natural Language Engineering.

[43]  Richard Johansson,et al.  A FrameNet-Based Semantic Role Labeler for Swedish , 2006, ACL.