Mixed method for extraction of domain terminology from text: Linguistic and statistical filtering

Extracting the terminology of a specific domain is an indispensable task in extracting information from text. In this work we propose a hybrid method for extracting complex terms from Arabic texts that combines linguistic and statistical approaches. The method relies on a deep linguistic and morphosyntactic analysis of the Arabic language to introduce a linguistic filtering algorithm for complex terms.
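The two-stage idea of a linguistic filter followed by a statistical filter can be illustrated with a minimal sketch. It is not the algorithm of this work: the part-of-speech patterns, the tag names, the function names, and the choice of pointwise mutual information as the statistical score are all assumptions made for illustration, and the input is assumed to be already tokenized and tagged.

```python
import math
from collections import Counter

# Hypothetical POS patterns for two-word candidate terms (an assumption,
# not the pattern set used in the paper).
PATTERNS = {("NOUN", "NOUN"), ("NOUN", "ADJ")}


def linguistic_filter(tagged):
    """Keep adjacent word pairs whose POS tags match an allowed pattern."""
    return [
        (w1, w2)
        for (w1, t1), (w2, t2) in zip(tagged, tagged[1:])
        if (t1, t2) in PATTERNS
    ]


def pmi_scores(tagged, candidates):
    """Score each distinct candidate pair with pointwise mutual information."""
    words = [w for w, _ in tagged]
    n = len(words)
    unigrams = Counter(words)
    bigrams = Counter(zip(words, words[1:]))
    scores = {}
    for w1, w2 in set(candidates):
        p_xy = bigrams[(w1, w2)] / (n - 1)  # n - 1 adjacent pairs in total
        p_x = unigrams[w1] / n
        p_y = unigrams[w2] / n
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


def extract_terms(tagged, threshold=0.0):
    """Linguistic filtering first, then statistical filtering by PMI."""
    candidates = linguistic_filter(tagged)
    scores = pmi_scores(tagged, candidates)
    kept = [pair for pair, score in scores.items() if score >= threshold]
    return sorted(kept, key=lambda pair: -scores[pair])
```

Calling `extract_terms` on a list of `(word, tag)` pairs returns the candidate pairs that pass both filters, ordered by decreasing score. The threshold of 0.0 is an arbitrary default and would need tuning on a real corpus.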
