Paper information - Mixed method for extraction of domain terminology from text: Linguistic and statistical filtering

Extracting the terminology that identifies a specific domain is an indispensable task in information extraction from text. In this work we propose a hybrid method for extracting complex terms from Arabic texts that combines linguistic and statistical approaches. The method relies on a deep linguistic and morphosyntactic analysis of the Arabic language to introduce a linguistic filtering algorithm for complex terms.

Abdelaziz Marzak | El Habib Ben Lahmar | El Khadir Lamrani | Hammad Ballaoui
