A Graph-Based Ranking Model for Automatic Keyphrases Extraction from Arabic Documents

Automatic keyphrase extraction is the task of extracting a set of phrases that relate to the main topics discussed in a document. Keyphrases have proved effective in several areas of text mining, such as information retrieval and the classification of large text collections. Because of this importance, automatic keyphrase extraction from Arabic documents has received considerable attention. For instance, the KP-Miner system was proposed to extract Arabic keyphrases, and its effectiveness was demonstrated through experimentation and comparison with other systems. In this paper, we present TextRank, a graph-based ranking model that has been used successfully in many text-processing tasks, to compute term weights from graphs of documents. Vertices represent the document's terms, and edges represent term co-occurrence within a fixed window. TextRank is an innovative unsupervised method that we have adapted to extract Arabic keyphrases, and we assess its effectiveness. The results obtained with TextRank are compared with those obtained with KP-Miner, since neither system requires a training step.
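
The graph construction and ranking described above can be illustrated with a minimal sketch. It is not the adaptation evaluated in this paper: the function names, the window size of 2, the damping factor of 0.85, and the fixed iteration count are assumptions chosen for illustration, and the Arabic-specific preprocessing (tokenization, stop-word removal, stemming) is assumed to have happened before the token list is passed in.

```python
def build_cooccurrence_graph(tokens, window=2):
    """Build an undirected term graph.

    Vertices are the distinct terms; an edge links two different terms
    that occur together inside a span of `window` consecutive tokens.
    """
    graph = {}
    for i, term in enumerate(tokens):
        graph.setdefault(term, set())
        # Terms that follow `term` within the same window.
        for other in tokens[i + 1 : i + window]:
            if other != term:
                graph[term].add(other)
                graph.setdefault(other, set()).add(term)
    return graph


def textrank(graph, damping=0.85, iterations=50):
    """Score vertices with the unweighted TextRank (PageRank-style) update.

    Each vertex receives (1 - damping) plus `damping` times the sum of its
    neighbours' scores, each divided by that neighbour's degree.
    """
    scores = {vertex: 1.0 for vertex in graph}
    for _ in range(iterations):
        updated = {}
        for vertex, neighbours in graph.items():
            rank = sum(scores[n] / len(graph[n]) for n in neighbours)
            updated[vertex] = (1 - damping) + damping * rank
        scores = updated
    return scores


if __name__ == "__main__":
    # Placeholder tokens; a real run would use preprocessed Arabic terms.
    tokens = ["graph", "ranking", "model", "graph", "keyphrase", "extraction"]
    scores = textrank(build_cooccurrence_graph(tokens, window=2))
    for term, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(f"{term}\t{score:.3f}")
```

The highest-scoring terms would then serve as candidates for assembling multi-word keyphrases; how candidates are merged and filtered for Arabic is outside the scope of this sketch.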
