Authorship Attribution in Arabic Poetry Context Using Markov Chain Classifier

In this paper, we treat Arabic poetry as an authorship attribution task. Several features, namely characters, sentence length, word length, rhyme, and the first word in a sentence, are used as input data for Markov chain methods. The data is filtered by removing the punctuation and alphanumeric marks that were present in the original text. The experimental data set was divided into two groups: a training set with known authors and a test set with unknown authors. The experiment used a set of thirty-three poets from different eras and achieved a classification precision of 96.96%.
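As an illustration of the general idea, a Markov chain classifier over the character feature might be sketched as follows. The abstract does not state the chain order, the smoothing scheme, or the implementation, so the first-order chain, the add-one smoothing, and every name here (`train_chain`, `log_likelihood`, `attribute`) are assumptions made for the sketch, not the authors' method.

```python
import math
from collections import defaultdict


def train_chain(text):
    """Count character-to-character transitions in one author's training text."""
    counts = defaultdict(lambda: defaultdict(int))
    for prev, curr in zip(text, text[1:]):
        counts[prev][curr] += 1
    return counts


def log_likelihood(text, counts, alphabet_size):
    """Log-probability of `text` under a first-order chain.

    Add-one smoothing (an assumption of this sketch) keeps unseen
    transitions from producing a zero probability.
    """
    total = 0.0
    for prev, curr in zip(text, text[1:]):
        row = counts.get(prev, {})
        row_total = sum(row.values())
        total += math.log((row.get(curr, 0) + 1) / (row_total + alphabet_size))
    return total


def attribute(text, models, alphabet_size):
    """Return the author whose chain assigns `text` the highest likelihood."""
    return max(models, key=lambda a: log_likelihood(text, models[a], alphabet_size))


# Toy placeholder data, not taken from the paper's corpus.
training = {"poet_a": "abababababab", "poet_b": "aabbaabbaabb"}
alphabet_size = len(set("".join(training.values())))
models = {author: train_chain(text) for author, text in training.items()}
predicted = attribute("ababab", models, alphabet_size)
```

The same scoring scheme could in principle be applied to the other features, for example a chain over word lengths instead of characters, with the per-feature scores then combined; how the paper combines them is not stated in the abstract.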
