An Ensemble Learning Based Author Identification System

Author identification is an emerging area of Natural Language Processing (NLP) concerned with identifying the author of a given piece of text. Every author has distinctive writing characteristics, such as a signature way of using specific terms, that make their work recognizable, and there is a long history of linguistic and stylistic analysis in author identification. In this paper, we aim to build a content-resemblance-based author identification system (AIS) that uses an ensemble learning model to identify the author of a given piece of text. The system was evaluated on approximately 6,000 passages from 26 authors drawn from Bangla literature. Experimental results show that the proposed technique performs better than the state-of-the-art methods.
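The abstract does not say which base learners or features the ensemble combines, so the following is only a minimal sketch of the general idea, not the system evaluated in the paper. It assumes three hypothetical base classifiers, each comparing a passage to per-author frequency profiles by cosine similarity over a different feature view (words, character bigrams, character trigrams), combined by majority vote. The names `ProfileClassifier` and `ensemble_predict`, the toy English passages, and the author labels are all illustrative assumptions.

```python
from collections import Counter
from functools import partial
import math


def word_counts(text):
    """Feature view 1: whitespace-separated word frequencies."""
    return Counter(text.split())


def char_ngrams(text, n):
    """Feature views 2 and 3: character n-gram frequencies."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse frequency vectors."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ProfileClassifier:
    """Hypothetical base learner: one summed frequency profile per author."""

    def __init__(self, featurize):
        self.featurize = featurize
        self.profiles = {}

    def fit(self, texts, authors):
        for text, author in zip(texts, authors):
            self.profiles.setdefault(author, Counter()).update(self.featurize(text))
        return self

    def predict(self, text):
        feats = self.featurize(text)
        # sorted() makes tie-breaking between authors deterministic.
        return max(sorted(self.profiles), key=lambda a: cosine(feats, self.profiles[a]))


def ensemble_predict(classifiers, text):
    """Majority vote; on a tie, the label voted for first wins."""
    votes = Counter(clf.predict(text) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Toy placeholder data (assumption): not the Bangla corpus from the paper.
texts = [
    "the river flows under the old bridge",
    "the river runs past the quiet village",
    "stock prices rose sharply this morning",
    "market prices fell as stock trading slowed",
]
authors = ["author_a", "author_a", "author_b", "author_b"]

classifiers = [
    ProfileClassifier(word_counts).fit(texts, authors),
    ProfileClassifier(partial(char_ngrams, n=2)).fit(texts, authors),
    ProfileClassifier(partial(char_ngrams, n=3)).fit(texts, authors),
]

print(ensemble_predict(classifiers, "the old river near the village"))
```

Voting over several feature views is only one way to form an ensemble; the paper's own model may combine learners differently.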
