NATURAL LANGUAGE PROCESSING BASED INFORMATION RETRIEVAL FOR THE PURPOSE OF AUTHOR IDENTIFICATION

With the increasingly widespread use of computers and the internet, large amounts of information are becoming available on the web. Automatic information processing and retrieval are therefore urgently needed. In this paper, a new approach to automatic authorship identification for real-world (unrestricted) text is presented. The approach identifies the author using initial character N-grams.
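The abstract does not define the feature or the classifier in detail, so the following is only a minimal sketch under stated assumptions. It assumes that an "initial character N-gram" is the first n characters of each word, and it compares authors with a simple rank-distance between frequency profiles, a technique swapped in here purely for illustration. All function names and the parameters `n` and `top_k` are hypothetical.

```python
import re
from collections import Counter


def initial_char_ngrams(text, n=3):
    """Return the first n characters of every word that has at least n letters.

    Assumption: this is what "initial character N-gram" means; the abstract
    does not spell out the definition.
    """
    words = re.findall(r"[a-z]+", text.lower())
    return [w[:n] for w in words if len(w) >= n]


def profile(text, n=3, top_k=100):
    """Rank the top_k most frequent initial n-grams of a text."""
    counts = Counter(initial_char_ngrams(text, n))
    return [gram for gram, _ in counts.most_common(top_k)]


def rank_distance(profile_a, profile_b):
    """Sum of rank differences; n-grams missing from profile_b get a fixed penalty.

    Illustrative comparison only, not necessarily the paper's classifier.
    """
    rank_b = {gram: i for i, gram in enumerate(profile_b)}
    penalty = len(profile_b)
    return sum(
        abs(i - rank_b[gram]) if gram in rank_b else penalty
        for i, gram in enumerate(profile_a)
    )


def identify_author(unknown_text, known_texts, n=3, top_k=100):
    """Return the author whose profile is closest to the unknown text's profile.

    known_texts maps an author name to a sample of that author's writing.
    """
    unknown = profile(unknown_text, n, top_k)
    return min(
        known_texts,
        key=lambda author: rank_distance(
            unknown, profile(known_texts[author], n, top_k)
        ),
    )
```

For example, `identify_author(disputed_text, {"author_a": sample_a, "author_b": sample_b})` would return the name whose sample shares the most similarly ranked word-initial n-grams with `disputed_text`. Restricting n-grams to word beginnings keeps the feature set much smaller than using all character n-grams, which is one plausible motivation for the approach.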
