Surveying Stylometry Techniques and Applications

Stylometry, the analysis of authorial style, rests on the assumption that style can be measured quantitatively and that such measurements capture an author's distinctive qualities. Over the past 200 years, stylometry research has produced a range of methods and tools for handling challenging cases. This survey reviews articles within five prominent subtasks: authorship attribution, authorship verification, authorship profiling, stylochronometry, and adversarial stylometry. It discusses datasets, features, experimental techniques, and recent approaches. A current research challenge is that authorship analysis techniques do not scale well to a large number of authors with few text samples. To investigate this, we perform an extensive performance analysis on a corpus of 1,000 authors, covering authorship attribution, verification, and clustering with 14 algorithms from the literature. Finally, we discuss several remaining research challenges and describe open-source and commercial software that may be useful for stylometry subtasks.
