A Prototype for Authorship Attribution Studies

Despite a century of research, statistical and computational methods for authorship attribution are neither reliable, well-regarded, widely used, nor well understood. This article surveys the current state of the art and presents a framework for the uniform and unified development of a tool that applies it, despite the wide variety of methods and techniques in use. The usefulness of the framework is confirmed by the development of a tool, built on that framework, that researchers without a computing specialization can apply to authorship analysis. Using this tool, it may be possible both to expand the pool of available researchers and to enhance the quality of the overall solutions [for example, by incorporating improved algorithms as discovered through empirical analysis (Juola, P. (2004a). Ad-hoc Authorship Attribution Competition. In Proceedings 2004 Joint International Conference of the Association for Literary and Linguistic Computing and the Association for Computers and the Humanities (ALLC/ACH 2004), Goteborg, Sweden)].
