Authorship Verification with neural networks via stylometric feature concatenation

The authorship verification task at PAN at CLEF 2021 asks whether the two texts in a pair were written by the same author or by two different authors. Our work extracts two stylometric features from the texts: character-level n-grams and the use of punctuation marks. We train a separate neural network on each feature and then combine the two in a final neural network that makes the classification decision.
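As a rough illustration of the two feature types, the following minimal sketch builds a character n-gram profile and a punctuation profile for each text and concatenates their differences into one vector per text pair. The abstract does not state the n-gram size, the normalisation, the pair representation, or the network architecture, so the trigram default, the relative frequencies, the absolute-difference representation, and all function names here are assumptions for illustration only. The neural networks themselves are omitted.

```python
from collections import Counter
import string


def char_ngram_profile(text, n=3):
    """Relative frequency of each character n-gram in the text (n=3 is an assumption)."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    total = len(grams)
    counts = Counter(grams)
    return {g: c / total for g, c in counts.items()} if total else {}


def punctuation_profile(text):
    """Frequency of each ASCII punctuation mark, relative to text length."""
    length = len(text) or 1  # avoid division by zero on empty input
    counts = Counter(ch for ch in text if ch in string.punctuation)
    return [counts[p] / length for p in string.punctuation]


def pair_features(text_a, text_b, vocabulary, n=3):
    """Concatenate absolute differences of both feature types for a text pair.

    `vocabulary` is a hypothetical fixed list of n-grams chosen beforehand,
    for example the most frequent n-grams in a training corpus.
    """
    prof_a = char_ngram_profile(text_a, n)
    prof_b = char_ngram_profile(text_b, n)
    ngram_diff = [abs(prof_a.get(g, 0.0) - prof_b.get(g, 0.0)) for g in vocabulary]
    punct_diff = [
        abs(a - b)
        for a, b in zip(punctuation_profile(text_a), punctuation_profile(text_b))
    ]
    return ngram_diff + punct_diff


if __name__ == "__main__":
    vocab = ["the", "he ", "ing"]
    vector = pair_features("the cat, the dog!", "Running; jumping...", vocab)
    print(len(vector))  # one entry per vocabulary n-gram plus one per punctuation mark
```

In a setup like the one described, the n-gram part and the punctuation part of such a vector could each feed its own network, with a final network combining the two outputs; how the original system does this in detail is not specified in the abstract.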
