ABI Neural Ensemble Model for Gender Prediction

We present our system for the CLIN29 shared task on cross-genre gender detection for Dutch. We experimented with a range of neural models (CNN, RNN, LSTM, etc.), more “traditional” non-neural models (SVM, RF, LogReg, etc.), different feature sets, and different data pre-processing steps. The results suggest that tokenized, non-lowercased data works best for most of the neural models, while a combination of word clusters, character trigrams and word lists proved most beneficial for the majority of the traditional models, beating features used in previous tasks such as n-grams, character n-grams, part-of-speech tags and combinations thereof. In contrast to the results reported in previous comparable shared tasks, our neural models outperformed our best traditional approaches, even when the latter used our best feature set-up. Our final model, a weighted ensemble combining the top 25 models, won both the in-domain gender prediction task and the cross-genre challenge, achieving an average accuracy of 64.93% on in-domain gender prediction and 56.26% on cross-genre gender prediction.
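As a rough illustration of what a weighted ensemble over the top-ranked models can look like, the following is a minimal sketch and not the system's actual implementation. It assumes that each model outputs a probability for one of the two classes, that models are ranked by a development-set score, and that the same scores serve as ensemble weights; the abstract does not specify how models were ranked or weighted. The function names `top_k_indices` and `weighted_ensemble` are hypothetical.

```python
from typing import List, Sequence


def top_k_indices(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring models (assumed ranking rule)."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def weighted_ensemble(probs: Sequence[Sequence[float]],
                      weights: Sequence[float]) -> List[int]:
    """Weighted soft voting for a binary task.

    probs[m][i] is model m's predicted probability of class 1 for document i.
    weights[m] is the (assumed) weight of model m, e.g. its development accuracy.
    Returns one 0/1 label per document.
    """
    if len(probs) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    labels = []
    for i in range(len(probs[0])):
        # Weighted average of the models' class-1 probabilities for document i.
        avg = sum(w * p[i] for w, p in zip(weights, probs)) / total
        labels.append(1 if avg >= 0.5 else 0)
    return labels


# Toy example: four models, three documents; keep the top three models.
dev_scores = [0.65, 0.60, 0.55, 0.50]
all_probs = [
    [0.9, 0.4, 0.2],
    [0.6, 0.7, 0.4],
    [0.3, 0.6, 0.1],
    [0.1, 0.1, 0.9],
]
keep = top_k_indices(dev_scores, k=3)
predictions = weighted_ensemble([all_probs[m] for m in keep],
                                [dev_scores[m] for m in keep])
```

In the described system the selection would keep 25 models instead of three. Averaging probabilities (soft voting) is only one option; weighted majority voting over hard labels would follow the same pattern.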
