Two Attempts to Predict Author Gender in Cross-Genre Settings in Dutch

This paper describes the systems designed by the Fraunhofer IAIS team for the CLIN29 shared task on cross-genre gender detection in Dutch. We present two alternative classification approaches: a fairly standard one that combines feature engineering with a random forest classifier, and one based on an LSTM classifier. Both are enhanced by an LDA model trained on stems. We considered various features, including the frequency of function words, parts of speech, and sentiment. We achieved an average accuracy of 53.77% in the cross-genre settings.
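As an illustration of one of the feature types mentioned above, the following minimal sketch computes relative function-word frequencies for a Dutch text. It is not the authors' implementation: the word list `FUNCTION_WORDS`, the function name `function_word_frequencies`, and the simple regular-expression tokenizer are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny list of Dutch function words (an assumption
# for illustration; the list used by the described systems is not given here).
FUNCTION_WORDS = ("de", "het", "een", "en", "van", "ik", "je", "dat", "niet", "is")


def function_word_frequencies(text, vocabulary=FUNCTION_WORDS):
    """Return the relative frequency of each function word in `text`.

    The result is a list with one value per entry in `vocabulary`,
    each value being count(word) / total number of tokens.
    """
    # Naive tokenization: lowercase the text and split on non-word characters.
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    if total == 0:
        # Empty input: return an all-zero vector instead of dividing by zero.
        return [0.0] * len(vocabulary)
    return [counts[word] / total for word in vocabulary]


features = function_word_frequencies(
    "Ik denk dat het niet de beste film van het jaar is."
)
```

Vectors of this kind, one per document, could then be combined with other features and passed to a classifier such as scikit-learn's `RandomForestClassifier`; that step is omitted here to keep the sketch dependency-free.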