UniNE at CLEF 2015 Author Profiling: Notebook for PAN at CLEF 2015

This paper describes and evaluates an effective author profiling model called SPATIUM-L1. The suggested strategy can be readily adapted to different languages (such as Dutch, English, Italian, and Spanish) when profiling the authors of tweets. As features, we suggest using the 200 most frequent terms of the query text (isolated words and punctuation symbols). Applying a simple distance measure and looking at the three nearest neighbors, we can determine the gender (a nominal value, either male or female), the age group (an ordinal value among 18-24, 25-34, 35-49, and >50), and the Big Five personality traits (extraversion, neuroticism, agreeableness, conscientiousness, and openness, each measured on an interval scale containing eleven items). Evaluations are based on four test collections (PAN AUTHOR PROFILING task at CLEF 2015).
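The strategy summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made for illustration only: the regular-expression tokenizer, the use of relative frequencies, the L1 (Manhattan) form of the distance (suggested by the model's name), the majority vote over the three nearest neighbors, and the hypothetical function names `tokenize`, `l1_distance`, and `predict`. The example covers a nominal attribute such as gender; the ordinal and interval attributes would need an aggregation rule that the abstract does not specify.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumption: terms are lowercased words and isolated punctuation symbols.
    return re.findall(r"\w+|[^\w\s]", text.lower())


def relative_freqs(tokens, terms):
    # Relative frequency of each selected term in the token list.
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty texts
    return {t: counts[t] / total for t in terms}


def l1_distance(query_text, profile_text, k_terms=200):
    # Features: the k_terms most frequent terms of the QUERY text.
    q_tokens = tokenize(query_text)
    p_tokens = tokenize(profile_text)
    terms = [t for t, _ in Counter(q_tokens).most_common(k_terms)]
    q = relative_freqs(q_tokens, terms)
    p = relative_freqs(p_tokens, terms)
    # Assumed L1 (Manhattan) distance between the two frequency vectors.
    return sum(abs(q[t] - p[t]) for t in terms)


def predict(query_text, labeled_profiles, k=3):
    # labeled_profiles: list of (text, label) pairs from the training data.
    ranked = sorted(labeled_profiles,
                    key=lambda item: l1_distance(query_text, item[0]))
    # Assumption: majority vote among the k nearest neighbors.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy data, invented purely to show the call pattern.
    profiles = [
        ("the the the cat , ,", "male"),
        ("the the cat cat , .", "male"),
        ("oh ! oh ! wow !", "female"),
        ("wow ! wow ! oh oh", "female"),
        ("yes yes no no", "female"),
    ]
    print(predict("the cat , the", profiles))
```

Selecting the terms from the query text, rather than from the whole collection, means the feature set changes with each author to be profiled, so no global vocabulary has to be fixed per language.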