Some Experiments in Humour Recognition Using the Italian Wikiquote Collection

In this paper we present results on humour classification over a corpus of Italian quotations manually extracted from the Wikiquote project and manually tagged. The experiments were carried out using both a multinomial Naive Bayes classifier and a Support Vector Machine (SVM). The features considered range from single words to n-grams and sentence length. The results show that funny quotes can be identified even with the simplest features (bag of words), and that the Bayesian classifier performed better than the SVM. However, the corpus is too small to support definitive conclusions.
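
To make the simplest configuration concrete, the following is a minimal sketch of a multinomial Naive Bayes classifier over bag-of-words features, with optional word n-grams. It is not the implementation used in the experiments. The names `extract_features` and `MultinomialNB`, the whitespace tokenisation, and the add-one (Laplace) smoothing are assumptions made for illustration. The training sentences are invented English toy examples, not items from the Wikiquote corpus.

```python
import math
from collections import Counter, defaultdict


def extract_features(text, max_n=1):
    """Return word n-grams (n = 1..max_n) of a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    feats = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats.append(" ".join(tokens[i:i + n]))
    return feats


class MultinomialNB:
    """Hypothetical minimal multinomial Naive Bayes with additive smoothing."""

    def __init__(self, alpha=1.0, max_n=1):
        self.alpha = alpha  # smoothing constant (assumed; 1.0 = Laplace)
        self.max_n = max_n  # 1 = bag of words, 2 = words plus bigrams, ...

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            feats = extract_features(text, self.max_n)
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def log_scores(self, text):
        """Return the unnormalised log posterior of each class for one text."""
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + self.alpha * vocab_size
            score = math.log(n_docs / total_docs)  # log prior
            for feat in extract_features(text, self.max_n):
                if feat in self.vocab:  # features unseen in training are skipped
                    score += math.log((counts[feat] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Invented toy data, for illustration only.
train_texts = [
    "my dog ate my homework and asked for seconds",
    "i told my cat a joke and she laughed",
    "the joke was so bad the dog left",
    "knowledge is the beginning of wisdom",
    "time reveals the truth of all things",
    "wisdom begins with patience and silence",
]
train_labels = ["funny", "funny", "funny", "serious", "serious", "serious"]

model = MultinomialNB(alpha=1.0, max_n=1).fit(train_texts, train_labels)
print(model.predict("the cat told a joke"))
```

Setting `max_n=2` adds word bigrams to the same feature space. Sentence length, the other feature mentioned above, is not modelled in this sketch.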