Paper information - Some Experiments in Humour Recognition Using the Italian Wikiquote Collection

Some Experiments in Humour Recognition Using the Italian Wikiquote Collection

In this paper we present some results obtained in humour classification over a corpus of Italian quotations manually extracted and tagged from the Wikiquote project. The experiments were carried out using both a multinomial Naive Bayes classifier and a Support Vector Machine (SVM). The considered features range from single words to n-grams and sentence length. The obtained results show that it is possible to identify the funny quotes even with the simplest features (bag of words); the Bayesian classifier performed better than the SVM. However, the corpus is too small to support definitive assertions.
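The bag-of-words setup mentioned in the abstract can be illustrated with a minimal sketch of a multinomial Naive Bayes classifier. This is not the authors' implementation: the whitespace tokenizer, the add-one smoothing value, the function names, and the tiny English training set are all assumptions made up for illustration, and the n-gram and sentence-length features are left out.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model on bag-of-words counts.

    alpha is the additive (Laplace) smoothing constant, an assumed default.
    """
    word_counts = defaultdict(Counter)  # label -> word -> count
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)

    n_docs = len(docs)
    log_prior = {
        label: math.log(count / n_docs)
        for label, count in Counter(labels).items()
    }
    totals = {label: sum(counts.values()) for label, counts in word_counts.items()}
    return {
        "alpha": alpha,
        "vocab": vocab,
        "log_prior": log_prior,
        "word_counts": word_counts,
        "totals": totals,
    }


def predict(model, text):
    """Return the label with the highest posterior log-probability."""
    alpha = model["alpha"]
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, log_prior in model["log_prior"].items():
        score = log_prior
        for token in tokenize(text):
            if token not in model["vocab"]:
                continue  # words never seen in training are ignored
            count = model["word_counts"][label][token]
            score += math.log(
                (count + alpha) / (model["totals"][label] + alpha * vocab_size)
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, invented for this sketch (not from the Wikiquote corpus).
train_docs = [
    "i told a joke and everyone laughed",
    "the joke was silly and funny",
    "history teaches patience and wisdom",
    "wisdom comes with patience and time",
]
train_labels = ["funny", "funny", "serious", "serious"]

model = train_multinomial_nb(train_docs, train_labels)
print(predict(model, "a silly joke"))
print(predict(model, "patience and wisdom"))
```

With a real corpus, the same counting step could be extended to word n-grams by emitting tuples of adjacent tokens from `tokenize`; that extension is likewise an assumption and is not described in the abstract beyond the mention of n-grams.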

Paolo Rosso | Davide Buscaldi

[1] José Francisco Martínez-Trinidad, et al. Progress in Pattern Recognition, Image Analysis and Applications, 12th Iberoamerican Congress on Pattern Recognition, CIARP 2007, Valparaiso, Chile, November 13-16, 2007, Proceedings, 2008, CIARP.

[2] Carlo Strapparava, et al. Technologies That Make You Smile: Adding Humor to Text-Based Applications, 2006, IEEE Intelligent Systems.

[3] Paolo Rosso, et al. Authorship Attribution Using Word Sequences, 2006, CIARP.

[4] Edward De Bono, et al. I Am Right You Are Wrong: From This to the New Renaissance: From Rock Logic to Water Logic, 1990.

[5] Ian Witten, et al. Data Mining, 2000.

[6] Kazuhisa Miwa, et al. Computational Laughing: Automatic Recognition of Humorous One-Liners, 2005.

[7] Ian H. Witten, et al. Data mining: practical machine learning tools and techniques, 3rd Edition, 1999.

[8] Thorsten Joachims, et al. Text Categorization with Support Vector Machines: Learning with Many Relevant Features, 1998, ECML.