Human Judgment on Humor Expressions in a Community-Based Question-Answering Service

Understanding humorous dialogue requires a collection of humorous expressions, and such a collection is useful as a language resource only if the expressions are annotated. In this paper, we analyze how human assessors annotate humorous expressions extracted from an online community-based question-answering (CQA) corpus, which contains many interesting examples of humorous communication. Twenty-eight annotators rated the collected expressions for degree of humor and assigned them to humor categories. The assessments were quite subjective, and only marginal inter-annotator agreement was observed. This result suggests that the variability in humor annotations is not noise resulting from erroneous assessment but is rooted in personality differences among the annotators. Properly utilizing such resources would therefore require accounting for individual differences in humor perception. We also discuss the possibility of improving the collection process by applying filtering techniques.
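
The abstract does not say which agreement statistic was used. As an illustration only, the sketch below computes Fleiss' kappa, one common choice when many annotators assign each item to one of several categories. The function name `fleiss_kappa` and the count-matrix layout are assumptions made for this example and are not taken from the paper.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a matrix of category counts.

    counts[i][j] is the number of annotators who assigned item i to
    category j. Every item must be rated by the same number of
    annotators (hypothetical input layout chosen for this sketch).
    """
    n_items = len(counts)
    if n_items == 0:
        raise ValueError("at least one item is required")
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("at least two annotators per item are required")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number of ratings")
    n_categories = len(counts[0])

    # Observed agreement: for each item, the share of annotator pairs
    # that chose the same category, averaged over items.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement: sum of squared overall category proportions.
    total = n_items * n_raters
    proportions = [
        sum(row[j] for row in counts) / total for j in range(n_categories)
    ]
    p_expected = sum(p * p for p in proportions)

    if p_expected == 1.0:
        # Every rating fell into a single category; agreement is trivial.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)
```

Values near 1 indicate strong agreement, and values near 0 indicate agreement close to what chance alone would produce. For a study like this one, each row would hold 28 ratings for one expression, spread across the humor categories.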
