Crowdsourced Semantic Matching of Multi-Label Annotations

Most multi-label domains lack an authoritative taxonomy. As a result, several taxonomies are commonly used within the same domain, and annotations made under one taxonomy cannot be used directly under another. Although this situation arises frequently, it has rarely been studied with a principled statistical approach. We start from two observations: (1) different taxonomies used in the same domain are generally founded on the same latent semantic space, in which each possible label set of a taxonomy denotes a single semantic concept, and (2) crowdsourcing can identify relationships between semantic concepts and instances at low cost. Based on these, we propose a novel probabilistic cascaded method that, in a crowdsourcing setting, establishes a semantic matching function mapping label sets in one (source) taxonomy to label sets in another (target) taxonomy according to the semantic distances between them. Once established, the function yields the target-taxonomy label set for an instance directly from its source-taxonomy label set, without additional annotation effort. Experimental results on real-world data (emotion annotations for narrative sentences) demonstrate that the proposed method robustly establishes semantic matching functions with satisfactory performance from a limited number of crowdsourced annotations.
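To make the idea of a distance-based matching function concrete, the following is a minimal Python sketch. It is not the probabilistic cascaded method proposed here: as a simple stand-in, it represents each label set by the set of instances that crowd workers attached it to, and it uses the Jaccard distance between those instance sets as an assumed proxy for semantic distance. The names `build_matching` and `jaccard_distance`, and the toy emotion labels, are hypothetical.

```python
from collections import defaultdict


def jaccard_distance(a, b):
    """Jaccard distance between two sets (0.0 when both are empty)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def build_matching(source_annotations, target_annotations):
    """Map each source label set to the closest target label set.

    Both arguments are iterables of (instance_id, label_set) pairs, one
    pair per crowdsourced annotation. Assumption: the distance between two
    label sets is the Jaccard distance between the sets of instances they
    were attached to.
    """
    source_instances = defaultdict(set)
    target_instances = defaultdict(set)
    for instance_id, labels in source_annotations:
        source_instances[frozenset(labels)].add(instance_id)
    for instance_id, labels in target_annotations:
        target_instances[frozenset(labels)].add(instance_id)

    matching = {}
    for src, src_ids in source_instances.items():
        # Break ties deterministically on the sorted label names.
        matching[src] = min(
            target_instances,
            key=lambda tgt: (
                jaccard_distance(src_ids, target_instances[tgt]),
                sorted(tgt),
            ),
        )
    return matching


# Toy annotations of three sentences under two hypothetical taxonomies.
source = [("s1", {"joy"}), ("s2", {"joy"}), ("s3", {"sad", "fear"})]
target = [("s1", {"happiness"}), ("s2", {"happiness"}), ("s3", {"distress"})]
match = build_matching(source, target)
```

With such a mapping in hand, a new instance annotated only in the source taxonomy can be assigned a target label set by a dictionary lookup, for example `match[frozenset({"joy"})]`. A probabilistic method would additionally model annotator noise, which this sketch ignores.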