ROAR: Robust Label Ranking for Social Emotion Mining

Understanding and predicting the latent emotions of users toward online content, known as social emotion mining, has become increasingly important to social platforms and businesses alike. Despite recent developments, however, very little attention has been paid to the nuance, subjectivity, and bias of social emotions. In this paper, we fill this gap by formulating social emotion mining as a robust label ranking problem, and propose: (1) a robust measure, named G-mean-rank (GMR), which sets a formal criterion consistent with practical intuition; and (2) a simple yet effective label ranking model, named ROAR, that is more robust to unbalanced datasets, which are common. Through comprehensive empirical validation on 4 real datasets and 16 benchmark semi-synthetic label ranking datasets, together with a case study, we demonstrate the superiority of our proposals over 2 popular label ranking measures and 6 competing label ranking algorithms. The datasets and implementations used in the empirical validation are available.
