Investigating Comparative Evaluation for Large Data

Evaluation is ubiquitous. We often need to evaluate a set of target entities and obtain their true ratings (the average ratings over the whole population) or true rankings (rankings derived from the true ratings). By the law of large numbers, average ratings computed from large samples provide a good approximation. In practice, however, because evaluation is labor-intensive, large-scale evaluation data are typically sparse, with each entity receiving only a few ratings. Consequently, observed average ratings can differ significantly from the true ratings, owing to the biased distribution of evaluators' standards and preferences. In this paper, we investigate comparative evaluation, which addresses this evaluation bias problem to improve evaluation accuracy. The principal idea is to first extract a partial list over the entities rated by each evaluator, and then aggregate all the partial lists into a total list that closely approximates the true ranking. The aggregated total list can in turn be used to estimate the true ratings. We also study the associated problem of evaluation assignment (assigning target entities to evaluators), for which we propose an iterative assignment approach that maximizes the accuracy of comparative evaluation under limited evaluation resources.
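
To make the pipeline concrete, below is a minimal Python sketch of the two-step idea: each evaluator's ratings are converted into a partial ranking (which is invariant to that evaluator's personal rating scale), and the partial rankings are then aggregated into a total list. The normalized Borda-style aggregation shown here is an illustrative choice, not necessarily the aggregation method proposed in the paper.

```python
from collections import defaultdict

def extract_partial_lists(ratings):
    """For each evaluator, sort the entities they rated by their own
    scores, producing a partial ranking (best first). Using ranks
    instead of raw scores removes each evaluator's individual bias
    in rating scale (harsh vs. lenient)."""
    return [sorted(scores, key=scores.get, reverse=True)
            for scores in ratings.values()]

def aggregate_borda(partial_lists):
    """Aggregate partial lists into a total list via a Borda-style
    count, normalized by list length so that evaluators who rated
    more entities do not dominate. (Illustrative assumption.)"""
    score = defaultdict(float)
    seen = defaultdict(int)
    for plist in partial_lists:
        n = len(plist)
        for pos, entity in enumerate(plist):
            # Entities nearer the top of a partial list earn more credit.
            score[entity] += (n - 1 - pos) / max(n - 1, 1)
            seen[entity] += 1
    # Rank by average credit per appearance; ties broken by entity id.
    return sorted(score, key=lambda e: (-score[e] / seen[e], e))

# Toy example: a lenient and a harsh evaluator disagree on absolute
# scores, but their extracted partial rankings aggregate cleanly.
ratings = {
    "alice": {"A": 5, "B": 3},  # lenient scale
    "bob":   {"B": 2, "C": 1},  # harsh scale
    "carol": {"A": 4, "C": 2},
}
print(aggregate_borda(extract_partial_lists(ratings)))  # -> ['A', 'B', 'C']
```

The resulting total list could then feed the rating-estimation step mentioned above, e.g., by fitting scores consistent with the aggregate order; that fitting procedure is not specified here and would be an assumption.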