Bias and Controversy in Evaluation Systems

Evaluation is prevalent in real life. With the advent of Web 2.0, online evaluation has become an important feature in many applications that involve information sharing (e.g., video, photo, and audio) and social networking (e.g., blogging). In these evaluation settings, a set of reviewers assigns scores to a set of objects. As part of the evaluation analysis, we want to obtain fair reviews for all the given objects. In reality, however, reviewers may deviate in the scores they assign to the same object, due to the potential "bias" of reviewers or "controversy" of objects. The statistical approach of averaging deviations to determine bias and controversy assumes that all reviewers and objects should be given equal weight. In this paper, we look beyond this assumption and propose an approach based on the following observations: 1) evaluation is "subjective," as reviewers and objects have varying bias and controversy, respectively, and 2) bias and controversy are mutually dependent. These observations underlie our proposed reinforcement-based model, which determines bias and controversy simultaneously. Our approach also quantifies "evidence," which reveals the degree of confidence with which bias and controversy have been derived. Experiments on real-life and synthetic data sets show that the model is effective.
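
To make the contrast between the two approaches concrete, the following is a minimal Python sketch, not the model defined in the paper. It assumes scores normalized to [0, 1] and measures deviation as the absolute difference from an object's mean score. The function names (`deviations`, `naive_estimates`, `reinforced_estimates`) and the discounting rule (a deviation counts less toward a reviewer's bias when the object is controversial, and less toward an object's controversy when the reviewer is biased) are illustrative assumptions chosen to reflect the mutual dependence described above. The "evidence" quantity is not modeled here.

```python
from collections import defaultdict


def deviations(scores):
    """Absolute deviation of each score from its object's mean score.

    `scores` maps (reviewer, obj) -> score, assumed to lie in [0, 1].
    """
    by_obj = defaultdict(list)
    for (_, obj), s in scores.items():
        by_obj[obj].append(s)
    mean = {obj: sum(v) / len(v) for obj, v in by_obj.items()}
    return {(r, o): abs(s - mean[o]) for (r, o), s in scores.items()}


def _averages(dev, weight_for):
    """Average weighted deviations per reviewer and per object."""
    per_reviewer, per_obj = defaultdict(list), defaultdict(list)
    for (r, o), d in dev.items():
        w_bias, w_contr = weight_for(r, o)
        per_reviewer[r].append(d * w_bias)
        per_obj[o].append(d * w_contr)
    bias = {r: sum(v) / len(v) for r, v in per_reviewer.items()}
    controversy = {o: sum(v) / len(v) for o, v in per_obj.items()}
    return bias, controversy


def naive_estimates(scores):
    """Baseline: plain averages, every reviewer and object weighted equally."""
    return _averages(deviations(scores), lambda r, o: (1.0, 1.0))


def reinforced_estimates(scores, iterations=50):
    """Hypothetical mutual-reinforcement iteration (an assumption, see lead-in).

    Each pass recomputes bias with deviations discounted by the current
    controversy of the object, and controversy with deviations discounted
    by the current bias of the reviewer.
    """
    dev = deviations(scores)
    bias, controversy = naive_estimates(scores)
    for _ in range(iterations):
        bias, controversy = _averages(
            dev,
            lambda r, o: (1.0 - controversy[o], 1.0 - bias[r]),
        )
    return bias, controversy
```

Calling `naive_estimates(scores)` and `reinforced_estimates(scores)` on the same dictionary returns a `(bias, controversy)` pair of dictionaries each, so the two estimates can be compared reviewer by reviewer and object by object.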
