ScoreFinder: A method for collaborative quality inference on user-generated content

User-generated content is quickly becoming one of the largest sources of information on the World Wide Web. Shared content items are initially unconfirmed in the sense that their credibility has not yet been established. Conventional, centralized confirmation of credibility is infeasible at Internet scale, so relying on the annotators themselves to evaluate each item is essential. However, users usually differ in their opinions of the same item, and the presence of bias, variance and malicious behaviour makes aggregating those opinions difficult. To address this problem, we propose an Author-Annotator model with an iterative algorithm, called ScoreFinder, for inferring credibility by ranking shared items. To reduce the influence of the various error sources, we identify reliable users for each topic and adaptively aggregate their scores. Moreover, we transform users' input to remove errors and anomalies, using patterns of misbehaviour learned from a real data set. We evaluate our algorithm on both real and synthetic data sets, and it achieves a significant improvement in our experiments.
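
To make the aggregation idea concrete, the sketch below (Python with NumPy) shows one generic way an iterative scheme can alternate between estimating item scores from reliability-weighted user ratings and re-estimating each user's reliability from their agreement with those scores. The function name, update rules, and normalisation are illustrative assumptions for exposition only and are not taken from the published ScoreFinder algorithm.

```python
import numpy as np

def iterative_score_aggregation(ratings, n_iters=20, eps=1e-8):
    """Illustrative iterative aggregation of user ratings.

    ratings: 2-D array of shape (n_users, n_items); np.nan marks
             items a user has not rated. All update rules here are
             a generic sketch, not the ScoreFinder method itself.
    """
    mask = ~np.isnan(ratings)                 # which (user, item) pairs were rated
    filled = np.where(mask, ratings, 0.0)

    n_users, _ = ratings.shape
    reliability = np.ones(n_users)            # start with equal trust in every user

    for _ in range(n_iters):
        # Item scores: reliability-weighted mean of the observed ratings
        w = reliability[:, None] * mask
        item_scores = (w * filled).sum(axis=0) / (w.sum(axis=0) + eps)

        # User reliability: shrinks with the user's average squared
        # disagreement from the current item-score estimates
        err = (filled - item_scores[None, :]) ** 2
        mse = (err * mask).sum(axis=1) / (mask.sum(axis=1) + eps)
        reliability = 1.0 / (mse + eps)
        reliability /= reliability.max()      # normalise to (0, 1]

    return item_scores, reliability


if __name__ == "__main__":
    # Toy example: the third user systematically disagrees with the others
    R = np.array([[4.0, 5.0, np.nan],
                  [4.0, 5.0, 2.0],
                  [1.0, 1.0, 5.0]])
    scores, rel = iterative_score_aggregation(R)
    print("item scores:", scores)
    print("user reliability:", rel)
```

In this toy run the dissenting user receives a low reliability weight, so the final item scores are dominated by the two users who agree; this is the general effect the adaptive aggregation in the abstract refers to, though the actual model and updates are defined in the paper.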