A Statistical Approach to Calibrating the Scores of Biased Reviewers: The Linear vs. the Nonlinear Model

Two methods are proposed for aggregating the scores of reviewers in a peer-reviewing rating system. Both methods are of a statistical nature. The simpler method, which is based on a classical statistical approach from the field of linear models, uses the analysis of variance and can thus be realized by means of existing statistical software. The more advanced method, which is a slight modification of the method proposed by Roos et al. [13], uses a nonlinear model and numerical optimization based on a least-squares approach. Under reasonable statistical assumptions, both approaches, the linear and the nonlinear one, can be seen as using the maximum likelihood principle. Application of either method also implies an evaluation of the reviewers. An application example with real conference data shows the power of the statistical methods, compared with the common naive approach of simply taking the average scores.
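To make the contrast with naive averaging concrete, the following is a minimal sketch of a linear calibration. It assumes a purely additive model, score = paper quality + reviewer bias + noise, with the biases constrained to sum to zero. This model, the function name calibrate_additive, and the toy data are illustrative assumptions, not the authors' exact formulation or data.

```python
import numpy as np

def calibrate_additive(reviews, n_papers, n_reviewers):
    """Least-squares fit of score = quality[paper] + bias[reviewer].

    reviews: list of (paper_index, reviewer_index, score) triples.
    Assumed additive model for illustration only.
    """
    n = len(reviews)
    # One row per review, plus one extra row enforcing sum(bias) = 0,
    # which removes the shift ambiguity between quality and bias.
    X = np.zeros((n + 1, n_papers + n_reviewers))
    y = np.zeros(n + 1)
    for k, (p, r, s) in enumerate(reviews):
        X[k, p] = 1.0
        X[k, n_papers + r] = 1.0
        y[k] = s
    X[n, n_papers:] = 1.0
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta[:n_papers], theta[n_papers:]

# Hypothetical noise-free toy data: 3 papers, 3 reviewers, 2 reviews per paper.
reviews = [(0, 0, 4.0), (0, 1, 2.0),
           (1, 1, 4.0), (1, 2, 5.0),
           (2, 2, 4.0), (2, 0, 5.0)]

quality, bias = calibrate_additive(reviews, n_papers=3, n_reviewers=3)

# Naive approach: plain average of the scores each paper received.
naive = np.array([np.mean([s for p, _, s in reviews if p == i])
                  for i in range(3)])

print("calibrated quality:", np.round(quality, 3))
print("reviewer bias:     ", np.round(bias, 3))
print("naive averages:    ", np.round(naive, 3))
```

In this toy data the naive averages cannot separate the second and third papers, while the fitted qualities can, because one reviewer is systematically lenient and another systematically harsh. The estimated biases double as a simple evaluation of the reviewers. The sketch requires that papers and reviewers be linked through shared reviews; otherwise the fit is not uniquely determined.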