Identifying Spammers to Boost Crowdsourced Classification

This work addresses the problem of adversarial attacks in unsupervised ensemble, or crowdsourced, classification tasks. Under certain conditions, it is shown both analytically and through numerical tests that spammers cause the most damage to classification performance. To curb their effect, a novel spectral algorithm for spammer detection is developed that utilizes second-order statistics of annotators. Preliminary results on synthetic and real data showcase the potential of this approach.
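
To give some intuition for how second-order statistics of annotators can expose spammers, the following is a minimal sketch in a simplified setting. It is not the algorithm developed in this work. It assumes binary labels in {-1, +1}, balanced classes, annotators that err independently given the true label, and a "one-coin" model in which annotator i is correct with a fixed probability p_i, so that a spammer has p_i = 0.5. Under these assumptions the off-diagonal entries of the annotator covariance matrix are approximately (2p_i - 1)(2p_j - 1), so the leading eigenvector has near-zero entries for spammers. All function names, the synthetic data, and the relative threshold of 0.2 are hypothetical choices made for illustration.

```python
import random


def simulate_labels(accuracies, n_items, seed=0):
    """Draw +/-1 labels from a one-coin model (assumed for illustration):
    annotator i reports the true label with probability accuracies[i]."""
    rng = random.Random(seed)
    truth = [rng.choice((-1, 1)) for _ in range(n_items)]
    labels = [[y if rng.random() < p else -y for y in truth]
              for p in accuracies]
    return labels, truth


def offdiag_covariance(labels):
    """Sample covariance between annotators, with the diagonal set to zero."""
    m, n = len(labels), len(labels[0])
    means = [sum(row) / n for row in labels]
    cov = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            c = sum((a - means[i]) * (b - means[j])
                    for a, b in zip(labels[i], labels[j])) / (n - 1)
            cov[i][j] = cov[j][i] = c
    return cov


def leading_eigenvector(matrix, n_iter=200):
    """Power iteration: unit-norm eigenvector of the dominant eigenvalue."""
    m = len(matrix)
    vec = [1.0] * m
    for _ in range(n_iter):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(m)) for i in range(m)]
        norm = sum(x * x for x in nxt) ** 0.5
        if norm == 0.0:  # degenerate case: all annotators look uncorrelated
            break
        vec = [x / norm for x in nxt]
    return vec


def flag_spammers(labels, rel_threshold=0.2):
    """Flag annotators whose eigenvector entry is small relative to the
    largest entry. The threshold is an arbitrary illustrative choice."""
    vec = leading_eigenvector(offdiag_covariance(labels))
    top = max(abs(x) for x in vec)
    return [abs(x) < rel_threshold * top for x in vec]


if __name__ == "__main__":
    # Seven informative annotators followed by three spammers (p = 0.5).
    accuracies = [0.9, 0.85, 0.8, 0.8, 0.75, 0.7, 0.7, 0.5, 0.5, 0.5]
    labels, _ = simulate_labels(accuracies, n_items=4000)
    print(flag_spammers(labels))
```

With a few thousand items, the spammers' eigenvector entries in this toy setting should be far smaller than those of the informative annotators, so a coarse relative threshold is enough to separate them. With few items, unbalanced classes, or correlated annotators, the separation is less clean and the threshold would need more care.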