Paper information - Addressing Malicious Noise in Clickthrough Data

Clickthrough logs are increasingly used as a source of training data for learning ranking functions. Because position in search results has a large impact on commercial websites, malicious noise is bound to appear in search engine click logs. We present preliminary work on addressing this form of noise, which we term click-spam. We analyze click-spam from a utility standpoint and investigate whether personalizing web search results by partitioning the user population can reduce or eliminate the financial incentives for potential spammers. We formalize click-spam, analyze the incentives for malicious agents, and then investigate the model with some examples.

Filip Radlinski
