Detecting Promotion Campaigns in Community Question Answering

As Community Question Answering (CQA) has become a popular way to seek and provide information, it has also become a target for spammers who disseminate promotion campaigns. Although a number of quality estimation efforts exist for CQA platforms, most of them focus on identifying and reducing low-quality answers, which are mostly generated by impatient or inexperienced answerers. Many promotion answers, however, appear to provide high-quality information in order to cheat CQA users in future interactions. Most existing quality estimation methods may therefore fail to detect these specially designed answers or question-answer (QA) pairs. In contrast to these works, we focus on the promotion channels that spammers use, which include (shortened) URLs, telephone numbers and social media accounts. Spammers rely on these channels to connect with users and achieve their promotion goals, so the channels are irreplaceable for spamming activities. We propose a propagation algorithm that diffuses promotion intents on an "answerer-channel" bipartite graph to detect possible spamming activities. We also propose a supervised learning framework that identifies whether a QA pair is spam based on the propagated promotion intents. Experimental results on more than 6 million entries from a popular Chinese CQA portal show that our approach outperforms a number of existing quality estimation methods at detecting promotion campaigns on both the answer level and the QA pair level.
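The abstract does not give the update rule of the propagation algorithm, so the following is only a minimal sketch of how promotion intent might be diffused over an "answerer-channel" bipartite graph. Everything in it is an assumption made for illustration: the function name `propagate_promotion_intent`, the mean-based aggregation, the `damping` and `iterations` parameters, and the example identifiers are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def propagate_promotion_intent(edges, seed_scores, damping=0.85, iterations=20):
    """Diffuse promotion intent over an answerer-channel bipartite graph.

    edges: iterable of (answerer, channel) pairs, e.g. an answerer who
        posted a given URL, telephone number or social media account.
    seed_scores: dict mapping some answerers to an initial intent in [0, 1].
    NOTE: the averaging update and the damping factor are assumptions of
    this sketch, not the update rule from the paper.
    """
    channels_of = defaultdict(set)   # answerer -> channels they posted
    answerers_of = defaultdict(set)  # channel -> answerers who posted it
    for answerer, channel in edges:
        channels_of[answerer].add(channel)
        answerers_of[channel].add(answerer)

    answerer_score = {a: seed_scores.get(a, 0.0) for a in channels_of}
    channel_score = {c: 0.0 for c in answerers_of}

    for _ in range(iterations):
        # A channel inherits the mean intent of the answerers who use it.
        for channel, users in answerers_of.items():
            channel_score[channel] = (
                sum(answerer_score[a] for a in users) / len(users)
            )
        # An answerer mixes the seed score with the mean intent of
        # the channels that appear in their answers.
        for answerer, chans in channels_of.items():
            propagated = sum(channel_score[c] for c in chans) / len(chans)
            answerer_score[answerer] = (
                (1 - damping) * seed_scores.get(answerer, 0.0)
                + damping * propagated
            )
    return answerer_score, channel_score


# Hypothetical toy graph: two answerers share one shortened URL.
edges = [
    ("answerer_a", "short.example/x"),
    ("answerer_b", "short.example/x"),
    ("answerer_c", "docs.example.org"),
]
answerers, channels = propagate_promotion_intent(edges, {"answerer_a": 1.0})
```

In this toy graph, `answerer_b` receives a non-zero score only because it shares a channel with the seeded `answerer_a`, while `answerer_c` stays at zero. Scores of this kind could then serve as input features for a supervised classifier over QA pairs, in the spirit of the framework described above.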
