Crowdsourcing for search evaluation
[1] Alexandros Ntoulas, et al. Estimating the Quality of Postings in the Real-time Web, 2010.
[2] and software — performance evaluation , .
[3] James Allan, et al. Minimal test collections for retrieval evaluation, 2006, SIGIR.
[4] Bernard J. Jansen, et al. Twitter power: Tweets as electronic word of mouth, 2009.
[5] Cyril Cleverdon, et al. The Cranfield tests on index language devices, 1997.
[6] Mary Beth Rosson, et al. How and why people Twitter: the role that micro-blogging plays in informal communication at work, 2009, GROUP.
[7] Fernando Diaz, et al. Time is of the essence: improving recency ranking using Twitter data, 2010, WWW '10.
[8] Brendan T. O'Connor, et al. Cheap and Fast – But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks, 2008, EMNLP.
[9] Emine Yilmaz, et al. A statistical method for system evaluation using incomplete judgments, 2006, SIGIR.
[10] Balachander Krishnamurthy, et al. A few chirps about twitter, 2008, WOSN '08.
[11] Hosung Park, et al. What is Twitter, a social network or a news media?, 2010, WWW '10.
[12] Timothy W. Finin, et al. Why we twitter: understanding microblogging usage and communities, 2007, WebKDD/SNA-KDD '07.
[13] Yutaka Matsuo, et al. Earthquake shakes Twitter users: real-time event detection by social sensors, 2010, WWW '10.
[14] Mónica Marrero, et al. Crowdsourcing Preference Judgments for Evaluation of Music Similarity Tasks, 2010.
[15] Elizabeth F. Churchill, et al. Logging the Search Self-Efficacy of Amazon Mechanical Turkers, 2010.
[16] Ben Carterette, et al. An Analysis of Assessor Behavior in Crowdsourced Preference Judgments, 2010.
[17] Omar Alonso, et al. Detecting Uninteresting Content in Text Streams, 2010.
[18] Iadh Ounis, et al. Crowdsourcing a News Query Classification Dataset, 2010.
[19] Susan T. Dumais, et al. Characterizing Microblogs with Topic Models, 2010, ICWSM.
[20] Mohammad Soleymani, et al. Crowdsourcing for Affective Annotation of Video: Development of a Viewer-reported Boredom Corpus, 2010.
[21] John Le, et al. Ensuring quality in crowdsourced search relevance evaluation: The effects of training question distribution, 2010.
[22] Emine Yilmaz, et al. A simple and efficient sampling method for estimating AP and NDCG, 2008, SIGIR '08.