Mass Collaboration Systems on the World-Wide Web

Mass collaboration systems enlist a multitude of humans to help solve a wide variety of problems. Over the past decade, numerous such systems have appeared on the World-Wide Web. Prime examples include Wikipedia, Linux, Yahoo! Answers, and Amazon’s Mechanical Turk, and much effort is being directed at developing many more. As is typical for an emerging area, this effort has appeared under many names, including peer production, user-powered systems, user-generated content, collaborative systems, community systems, social systems, social search, social media, collective intelligence, wikinomics, crowd wisdom, smart mobs, crowd-sourcing, and human computation. The topic has been discussed extensively in books, the popular press, and academia (e.g., [31, 32, 25, 2, 36, 17, 3, 7]). But this body of work has mostly considered efforts in the physical world (e.g., [31, 32, 25]). Some works do consider mass collaboration systems on the Web, but only certain system types (e.g., [34, 30]) or challenges (e.g., how to evaluate users [14]). This survey attempts to provide a global picture of mass collaboration systems on the Web. We define and classify such systems, then describe a broad sample of them. The sample ranges from relatively simple, well-established systems, such as those for reviewing books, to complex emerging systems that build structured knowledge bases, to systems that “piggyback” on other popular systems. We then discuss fundamental challenges such as how to recruit and evaluate users and how to merge their contributions. Finally, we discuss future directions. Given the space limitation, we do not attempt to be exhaustive. Rather, we sketch only the most important aspects of the global picture, using real-world examples. The goal is to further our collective understanding, both conceptual and practical, of this important emerging topic.

[1] Oren Etzioni, et al. Mangrove: Enticing Ordinary People onto the Semantic Web via Instant Gratification, 2003, SEMWEB.

[2] Ariel Fuxman, et al. Using the wisdom of the crowds for keyword generation, 2008, WWW.

[3] Mark A. Musen, et al. Collecting Community-Based Mappings in an Ontology Repository, 2008, SEMWEB.

[4] Laura A. Dabbish, et al. Labeling images with a computer game, 2004, AAAI Spring Symposium: Knowledge Collection from Volunteer Contributors.

[5] AnHai Doan, et al. Matching Schemas in Online Communities: A Web 2.0 Approach, 2008, 2008 IEEE 24th International Conference on Data Engineering.

[6] Erhard Rahm, et al. A survey of approaches to automatic schema matching, 2001, The VLDB Journal.

[7] Gerhard Weikum, et al. The YAGO-NAGA approach to knowledge discovery, 2009, SGMD.

[8] Matthew Richardson, et al. Building large knowledge bases by mass collaboration, 2003, K-CAP '03.

[9] Harith Alani, et al. The CKC Challenge: Exploring Tools for Collaborative Knowledge Construction, 2008, IEEE Intelligent Systems.

[10] John Riedl, et al. Item-based collaborative filtering recommendation algorithms, 2001, WWW '01.

[11] Jennifer Golbeck, et al. Computing and Applying Trust in Web-based Social Networks, 2005.

[12] Zachary G. Ives, et al. ORCHESTRA: Rapid, Collaborative Sharing of Dynamic Data, 2005, CIDR.

[13] Lada A. Adamic, et al. Knowledge sharing and Yahoo Answers: everyone knows something, 2008, WWW.

[14] Matthew Richardson, et al. Mining knowledge-sharing sites for viral marketing, 2002, KDD.

[15] Manuel Blum, et al. reCAPTCHA: Human-Based Character Recognition via Web Security Measures, 2008, Science.

[16] Jeffrey F. Naughton, et al. Efficiently incorporating user feedback into information extraction and integration programs, 2009, SIGMOD Conference.

[17] Rada Mihalcea, et al. Building sense tagged corpora with volunteer contributions over the Web, 2003, RANLP.

[18] Georgia Koutrika, et al. CourseRank: A Closed-Community Social System through the Magnifying Glass, 2009, ICWSM.

[19] Xiaojin Zhu, et al. Building Community Wikipedias: A Machine-Human Partnership Approach, 2008, 2008 IEEE 24th International Conference on Data Engineering.

[20] Amit Marathe, et al. Mass Collaboration: A Case Study, 2004, IDEAS.

[21] Michael Olson. The amateur search, 2008, SGMD.

[22] R. Stoecker. Smart Mobs, 2004.

[23] David G. Stork, et al. Using Open Data Collection for Intelligent Software, 2000, Computer.

[24] Oren Etzioni, et al. Adaptive Web sites, 2000, CACM.

[25] Laura A. Dabbish, et al. Designing games with a purpose, 2008, CACM.

[26] James Fogarty, et al. Intelligence in Wikipedia, 2008, AAAI.