Paper information - Categorization of computing education resources with utilization of crowdsourcing

The Ensemble Portal harvests resources from multiple heterogeneous federated collections. Managing these continually growing collections requires an automatic mechanism for categorizing records into their corresponding topics. We propose using existing ACM DL metadata to build classifiers for the resources harvested in the Ensemble project. We also report our experience using the Amazon Mechanical Turk platform to build ground-truth training data sets from Ensemble collections.

Edward A. Fox | Lillian N. Cassel | Hao-wei Hsieh | Yinlin Chen | Paul Logasa Bogen
