Customized crowds and active learning to improve classification

Abstract: Traditional classification algorithms can perform poorly when they must target a specific user. User preferences, e.g. in recommendation systems, are a challenge for learning algorithms. In recent years, user interaction through crowdsourcing has also drawn significant interest, yet it remains underused in learning settings. In this work we focus on an active learning strategy that uses crowd-based non-expert information to capture the drift between user preferences in a recommendation system. The proposed method combines two main ideas: applying active strategies to adapt to each user, and using crowdsourcing to avoid excessive user feedback. A similarity technique is put forward to select the crowd most similar to the user, guided by basic user feedback. The proposed active learning framework allows classifications made by non-expert crowds to define the user profile, reducing the labeling effort normally requested of the user. The framework is designed to be generic and applicable to different scenarios, while remaining customizable for each specific user. A case study on humor classification demonstrates experimentally that the approach can improve on baseline active learning results.
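The crowd selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, so this example assumes simple label agreement on the few items the user has rated; the names `agreement`, `most_similar_crowd`, and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Dict

Labels = Dict[str, int]  # item id -> binary label (1 = humorous, 0 = not)


def agreement(user_labels: Labels, crowd_labels: Labels) -> float:
    """Fraction of the user's labeled items on which the crowd agrees.

    Assumed similarity measure; the paper may use a different one.
    """
    shared = [item for item in user_labels if item in crowd_labels]
    if not shared:
        return 0.0
    hits = sum(user_labels[item] == crowd_labels[item] for item in shared)
    return hits / len(shared)


def most_similar_crowd(user_labels: Labels, crowds: Dict[str, Labels]) -> str:
    """Return the name of the crowd whose labels best match the user's."""
    return max(crowds, key=lambda name: agreement(user_labels, crowds[name]))


# Toy data (hypothetical): a small amount of basic user feedback.
user_feedback = {"joke1": 1, "joke2": 0, "joke3": 1}
crowds = {
    "crowd_a": {"joke1": 1, "joke2": 1, "joke3": 0, "joke4": 1},
    "crowd_b": {"joke1": 1, "joke2": 0, "joke3": 1, "joke4": 0},
}

best = most_similar_crowd(user_feedback, crowds)
# The chosen crowd's label stands in for the user on an item the user
# has not rated, sparing the user one labeling request.
proxy_label = crowds[best]["joke4"]
```

In an active learning loop, such a proxy label would be requested only for the examples the learner selects as most informative, so that the user is queried directly as rarely as possible.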
