The Critique of Crowds: Using Collective Criticism to Crowdsource Subjective Preferences

Crowdsourcing encompasses everything from large collaborative projects to microtasks performed in parallel and at scale. However, understanding subjective preferences can still be difficult: most problems do not have validated questionnaires, and pairwise comparisons do not scale, even with access to the crowd. Furthermore, in daily life we are used to expressing opinions as critiques (e.g., it was too cold, too spicy, too big) rather than describing precise preferences or choosing between (perhaps equally bad) discrete options. Unfortunately, such qualitative feedback is difficult to analyze, especially when we want to make quantitative decisions. In this article, we present collective criticism, a crowdsourcing approach in which users provide feedback on microtasks in the form of critiques, such as “it was too easy/too challenging”. This qualitative feedback is then used to perform quantitative analysis of users’ preferences and opinions. Collective criticism has several advantages over other approaches: “too much/too little”-style critiques are easy for users to provide, and they allow us to build predictive models for the optimal parameterization of the variables being critiqued. We present two case studies in which we model (i) aesthetic preferences in neural style transfer and (ii) hedonic experiences in the video game Tetris. These studies demonstrate the flexibility of our approach and show that it produces robust results that are straightforward for experimenters to interpret and in line with users’ stated preferences.
