Precision Telemedicine through Crowdsourced Machine Learning: Testing Variability of Crowd Workers for Video-Based Autism Feature Recognition

Mobilized telemedicine is becoming a key, and even necessary, facet of both precision health and precision medicine. In this study, we evaluate the capability and potential of a crowd of virtual workers, defined as vetted members of popular crowdsourcing platforms, to aid in the task of diagnosing autism. We evaluate workers on the crowdsourced task of providing categorical ordinal behavioral ratings for unstructured public YouTube videos of children with autism and neurotypical controls. To identify patterns that are consistent across independent crowds, we target workers from distinct geographic loci on two crowdsourcing platforms: an international group of workers on Amazon Mechanical Turk (MTurk) (N = 15) and Microworkers from Bangladesh (N = 56), Kenya (N = 23), and the Philippines (N = 25). We feed worker responses as input to a validated diagnostic machine learning classifier trained on clinician-filled electronic health records. We find that, regardless of crowd platform or targeted country, workers vary in the average confidence that the classifier assigns to the correct diagnosis. The best worker responses produce a mean probability of the correct class above 80% and over one standard deviation above 50%, an accuracy and variability on par with experts according to prior studies. There is a weak positive correlation between mean time spent on task and mean performance (r = 0.358, p = 0.005). These results demonstrate that while the crowd can produce accurate diagnoses, there are intrinsic differences in crowd workers' ability to rate behavioral features. We propose a novel strategy for recruiting crowdsourced workers to ensure high-quality diagnostic evaluations of autism, and potentially of many other pediatric behavioral health conditions. Our approach represents a viable step toward crowd-based approaches for more scalable and affordable precision medicine.
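The per-worker summary described above can be sketched as follows. This is a minimal illustration, not the study's code: the data are made up, the names `prob_correct`, `worker_summary`, and `pearson_r` are hypothetical, and reading "over one standard deviation above 50%" as "mean minus one standard deviation exceeds 0.5" is an assumption.

```python
from math import sqrt
from statistics import mean, stdev


def prob_correct(p_autism: float, has_autism: bool) -> float:
    """Probability the classifier assigns to the true class for one video."""
    return p_autism if has_autism else 1.0 - p_autism


def worker_summary(ratings):
    """Summarize one worker.

    ratings: list of (p_autism, has_autism, seconds_on_task), where p_autism
    is the classifier output obtained from that worker's answers (assumed).
    """
    probs = [prob_correct(p, y) for p, y, _ in ratings]
    m, sd = mean(probs), stdev(probs)
    return {
        "mean_prob": m,
        "sd_prob": sd,
        "mean_time": mean(t for _, _, t in ratings),
        # Assumed reading of the abstract's bar for the best workers.
        "meets_bar": m > 0.8 and m - sd > 0.5,
    }


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Made-up example data: three workers, three videos each.
crowd = {
    "worker_a": [(0.9, True, 120.0), (0.2, False, 100.0), (0.8, True, 110.0)],
    "worker_b": [(0.6, True, 60.0), (0.5, False, 50.0), (0.4, True, 70.0)],
    "worker_c": [(0.7, True, 90.0), (0.3, False, 80.0), (0.6, True, 100.0)],
}

summaries = {name: worker_summary(r) for name, r in crowd.items()}
r = pearson_r(
    [s["mean_time"] for s in summaries.values()],
    [s["mean_prob"] for s in summaries.values()],
)
```

With real data, `summaries` would give the per-worker means and spreads, and `r` would correspond to the correlation between mean time on task and mean performance; a significance test for `r` is omitted here.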
