Robust Active Learning Using Crowdsourced Annotations for Activity Recognition

Recognizing human activities from wearable sensor data is an important problem, particularly for health and eldercare applications. However, collecting sufficient labeled training data is challenging, especially since interpreting IMU traces is difficult for human annotators. Crowdsourcing through services such as Amazon's Mechanical Turk has recently emerged as a promising alternative for annotating such data, with active learning (Cohn, Ghahramani, and Jordan 1996) serving as a natural method for affordably selecting an appropriate subset of instances to label. Unfortunately, most active learning strategies are greedy methods that query the most uncertain sample, which makes them very sensitive to annotation errors, and such errors corrupt a significant fraction of crowdsourced labels. This paper proposes methods for robust active learning under these conditions. Specifically, we make three contributions: 1) we obtain better initial labels by asking labelers to solve a related task; 2) we propose a principled new method for selecting instances that is more robust to annotation noise; 3) we estimate confidence scores for labels acquired from MTurk and ask workers to relabel samples that receive low scores under this metric. The proposed method significantly outperforms existing techniques, both under controlled noise conditions and in real active learning scenarios, and trains classifiers whose accuracy approaches that of classifiers trained on ground-truth data.
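
To make the failure mode concrete, below is a minimal sketch (not the paper's actual algorithm) of greedy uncertainty sampling paired with a crude confidence-based relabeling pass, in the spirit of contribution 3. The synthetic data, the noisy_annotator helper, the logistic-regression classifier, and the 0.1 confidence threshold are all illustrative assumptions, not details taken from the paper.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic two-class data standing in for extracted IMU features.
X = rng.normal(size=(500, 8))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

labeled = list(rng.choice(len(X), size=20, replace=False))
pool = [i for i in range(len(X)) if i not in labeled]

def noisy_annotator(i, flip_prob=0.2):
    """Simulated crowd worker: returns the true label, flipped with probability flip_prob."""
    label = y[i]
    return 1 - label if rng.random() < flip_prob else label

labels = {i: noisy_annotator(i) for i in labeled}

for step in range(30):
    clf = LogisticRegression(max_iter=1000).fit(X[labeled], [labels[i] for i in labeled])

    # Greedy uncertainty sampling: query the pool instance whose predicted
    # probability is closest to 0.5 -- the criterion the abstract notes is
    # fragile under label noise.
    probs = clf.predict_proba(X[pool])[:, 1]
    query = pool[int(np.argmin(np.abs(probs - 0.5)))]
    labels[query] = noisy_annotator(query)
    labeled.append(query)
    pool.remove(query)

    # Crude stand-in for confidence scoring: flag labeled points the current
    # model strongly disagrees with, and query another worker for a relabel.
    p_lab = clf.predict_proba(X[labeled])[:, 1]
    for idx, p in zip(labeled, p_lab):
        conf_in_label = p if labels[idx] == 1 else 1 - p
        if conf_in_label < 0.1:  # arbitrary illustrative threshold
            labels[idx] = noisy_annotator(idx)

print("final accuracy:", clf.score(X, y))

Raising flip_prob in this sketch makes it easy to observe how quickly plain uncertainty sampling degrades under label noise, which is the motivation for the robust selection and relabeling strategies the paper develops.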

[1] Henry A. Kautz et al. Sensor-Based Understanding of Daily Life via Large-Scale Use of Common Sense. AAAI, 2006.

[2] Paul J. M. Havinga et al. Activity Recognition Using Inertial Sensing for Healthcare, Wellbeing and Sports Applications: A Survey. ARCS Workshops, 2010.

[3] Edward Y. Chang et al. Support vector machine active learning for image retrieval. ACM Multimedia, 2001.

[4] Ling Bao et al. Activity Recognition from User-Annotated Acceleration Data. Pervasive, 2004.

[5] John Langford et al. Agnostic active learning. J. Comput. Syst. Sci., 2006.

[6] Daphne Koller et al. Support Vector Machine Active Learning with Applications to Text Classification. J. Mach. Learn. Res., 2000.

[7] Deva Ramanan et al. Efficiently Scaling Up Video Annotation with Crowdsourced Marketplaces. ECCV, 2010.

[8] Martial Hebert et al. Temporal segmentation and activity classification from first-person sensing. IEEE CVPR Workshops, 2009.

[9] James M. Rehg et al. A Scalable Approach to Activity Recognition based on Object Use. IEEE ICCV, 2007.

[10] Cyrus Rashtchian et al. Collecting Image Annotations Using Amazon's Mechanical Turk. Mturk@HLT-NAACL, 2010.

[11] Xi Wang et al. Recognizing household activities from human motion data using active learning and feature selection. 2010.

[12] Jaime G. Carbonell et al. Proactive learning: cost-sensitive active learning with multiple imperfect oracles. CIKM, 2008.

[13] David A. Cohn et al. Active Learning with Statistical Models. NIPS, 1996.

[14] Daphne Koller et al. Active Learning for Parameter Estimation in Bayesian Networks. NIPS, 2000.

[15] Panagiotis G. Ipeirotis et al. Get another label? Improving data quality and data mining using multiple, noisy labelers. KDD, 2008.

[16] Sanjoy Dasgupta et al. Hierarchical sampling for active learning. ICML, 2008.

[17] Prateek Jain et al. Far-sighted active learning on a budget for image and video recognition. IEEE CVPR, 2010.

[18] Michael L. Littman et al. Activity Recognition from Accelerometer Data. AAAI, 2005.