A smart data annotation tool for multi-sensor activity recognition

Annotating multimodal data sets is often time-consuming and challenging because many approaches require accurate labels. This applies in particular to video recordings, which often have to be labeled with frame-level precision. For that purpose, we created an annotation tool for data sets that combine video and inertial sensor data. In contrast to most existing approaches, we focus on semi-supervised labeling support to infer labels for the whole data set. More precisely, after a small set of instances has been labeled, our system provides labeling recommendations; this in turn makes learning image features more feasible by reducing the time needed to label single frames. We aim to rely on the inertial sensors of our wristband to support the labeling of video recordings. To this end, we apply template matching based on dynamic time warping to identify the time intervals of certain actions. To investigate the feasibility of our approach, we focus on a real-world scenario: we gathered a data set that describes an order picking scenario at a logistics company. In this context, we focus on the picking process, as the selection of the correct items can be prone to errors. Preliminary results show that we are able to identify 69% of the time periods in which grabbing motions occur.
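
To illustrate the idea of template matching with dynamic time warping, the following is a minimal sketch and not the implementation used in the tool. It assumes a single-axis inertial signal given as a list of numbers, a sliding window whose length equals the template length, and a hand-picked distance threshold; the names `dtw_distance` and `match_template` and all parameters are hypothetical.

```python
from math import inf


def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences (absolute-difference cost)."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three allowed predecessor alignments
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def match_template(signal, template, threshold, step=1):
    """Return merged (start, end) sample intervals whose DTW distance to the template is <= threshold."""
    w = len(template)  # assumption: window length equals template length
    hits = []
    for start in range(0, len(signal) - w + 1, step):
        if dtw_distance(signal[start:start + w], template) <= threshold:
            hits.append((start, start + w))
    # merge overlapping or touching windows into contiguous intervals
    merged = []
    for s, e in hits:
        if merged and s <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], e))
        else:
            merged.append((s, e))
    return merged


# Toy example: a synthetic "grab" pulse embedded in an otherwise flat signal.
template = [0, 1, 2, 3, 2, 1, 0]
signal = [0] * 10 + template + [0] * 10
intervals = match_template(signal, template, threshold=0.5)
```

In such a sketch, the returned sample intervals could be mapped to video frames through the shared timestamps and offered to the annotator as labeling recommendations. A real setup would likely use multi-axis data, several templates per action, and a tuned threshold.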
