PACE: Prediction-based Annotation for Crowded Environments

We present PACE, a tool that eases the annotation of crowded environments typical of visual surveillance datasets. The tool is developed using HTML5 and JavaScript and has two back-ends. A PHP-based back-end implements persistence using a relational database and manages dynamic page creation and the authentication procedure. A Python-based REST server implements all the computer vision facilities that assist annotators. The tool allows collaborative annotation of person identity, group membership, location, gaze, and occluded parts. PACE supports multiple cameras and, if calibration is provided, the geometry is used to improve the computer-vision-based assistance. We detail the whole interface, including an administrative view that eases the setup of the system.
