Paper Information - Hand-Raising Gesture Detection in Real Classroom

Hand-Raising Gesture Detection in Real Classroom

This paper proposes a novel method for detecting hand-raising in real classroom environments. Unlike traditional motion detection, hand-raising detection in real classrooms is challenging because of complex scenes, varied gestures, and low resolution. To address these challenges, we first build a large-scale hand-raising data set from thirty primary and middle schools in Shanghai, China. We then propose an improved R-FCN. Specifically, we first design an automatic detection-template algorithm to handle the variety of hand-raising gestures. Second, to better detect small hands, we introduce a feature pyramid that simultaneously captures detailed and highly semantic features. With these two strategies incorporated into a basic R-FCN architecture, our model achieves impressive results in real classroom scenarios. In extensive testing, the accuracy of hand-raising detection reaches 85% on average, which is sufficient for practical application.
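The feature-pyramid idea mentioned in the abstract, combining fine-detail feature maps with coarse, highly semantic ones, can be illustrated with a minimal NumPy sketch. This is not the paper's architecture: the function names (`upsample2x`, `top_down_fuse`, `build_pyramid`), the nearest-neighbour upsampling, the element-wise addition, and the assumption that every level has the same channel count and exactly half the spatial size of the previous one are all hypothetical choices made for illustration.

```python
import numpy as np


def upsample2x(feature_map):
    """Nearest-neighbour upsampling of a (C, H, W) map by a factor of 2."""
    return feature_map.repeat(2, axis=1).repeat(2, axis=2)


def top_down_fuse(fine, coarse):
    """Add an upsampled coarse map to a fine map of twice its spatial size.

    Assumption: both maps already share the same channel count. A real
    network would typically use a learned projection to match channels.
    """
    upsampled = upsample2x(coarse)
    if upsampled.shape != fine.shape:
        raise ValueError(
            f"shape mismatch: fine {fine.shape} vs upsampled {upsampled.shape}"
        )
    return fine + upsampled


def build_pyramid(levels):
    """Fuse feature maps top-down.

    `levels` is a list of (C, H, W) arrays ordered from finest to coarsest.
    Returns the fused maps in the same order, so the finest output carries
    information accumulated from every coarser level.
    """
    fused = [levels[-1]]  # the coarsest level is passed through unchanged
    for fine in reversed(levels[:-1]):
        fused.append(top_down_fuse(fine, fused[-1]))
    return fused[::-1]


# Toy example: three levels filled with ones, shrinking by 2x each time.
levels = [np.ones((4, 8, 8)), np.ones((4, 4, 4)), np.ones((4, 2, 2))]
pyramid = build_pyramid(levels)
print([p.shape for p in pyramid])
```

In this toy setup the finest output keeps its full 8x8 resolution while also summing in contributions from both coarser levels, which is the property that makes such a pyramid attractive for small objects like distant hands.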

Ruimin Shen | Fei Jiang | Jiaojiao Lin
