BNU-LCSAD: a video database for classroom student action recognition

With the development and application of digital cameras, especially in education, a large number of digital video recordings are produced in classrooms. Taking Beijing Normal University (BNU) as an example, 3.4 TB of video is recorded every day in more than 200 classrooms. Such a large volume of data gives computer vision researchers the opportunity to automatically recognize students' classroom actions and even to evaluate the quality of classroom teaching. To focus action recognition on students, we propose the Beijing Normal University Large-scale Classroom Student Action Database version 1.0 (BNU-LCSAD), the first large-scale classroom student action database for student action recognition. It consists of 10 classroom student action classes drawn from digital camera recordings at BNU. We describe the construction and labeling process of this database in detail. In addition, we provide baseline student action recognition results on the new database using the C3D network.
