Catch the Shadow: Person Tracking Under Occlusion with a Single RGB-D Camera

Locomotion in physical space is one of the most natural forms of interaction in applications such as virtual reality systems. Although many algorithms exist for tracking walking people, most of them fail when bodies are occluded in multi-person tracking with a single camera. This paper proposes a method that addresses this challenge by fusing skeletal data with shadow data, both of which are captured by a single RGB-D camera. Skeletal tracking provides the positions of people who are directly visible to the camera, while their shadows are used to track them once they are no longer visible. Our experiments confirm that this method can efficiently handle full occlusions. It is therefore of substantial value for resolving the occlusion problem in multi-person tracking, including with other kinds of cameras.
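The fallback idea described above (use the skeletal position while a person is visible, and the shadow-derived position once they are occluded) can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes that skeleton and shadow detections have already been associated with the same person IDs and projected onto a common floor plane; the names `Estimate` and `fuse_frame` are hypothetical, and how the shadow position is actually extracted is not covered here.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Floor-plane coordinates (x, z); the unit (e.g. metres) is an assumption.
Position = Tuple[float, float]


@dataclass
class Estimate:
    position: Position
    source: str  # "skeleton" or "shadow"


def fuse_frame(
    skeleton: Dict[int, Position],
    shadow: Dict[int, Position],
) -> Dict[int, Estimate]:
    """Per-frame fusion: prefer the skeletal position, else fall back to the shadow.

    Both inputs map a (hypothetical) person ID to a floor-plane position.
    A person missing from `skeleton` is treated as occluded for this frame.
    """
    fused: Dict[int, Estimate] = {}
    for person_id in sorted(set(skeleton) | set(shadow)):
        if person_id in skeleton:
            # Directly visible: trust the skeletal tracker.
            fused[person_id] = Estimate(skeleton[person_id], "skeleton")
        else:
            # Occluded body: use the shadow-derived position instead.
            fused[person_id] = Estimate(shadow[person_id], "shadow")
    return fused


if __name__ == "__main__":
    # Person 1 is visible; person 2 is fully occluded but still casts a shadow.
    frame = fuse_frame({1: (0.5, 2.0)}, {1: (0.6, 2.1), 2: (1.5, 3.0)})
    for pid, est in frame.items():
        print(pid, est.source, est.position)
```

A real system would probably also smooth the track over time when switching between the two sources. The sketch leaves that out because the abstract does not describe it.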
