A Method for Detecting Interaction between 3D Hands and Unknown Objects in RGB Video

We propose a model that extracts the 3D poses of the hand and the object from each frame of an RGB video using a single feed-forward neural network together with a zero-shot learning classifier, and that recognizes interactions with unknown objects across the entire video through an interactive temporal module. The model is trained end-to-end and requires neither depth images nor annotated coordinates as input, which makes it well suited to real-world applications.
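As a rough illustration of this pipeline, the sketch below shows how such a system might be wired together in PyTorch: a per-frame feed-forward network regresses 3D hand and object keypoints plus an object embedding, a zero-shot classifier matches that embedding against semantic class prototypes (e.g., word vectors), and a temporal module aggregates the per-frame states into an interaction prediction. All module names, dimensions, and architectural choices here are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PerFramePoseNet(nn.Module):
    """Feed-forward network: one RGB frame -> 3D hand joints, 3D object keypoints,
    and an object embedding for zero-shot matching (all shapes are assumptions)."""
    def __init__(self, n_hand_joints=21, n_obj_points=21, feat_dim=256, embed_dim=300):
        super().__init__()
        self.backbone = nn.Sequential(  # stand-in for a real CNN backbone
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )
        self.hand_head = nn.Linear(feat_dim, n_hand_joints * 3)   # 3D hand joints
        self.obj_head = nn.Linear(feat_dim, n_obj_points * 3)     # 3D object keypoints
        self.obj_embed = nn.Linear(feat_dim, embed_dim)           # semantic embedding

    def forward(self, frame):
        f = self.backbone(frame)
        hand = self.hand_head(f).view(-1, 21, 3)
        obj = self.obj_head(f).view(-1, 21, 3)
        return hand, obj, self.obj_embed(f)

def zero_shot_classify(obj_embedding, class_prototypes):
    """Assign an unseen object to the nearest semantic prototype by cosine similarity."""
    sims = F.cosine_similarity(obj_embedding.unsqueeze(1),
                               class_prototypes.unsqueeze(0), dim=-1)  # (B, C)
    return sims.argmax(dim=1)

class InteractionTemporalModule(nn.Module):
    """Aggregates per-frame hand/object states over the video into an interaction class."""
    def __init__(self, in_dim, hidden=128, n_interactions=45):
        super().__init__()
        self.rnn = nn.LSTM(in_dim, hidden, batch_first=True)
        self.cls = nn.Linear(hidden, n_interactions)

    def forward(self, per_frame_states):          # (B, T, in_dim)
        _, (h, _) = self.rnn(per_frame_states)
        return self.cls(h[-1])                    # interaction logits (B, n_interactions)

# Example: run the pipeline on a dummy 8-frame clip; all sizes are placeholders.
video = torch.randn(1, 8, 3, 128, 128)
posenet = PerFramePoseNet()
temporal = InteractionTemporalModule(in_dim=21 * 3 * 2)  # flattened hand + object points
states = []
for t in range(video.shape[1]):
    hand, obj, emb = posenet(video[:, t])
    states.append(torch.cat([hand.flatten(1), obj.flatten(1)], dim=-1))
logits = temporal(torch.stack(states, dim=1))
```

In a hypothetical training setup, the pose heads, the embedding head, and the temporal classifier would be supervised jointly so the whole pipeline remains end-to-end, consistent with the description above.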
