Dynamic Graph CNN for Event-Camera Based Gesture Recognition

An event camera is a bio-inspired sensor that captures motion as an asynchronous stream of events, where an event is triggered whenever a pixel's brightness changes. In spatio-temporal space, these events form an event cloud whose 3D geometry captures the dynamic scene. To analyze event clouds, previous works usually convert event streams into frame-based images, which does not fully exploit the 3D geometry of the spatio-temporal event space. In this work, we propose to recognize gestures directly from spatio-temporal 3D event clouds using Dynamic Graph CNN (DGCNN), a network that takes 3D points as input and has been successfully applied to 3D object recognition. We adapt DGCNN to action recognition by learning 3D geometric features in the spatio-temporal space of the event data. We achieve state-of-the-art accuracy of 98.56% on the IBM DVS128 Gesture dataset and 95.94% on the DHP19 dataset.
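To illustrate the idea of treating an event stream as a spatio-temporal point cloud, the sketch below converts DVS-style events into a fixed-size set of normalized (x, y, t) points, the input form a point-cloud network such as DGCNN expects. This is a minimal illustration under assumed conventions (an (N, 4) event array of x, y, timestamp, polarity; random subsampling; per-axis normalization), not the authors' actual preprocessing code.

```python
import numpy as np

def events_to_cloud(events, n_points=1024, seed=0):
    """Turn an event stream into a fixed-size spatio-temporal point cloud.

    `events`: (N, 4) array of (x, y, t, polarity) rows, as a DVS128-style
    sensor might produce (hypothetical layout, not the paper's code).
    Returns an (n_points, 3) array of (x, y, t) points scaled to [0, 1].
    """
    xyt = events[:, :3].astype(np.float64)
    # Normalize each axis independently so pixel coordinates and
    # microsecond timestamps share a comparable scale.
    mins = xyt.min(axis=0)
    spans = np.maximum(xyt.max(axis=0) - mins, 1e-9)
    xyt = (xyt - mins) / spans
    # Subsample (or resample with replacement if too few events) to a
    # fixed cardinality, as point-cloud networks take fixed-size inputs.
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(xyt), size=n_points, replace=len(xyt) < n_points)
    return xyt[idx]
```

The resulting cloud can be fed to any point-cloud classifier; polarity, dropped here for simplicity, could instead be kept as a per-point feature channel.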