Fine Hand Segmentation using Convolutional Neural Networks

We propose a method for extracting highly accurate masks of hands in egocentric views. Our method is based on a novel deep learning architecture: in contrast with current deep learning methods, we do not apply upscaling layers to a low-dimensional representation of the input image. Instead, we extract features with convolutional layers and map them directly to a segmentation mask with a fully connected layer. We show that this approach, when applied in a multi-scale fashion, is both accurate and efficient enough for real-time use. We demonstrate it on a new dataset of images captured in a variety of environments, from outdoor scenes to offices.
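The core idea, convolutional features mapped directly to a mask by a fully connected layer with no upscaling stage, can be illustrated with a minimal NumPy sketch. This is not the authors' network: the layer sizes, the random untrained weights, the sigmoid output, and the names `conv2d_valid` and `segment` are all assumptions made for illustration, and the multi-scale scheme is not shown.

```python
import numpy as np


def conv2d_valid(image, kernels):
    """Valid cross-correlation of an (H, W) image with (K, kh, kw) kernels, then ReLU."""
    k, kh, kw = kernels.shape
    h, w = image.shape
    out = np.empty((k, h - kh + 1, w - kw + 1))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            patch = image[i:i + kh, j:j + kw]
            # Contract each kernel with the patch, giving one response per kernel.
            out[:, i, j] = np.tensordot(kernels, patch, axes=([1, 2], [0, 1]))
    return np.maximum(out, 0.0)


def segment(image, kernels, fc_weights, fc_bias, mask_shape):
    """Map convolutional features straight to a mask with one fully connected layer."""
    features = conv2d_valid(image, kernels).ravel()
    logits = fc_weights @ features + fc_bias
    probs = 1.0 / (1.0 + np.exp(-logits))  # per-pixel hand probability (assumed sigmoid)
    return probs.reshape(mask_shape)


# Hypothetical sizes and random weights, chosen only to make the sketch runnable.
rng = np.random.default_rng(0)
image = rng.random((16, 16))
kernels = rng.normal(size=(4, 3, 3))
mask_shape = (8, 8)
n_features = 4 * 14 * 14  # 4 feature maps of size 14 x 14 after a 3 x 3 valid convolution
fc_weights = rng.normal(scale=0.01, size=(mask_shape[0] * mask_shape[1], n_features))
fc_bias = np.zeros(mask_shape[0] * mask_shape[1])

mask = segment(image, kernels, fc_weights, fc_bias, mask_shape)
print(mask.shape)
```

Because the fully connected layer produces one output per mask pixel, the mask resolution is fixed by the size of that layer rather than recovered by upscaling a coarse feature map.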
