Process Progress Estimation and Phase Detection

Process modeling and understanding are fundamental to advanced human-computer interfaces and automation systems. Recent research has focused on activity recognition, but little work has addressed detecting process progress from sensor data. We introduce a real-time, sensor-based system for modeling a process, recognizing it, and estimating its completeness. We implemented a multimodal CNN-LSTM structure to extract spatio-temporal features from different sensory data types, and we used a novel deep regression structure to estimate overall completeness. By combining the completeness estimate with a Gaussian mixture model, our system can predict the process phase from the estimated completeness. We also introduce the rectified hyperbolic tangent (rtanh) activation function and a conditional loss to aid training. Using the completeness estimate and performance speed calculations, we also implemented an online estimator of remaining time. We tested the system on data from a medical process (trauma resuscitation) and from sports events (swim competitions). Our system outperformed existing implementations for phase prediction during trauma resuscitation. On both datasets it achieved over 80% process phase detection accuracy, with a completeness estimation error below 9% and a remaining-time estimation error below 18% of the process duration.
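
The abstract names three components without giving their formulas, so the following Python sketch is only an illustration under stated assumptions, not the authors' implementation. It assumes that rtanh is max(0, tanh(x)), that remaining time follows from a constant average speed (total duration ≈ elapsed time / completeness), and that each phase is represented by one 1-D Gaussian component over completeness. The function names and the mixture parameters are hypothetical.

```python
import numpy as np


def rtanh(x):
    """Rectified tanh, ASSUMED here to be max(0, tanh(x)).

    The output lies in [0, 1), which matches a completeness target
    that runs from 0 (not started) towards 1 (finished).
    """
    return np.maximum(0.0, np.tanh(x))


def remaining_time(elapsed, completeness, eps=1e-6):
    """Remaining time under an ASSUMED constant average speed.

    If a fraction c of the process took `elapsed` time units, the total
    is estimated as elapsed / c, so the remainder is elapsed * (1 - c) / c.
    """
    c = float(np.clip(completeness, eps, 1.0))  # avoid division by zero
    return elapsed * (1.0 - c) / c


def phase_from_completeness(c, means, stds, weights):
    """Most likely phase index for completeness c.

    Each phase is modeled by one 1-D Gaussian component (hypothetical
    parameters); the phase with the highest weighted log-density wins.
    """
    means, stds, weights = (np.asarray(a, dtype=float)
                            for a in (means, stds, weights))
    log_post = (np.log(weights) - np.log(stds)
                - 0.5 * ((c - means) / stds) ** 2)
    return int(np.argmax(log_post))


if __name__ == "__main__":
    c_hat = float(rtanh(0.55))  # pretend regression output
    print("completeness:", round(c_hat, 3))
    print("remaining (s):", round(remaining_time(120.0, c_hat), 1))
    print("phase:", phase_from_completeness(
        c_hat, means=[0.1, 0.5, 0.9], stds=[0.1, 0.1, 0.1],
        weights=[1 / 3, 1 / 3, 1 / 3]))
```

In a real system the mixture parameters would be fitted to completeness values labeled with phases, and the speed estimate would likely be smoothed over time rather than taken from a single completeness reading.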
