Recognition Of Atypical Behavior In Autism Diagnosis From Video Using Pose Estimation Over Time

Autism spectrum disorder (ASD), like many other developmental or behavioral conditions, is difficult to diagnose precisely. The difficulty increases when the subjects are young children, because ASD symptoms overlap heavily with the typical behaviors of young children. It is therefore important to develop reliable methods that help distinguish ASD from the typical behaviors of children. In this paper, we implemented a computer-vision-based automatic ASD prediction approach to detect autistic characteristics in a video dataset recorded from a mix of children with and without ASD. Our target dataset contains 555 videos, from which 8349 episodes (each approximately 10 seconds long) are derived. Each episode is labeled by medical experts as atypical or typical behavior. We first estimate the children's pose in each video frame by re-training a state-of-the-art human pose estimator on our manually annotated children's pose dataset. Particle filter interpolation is then applied to the output of the pose estimator to predict the locations of missing body keypoints. For each episode, we calculate the children's motion pattern, defined as the trajectory of their keypoints over time, by temporally encoding the estimated pose maps. Finally, a binary classification network is trained on the pose motion representations to discriminate between typical and atypical behaviors. We achieved a classification accuracy of 72.4% (precision = 0.72 and recall = 0.92) on our test dataset.
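To make the temporal-encoding step concrete, the sketch below aggregates the per-frame positions of a single keypoint into a two-channel map, where one channel emphasizes early frames and the other emphasizes late frames. It is only an illustration under stated assumptions: the function name `temporal_encode`, the coarse integer grid, the linear two-channel weighting, and the choice to skip missing keypoints are all hypothetical and are not taken from the paper, which works on estimated pose maps and fills missing keypoints by particle filter interpolation.

```python
from typing import List, Optional, Tuple

Point = Tuple[int, int]  # (row, col) of one keypoint on a coarse grid (assumed)


def temporal_encode(
    track: List[Optional[Point]], height: int, width: int
) -> List[List[List[float]]]:
    """Aggregate one keypoint's trajectory into a 2-channel map.

    Channel 0 receives weight 1 - t/(n-1) and channel 1 receives t/(n-1)
    for frame t of n, so the pair of channels records both where the
    keypoint was and roughly when. This weighting is an illustrative
    assumption, not the authors' exact encoding.
    """
    n = len(track)
    maps = [[[0.0] * width for _ in range(height)] for _ in range(2)]
    for t, point in enumerate(track):
        if point is None:
            # Missing detection: skipped here for simplicity.
            continue
        late = t / (n - 1) if n > 1 else 0.0
        row, col = point
        if 0 <= row < height and 0 <= col < width:
            maps[0][row][col] += 1.0 - late
            maps[1][row][col] += late
    return maps


# Toy episode: five frames, the third detection is missing.
track = [(0, 0), (0, 1), None, (1, 1), (2, 2)]
encoded = temporal_encode(track, height=3, width=3)
```

In this toy episode the starting cell appears only in the early channel and the final cell only in the late channel, so a downstream classifier could in principle read the direction of motion from a single fixed-size input.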
