Incremental Learning for Animal Pose Estimation using RBF k-DPP

Pose estimation is the task of locating the keypoints of an object of interest in an image. Animal pose estimation is more challenging than human pose estimation because of the high inter- and intra-class variability among animals. Existing works solve this problem for a fixed set of predefined animal categories, and models trained on such sets usually do not work well on new animal categories. Retraining the model on new categories makes it overfit and leads to catastrophic forgetting. Thus, in this work, we propose a novel problem, “Incremental Learning for Animal Pose Estimation”. Our method uses an exemplar memory, sampled using Determinantal Point Processes (DPP), to continually adapt to new animal categories without forgetting the old ones. We further propose a new variant of k-DPP that uses an RBF kernel (termed “RBF k-DPP”), which yields a larger performance gain than the traditional k-DPP. Because memory constraints limit the number of exemplars, training on them together with new-class data can lead to class imbalance. We mitigate this by using image warping as an augmentation technique. Warping helps in crafting diverse poses, which reduces overfitting and yields a further improvement in performance. We demonstrate the efficacy of the proposed approach through extensive experiments and ablations, in which we obtain significant improvements over state-of-the-art baseline methods.
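To make the idea of a diversity-driven exemplar memory concrete, the following is a minimal sketch and not the authors' RBF k-DPP sampler. It builds an RBF kernel over per-image feature vectors and greedily picks k items that approximately maximize the determinant of the selected kernel submatrix, a common deterministic stand-in for drawing a sample from a k-DPP. The feature vectors, the bandwidth `gamma`, and the function names are all assumptions introduced for illustration.

```python
import numpy as np


def rbf_kernel(features, gamma=1.0):
    """RBF (Gaussian) similarity matrix: K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq_norms = np.sum(features ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * features @ features.T
    sq_dists = np.maximum(sq_dists, 0.0)  # clip tiny negatives from round-off
    kernel = np.exp(-gamma * sq_dists)
    np.fill_diagonal(kernel, 1.0)  # self-similarity is exactly 1
    return kernel


def greedy_diverse_subset(features, k, gamma=1.0):
    """Greedily choose k indices that approximately maximize det(K_S).

    This is a greedy MAP-style approximation, used here as an assumption in
    place of exact k-DPP sampling. A larger determinant means the chosen
    items are less similar to one another under the RBF kernel.
    """
    kernel = rbf_kernel(features, gamma)
    n = kernel.shape[0]
    if not 0 < k <= n:
        raise ValueError("k must satisfy 0 < k <= number of items")

    selected = []
    for _ in range(k):
        best_idx, best_logdet = -1, -np.inf
        for i in range(n):
            if i in selected:
                continue
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(kernel[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_logdet:
                best_idx, best_logdet = i, logdet
        if best_idx < 0:
            break  # remaining candidates are numerically redundant
        selected.append(best_idx)
    return selected


if __name__ == "__main__":
    # Hypothetical features: 20 images, each described by an 8-dim vector.
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(20, 8))
    print(greedy_diverse_subset(feats, k=5, gamma=0.1))
```

The greedy loop costs one small determinant per candidate per step, which is acceptable for memory-sized candidate pools; an exact k-DPP sampler would instead draw subsets with probability proportional to the same determinant.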
