AMASS: Archive of Motion Capture As Surface Shapes

Large datasets are the cornerstone of recent advances in computer vision using deep learning. In contrast, existing human motion capture (mocap) datasets are small and their motions limited, hampering progress on learning models of human motion. While many different datasets are available, each uses a different parameterization of the body, making it difficult to integrate them into a single meta-dataset. To address this, we introduce AMASS, a large and varied database of human motion that unifies 15 different optical marker-based mocap datasets by representing them within a common framework and parameterization. We achieve this using a new method, MoSh++, that converts mocap data into realistic 3D human meshes represented by a rigged body model. Here we use SMPL [Loper et al., 2015], which is widely used and provides a standard skeletal representation as well as a fully rigged surface mesh. The method works for arbitrary marker sets, while recovering soft-tissue dynamics and realistic hand motion. We evaluate MoSh++ and tune its hyperparameters using a new dataset of 4D body scans that are jointly recorded with marker-based mocap. The consistent representation of AMASS makes it readily useful for animation, visualization, and generating training data for deep learning. Our dataset is significantly richer than previous human motion collections: it contains more than 40 hours of motion data spanning over 300 subjects and more than 11,000 motions, and it will be made publicly available to the research community.
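
To illustrate why a common parameterization makes pooling straightforward, the following is a minimal sketch, not the actual AMASS file format or API. The `MotionSequence` record, its field names, the `summarize` helper, and the toy dataset names and vector sizes are all hypothetical assumptions chosen for illustration. Once every recording is expressed as per-frame pose vectors plus per-subject shape coefficients, collection-level statistics such as hours, subjects, and motions can be computed uniformly across source datasets.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MotionSequence:
    """One mocap recording in a shared body parameterization (hypothetical layout)."""
    source_dataset: str        # name of the original mocap dataset
    subject_id: str            # subject identifier within that dataset
    frame_rate_hz: float       # capture rate in frames per second
    poses: List[List[float]]   # one pose-parameter vector per frame
    shape: List[float]         # per-subject body-shape coefficients

    @property
    def duration_s(self) -> float:
        """Length of the recording in seconds."""
        return len(self.poses) / self.frame_rate_hz


def summarize(sequences: List[MotionSequence]) -> Dict[str, float]:
    """Pool sequences from any number of source datasets into one summary."""
    # Subject ids are only unique within a dataset, so key on (dataset, id).
    subjects = {(s.source_dataset, s.subject_id) for s in sequences}
    total_hours = sum(s.duration_s for s in sequences) / 3600.0
    return {
        "motions": len(sequences),
        "subjects": len(subjects),
        "hours": total_hours,
    }


# Toy data: 3-dimensional poses and 2 shape coefficients are placeholders,
# not the dimensionality of any real body model.
walk = MotionSequence("dataset_a", "s01", 120.0, [[0.0] * 3] * 240, [0.0] * 2)
jump = MotionSequence("dataset_b", "s01", 60.0, [[0.0] * 3] * 180, [0.0] * 2)
run = MotionSequence("dataset_a", "s01", 120.0, [[0.0] * 3] * 360, [0.0] * 2)

stats = summarize([walk, jump, run])
print(stats)
```

Because the two source datasets record at different frame rates, the sketch stores the rate per sequence and derives duration from it rather than assuming a global rate.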

[1]  Cordelia Schmid,et al.  Learning from Synthetic Humans , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Michael J. Black,et al.  Dyna: a model of dynamic human shape in motion , 2015, ACM Trans. Graph..

[3]  T P Andriacchi,et al.  Studies of human locomotion: past, present and future. , 2000, Journal of biomechanics.

[4]  A. Cappozzo,et al.  Human movement analysis using stereophotogrammetry. Part 3. Soft tissue artifact assessment and compensation. , 2005, Gait & posture.

[5]  Kenrick Kin,et al.  Online optical marker-based hand tracking with deep labels , 2018, ACM Trans. Graph..

[6]  Michael J. Black,et al.  HumanEva: Synchronized Video and Motion Capture Dataset and Baseline Algorithm for Evaluation of Articulated Human Motion , 2010, International Journal of Computer Vision.

[7]  Nikolaus F. Troje,et al.  Auto-labelling of Markers in Optical Motion Capture by Permutation Learning , 2019, CGI.

[8]  Michael J. Black,et al.  MoSh: motion and shape capture from sparse markers , 2014, ACM Trans. Graph..

[9]  Dimitrios Tzionas,et al.  Expressive Body Capture: 3D Hands, Face, and Body From a Single Image , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Michael J. Black,et al.  Dynamic FAUST: Registering Human Bodies in Motion , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Jonathan Maycock,et al.  Fully automatic optical motion tracking using an inverse kinematics approach , 2015, 2015 IEEE-RAS 15th International Conference on Humanoid Robots (Humanoids).

[12]  D K Smith,et al.  Numerical Optimization , 2001, J. Oper. Res. Soc..

[13]  Zoran Popovic,et al.  The space of human body shapes: reconstruction and parameterization from range scans , 2003, ACM Trans. Graph..

[14]  Michael J. Black,et al.  SMPL: A Skinned Multi-Person Linear Model , 2015, ACM Trans. Graph..

[15]  Sebastian Thrun,et al.  SCAPE: shape completion and animation of people , 2005, SIGGRAPH '05.

[16]  Andrew W. Fitzgibbon,et al.  Online generative model personalization for hand tracking , 2017, ACM Trans. Graph..

[17]  Ludovic Hoyet,et al.  Sleight of hand: perception of finger motion from reduced marker sets , 2012, I3D '12.

[18]  Andrew W. Fitzgibbon,et al.  Efficient and precise interactive hand tracking through joint, continuous optimization of pose and correspondences , 2016, ACM Trans. Graph..

[19]  George E Gorton,et al.  Assessment of the kinematic variability among 12 motion analysis laboratories. , 2009, Gait & posture.

[20]  Aaron Hertzmann,et al.  Learning a Correlated Model of Identity and Pose-dependent Body Shape Variation for Real-time Synthesis , 2006, Eurographics/ACM SIGGRAPH Symposium on Computer Animation.

[21]  Taku Komura,et al.  A Deep Learning Framework for Character Motion Synthesis and Editing , 2016, ACM Trans. Graph..

[22]  Gerhard W. Dueck,et al.  Threshold accepting: a general purpose optimization algorithm appearing superior to simulated annealing , 1990 .

[23]  G. Forney Jr.,et al.  Viterbi Algorithm , 1973, Encyclopedia of Machine Learning.

[24]  Tamim Asfour,et al.  The KIT whole-body human motion database , 2015, 2015 International Conference on Advanced Robotics (ICAR).

[25]  Jessica K. Hodgins,et al.  Guide to the Carnegie Mellon University Multimodal Activity (CMU-MMAC) Database , 2008 .

[26]  Tido Röder,et al.  Documentation Mocap Database HDM05 , 2007 .

[27]  Michael J. Black,et al.  Coregistration: Simultaneous Alignment and Modeling of Articulated 3D Shape , 2012, ECCV.

[28]  A. Cappozzo,et al.  Human movement analysis using stereophotogrammetry. Part 1: theoretical background. , 2005, Gait & posture.

[29]  Charles Malleson,et al.  Total Capture: 3D Human Pose Estimation Fusing Video and Inertial Sensors , 2017, BMVC.

[30]  Hans-Peter Seidel,et al.  Efficient and Robust Annotation of Motion Capture Data , 2009 .

[31]  Jonathan Maycock,et al.  Reduced marker layouts for optical motion capture of hands , 2015, MIG.

[32]  Christoph von Laßberg,et al.  Neuromuscular onset succession of high level gymnasts during dynamic leg acceleration phases on high bar. , 2013, Journal of electromyography and kinesiology : official journal of the International Society of Electrophysiological Kinesiology.

[33]  P R Cavanagh,et al.  Three-dimensional kinematics of the human knee during walking. , 1992, Journal of biomechanics.

[34]  Martín Abadi,et al.  TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.

[35]  N. Troje Decomposing biological motion: a framework for analysis and synthesis of human gait patterns. , 2002, Journal of vision.

[37]  Jonas Beskow,et al.  Robust online motion capture labeling of finger markers , 2016, MIG.

[38]  Michael J. Black,et al.  Pose-conditioned joint angle limits for 3D human pose reconstruction , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[39]  Daniel Holden,et al.  Robust solving of optical motion capture data by denoising , 2018, ACM Trans. Graph..

[40]  Michael J. Black,et al.  Learning a model of facial shape and expression from 4D scans , 2017, ACM Trans. Graph..