暂无分享,去创建一个
We release two artificial datasets, Simulated Flying Shapes and Simulated Planar Manipulator that allow to test the learning ability of video processing systems. In particular, the dataset is meant as a tool which allows to easily assess the sanity of deep neural network models that aim to encode, reconstruct or predict video frame sequences. The datasets each consist of 90000 videos. The Simulated Flying Shapes dataset comprises scenes showing two objects of equal shape (rectangle, triangle and circle) and size in which one object approaches its counterpart. The Simulated Planar Manipulator shows a 3-DOF planar manipulator that executes a pick-and-place task in which it has to place a size-varying circle on a squared platform. Different from other widely used datasets such as moving MNIST [1], [2], the two presented datasets involve goal-oriented tasks (e.g. the manipulator grasping an object and placing it on a platform), rather than showing random movements. This makes our datasets more suitable for testing prediction capabilities and the learning of sophisticated motions by a machine learning model. This technical document aims at providing an introduction into the usage of both datasets.
[1] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[2] Nitish Srivastava,et al. Unsupervised Learning of Video Representations using LSTMs , 2015, ICML.
[3] Eren Erdal Aksoy,et al. Deep Episodic Memory: Encoding, Recalling, and Predicting Episodic Experiences for Robot Action Execution , 2018, IEEE Robotics and Automation Letters.