LegoTron: An Environment for Interactive Structural Understanding

Visual reasoning about geometric structures with detailed spatial relationships is a fundamental component of human intelligence. As children, we learn to reason about this structure not only from observation, but also by interacting with the world around us – by taking things apart and putting them back together again. We introduce a new learning environment designed to explore the interplay between interactive reasoning, scene understanding, and construction by mining a previously untapped, high-quality data source: fan-made Lego creations that have been uploaded to the internet. To make use of this data, we have built LegoTron, a fully interactive 3D environment that allows a learning agent to assemble, disassemble, and manipulate these models. Our goal is to provide an interactive playground in which agents can explore and manipulate complex scenes and recover their underlying structure.
