MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
Anima Anandkumar | Linxi (Jim) Fan | Yuke Zhu | De-An Huang | Ajay Mandlekar | Guanzhi Wang | Yunfan Jiang | Yuncong Yang | Haoyi Zhu | Andrew Tang
[1] Li Fei-Fei,et al. VIMA: General Robot Manipulation with Multimodal Prompts , 2022, ArXiv.
[2] Peter R. Florence,et al. Inner Monologue: Embodied Reasoning through Planning with Language Models , 2022, CoRL.
[3] J. Clune,et al. Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos , 2022, NeurIPS.
[4] Aniruddha Kembhavi,et al. Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks , 2022, ICLR.
[5] S. Gu,et al. Large Language Models are Zero-Shot Reasoners , 2022, NeurIPS.
[6] David J. Fleet,et al. Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding , 2022, NeurIPS.
[7] Pierre-Luc Bacon,et al. The Primacy Bias in Deep Reinforcement Learning , 2022, ICML.
[8] Sergio Gomez Colmenarejo,et al. A Generalist Agent , 2022, Trans. Mach. Learn. Res..
[9] Prafulla Dhariwal,et al. Hierarchical Text-Conditional Image Generation with CLIP Latents , 2022, ArXiv.
[10] Andrew M. Dai,et al. PaLM: Scaling Language Modeling with Pathways , 2022, J. Mach. Learn. Res..
[11] S. Levine,et al. Do As I Can, Not As I Say: Grounding Language in Robotic Affordances , 2022, CoRL.
[12] Adrian S. Wong,et al. Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language , 2022, ICLR.
[13] Vikash Kumar,et al. R3M: A Universal Visual Representation for Robot Manipulation , 2022, CoRL.
[14] Li Fei-Fei,et al. MetaMorph: Learning Universal Controllers with Transformers , 2022, ICLR.
[15] Amy Zhang,et al. Online Decision Transformer , 2022, ICML.
[16] A. Torralba,et al. Pre-Trained Language Models for Interactive Decision-Making , 2022, NeurIPS.
[17] S. Gu,et al. Can Wikipedia Help Offline Reinforcement Learning? , 2022, ArXiv.
[18] Ross B. Girshick,et al. Masked Autoencoders Are Scalable Vision Learners , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Jenia Jitsev,et al. LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs , 2021, ArXiv.
[20] James M. Rehg,et al. Ego4D: Around the World in 3,000 Hours of Egocentric Video , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[21] Alexey Skrynnik,et al. NeurIPS 2021 Competition IGLU: Interactive Grounded Language Understanding in a Collaborative Environment , 2021, ArXiv.
[22] Peng Gao,et al. CLIP-Adapter: Better Vision-Language Models with Feature Adapters , 2021, Int. J. Comput. Vis..
[23] Dmytro Okhonko,et al. VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding , 2021, EMNLP.
[24] Dieter Fox,et al. CLIPort: What and Where Pathways for Robotic Manipulation , 2021, CoRL.
[25] Danijar Hafner. Benchmarking the Spectrum of Agent Capabilities , 2021, ICLR.
[26] S. Savarese,et al. Learning Language-Conditioned Robot Behavior from Offline Data and Crowd-Sourced Annotation , 2021, CoRL.
[27] Michael S. Bernstein,et al. On the Opportunities and Risks of Foundation Models , 2021, ArXiv.
[28] Silvio Savarese,et al. BEHAVIOR: Benchmark for Everyday Household Activities in Virtual, Interactive, and Ecological Environments , 2021, CoRL.
[29] Pieter Abbeel,et al. The MineRL BASALT Competition on Learning from Human Feedback , 2021, ArXiv.
[30] Jeff Clune,et al. Multi-task curriculum learning in a complex, visual, hard-exploration domain: Minecraft , 2021, ArXiv.
[31] Bhargava Urala Kota,et al. DocFormer: End-to-End Transformer for Document Understanding , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[32] Li Fei-Fei,et al. SECANT: Self-Expert Cloning for Zero-Shot Generalization of Visual Policies , 2021, ICML.
[33] Jonathan Tompson,et al. XIRL: Cross-embodiment Inverse Reinforcement Learning , 2021, CoRL.
[34] Sergey Levine,et al. Offline Reinforcement Learning as One Big Sequence Modeling Problem , 2021, NeurIPS.
[35] Pieter Abbeel,et al. Decision Transformer: Reinforcement Learning via Sequence Modeling , 2021, NeurIPS.
[36] Doina Precup,et al. AndroidEnv: A Reinforcement Learning Platform for Android , 2021, ArXiv.
[37] Shih-Fu Chang,et al. VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text , 2021, NeurIPS.
[38] Nan Duan,et al. CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval , 2021, Neurocomputing.
[39] Chelsea Finn,et al. Learning Generalizable Robotic Reward Functions from "In-The-Wild" Human Videos , 2021, Robotics: Science and Systems.
[40] Lambert Schomaker,et al. Self-Imitation Learning by Planning , 2021, 2021 IEEE International Conference on Robotics and Automation (ICRA).
[41] Cheston Tan,et al. A Survey of Embodied AI: From Simulators to Research Tasks , 2021, IEEE Transactions on Emerging Topics in Computational Intelligence.
[42] Ilya Sutskever,et al. Learning Transferable Visual Models From Natural Language Supervision , 2021, ICML.
[43] Diego Perez Liebana,et al. The MineRL 2020 Competition on Sample Efficient Reinforcement Learning using Human Priors , 2021, ArXiv.
[44] Charles Foster,et al. The Pile: An 800GB Dataset of Diverse Text for Language Modeling , 2020, ArXiv.
[45] Cha Zhang,et al. LayoutLMv2: Multi-modal Pre-training for Visually-rich Document Understanding , 2020, ACL.
[46] Felix Hill,et al. Imitating Interactive Intelligence , 2020, ArXiv.
[47] Rasmus Berg Palm,et al. EvoCraft: A New Challenge for Open-Endedness , 2020, EvoApplications.
[48] Lyne P. Tchapmi,et al. iGibson 1.0: A Simulation Environment for Interactive Tasks in Large Realistic Scenes , 2020, 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[49] Natasha Jaques,et al. Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design , 2020, NeurIPS.
[50] Roozbeh Mottaghi,et al. Rearrangement: A Challenge for Embodied AI , 2020, ArXiv.
[51] S. Gelly,et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale , 2020, ICLR.
[52] Yejin Choi,et al. RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models , 2020, FINDINGS.
[53] Jeannette Bohg,et al. Concept2Robot: Learning manipulation concepts from instructions and human demonstrations , 2020, Robotics: Science and Systems.
[54] Andrew Zisserman,et al. Self-Supervised MultiModal Versatile Networks , 2020, NeurIPS.
[55] Alexander Toshev,et al. ObjectNav Revisited: On Evaluation of Embodied Agents Navigating to Objects , 2020, ArXiv.
[56] Edward Grefenstette,et al. The NetHack Learning Environment , 2020, NeurIPS.
[57] Yi Yang,et al. ActBERT: Learning Global-Local Video-Text Representations , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[58] Mark Chen,et al. Language Models are Few-Shot Learners , 2020, NeurIPS.
[59] Sen Wu,et al. Understanding and Improving Information Transfer in Multi-Task Learning , 2020, ICLR.
[60] Justin Fu,et al. D4RL: Datasets for Deep Data-Driven Reinforcement Learning , 2020, ArXiv.
[61] Jingkang Wang,et al. BabyAI++: Towards Grounded-Language Learning beyond Memorization , 2020, ArXiv.
[62] Joel Lehman,et al. Enhanced POET: Open-Ended Reinforcement Learning through Unbounded Invention of Learning Challenges and their Solutions , 2020, ICML.
[63] Rami Ben-Ari,et al. Noise Estimation Using Density Estimation for Self-Supervised Multimodal Learning , 2020, AAAI.
[64] Geoffrey E. Hinton,et al. A Simple Framework for Contrastive Learning of Visual Representations , 2020, ICML.
[65] Noam Shazeer,et al. GLU Variants Improve Transformer , 2020, ArXiv.
[66] S. Levine,et al. Gradient Surgery for Multi-Task Learning , 2020, NeurIPS.
[67] Furu Wei,et al. LayoutLM: Pre-training of Text and Layout for Document Image Understanding , 2019, KDD.
[68] Andrew Zisserman,et al. End-to-End Learning of Visual Representations From Uncurated Instructional Videos , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[69] Jakub W. Pachocki,et al. Dota 2 with Large Scale Deep Reinforcement Learning , 2019, ArXiv.
[70] Pieter Abbeel,et al. AVID: Learning Multi-Stage Tasks via Pixel-Level Translation of Human Videos , 2019, Robotics: Science and Systems.
[71] Luke Zettlemoyer,et al. ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[72] Natalia Gimelshein,et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.
[73] Ross B. Girshick,et al. Momentum Contrast for Unsupervised Visual Representation Learning , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[74] S. Levine,et al. RoboNet: Large-Scale Multi-Robot Learning , 2019, CoRL.
[75] Rémi Louf,et al. HuggingFace's Transformers: State-of-the-art Natural Language Processing , 2019, ArXiv.
[76] Cordelia Schmid,et al. Learning Video Representations using Contrastive Bidirectional Transformer , 2019 .
[77] M. Shoeybi,et al. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism , 2019, ArXiv.
[78] Ruslan Salakhutdinov,et al. MineRL: A Large-Scale Dataset of Minecraft Demonstrations , 2019, IJCAI.
[79] Ivan Laptev,et al. HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[80] Peter Stone,et al. Recent Advances in Imitation Learning from Observation , 2019, IJCAI.
[81] Matthew Henderson,et al. A Repository of Conversational Datasets , 2019, Proceedings of the First Workshop on NLP for Conversational AI.
[82] Jitendra Malik,et al. Habitat: A Platform for Embodied AI Research , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[83] J. Togelius,et al. Obstacle Tower: A Generalization Challenge in Vision, Control, and Planning , 2019, IJCAI.
[84] Kenneth O. Stanley,et al. Go-Explore: a New Approach for Hard-Exploration Problems , 2019, ArXiv.
[85] Sam Devlin,et al. The Multi-Agent Reinforcement Learning in MalmÖ (MARLÖ) Competition , 2019, ArXiv.
[86] Rui Wang,et al. Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions , 2019, ArXiv.
[87] Gunhee Kim,et al. Abstractive Summarization of Reddit Posts with Multi-level Memory Networks , 2018, NAACL.
[88] Thien Huu Nguyen,et al. BabyAI: A Platform to Study the Sample Efficiency of Grounded Language Learning , 2018, ICLR.
[89] Oriol Vinyals,et al. Representation Learning with Contrastive Predictive Coding , 2018, ArXiv.
[90] Jeff Clune,et al. Evolving Multimodal Robot Behavior via Many Stepping Stones with the Combinatorial Multiobjective Evolutionary Algorithm , 2018, Evolutionary Computation.
[91] Sanja Fidler,et al. VirtualHome: Simulating Household Activities Via Programs , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[92] P. Stone,et al. Behavioral Cloning from Observation , 2018, IJCAI.
[93] Yannick Schroecker,et al. Imitating Latent Policies from Observation , 2018, ICML.
[94] Ray Kurzweil,et al. Learning Semantic Textual Similarity from Conversations , 2018, Rep4NLP@ACL.
[95] Ali Farhadi,et al. AI2-THOR: An Interactive 3D Environment for Visual AI , 2017, ArXiv.
[96] Chen Sun,et al. Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification , 2017, ECCV.
[97] Demis Hassabis,et al. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm , 2017, ArXiv.
[98] Hao Wu,et al. Mixed Precision Training , 2017, ICLR.
[99] Benno Stein,et al. TL;DR: Mining Reddit to Learn Automatic Summarization , 2017, NFiS@EMNLP.
[100] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[101] Percy Liang,et al. World of Bits: An Open-Domain Platform for Web-Based Agents , 2017, ICML.
[102] Sepp Hochreiter,et al. GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium , 2017, NIPS.
[103] Susanne Westphal,et al. The “Something Something” Video Database for Learning and Evaluating Visual Common Sense , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[104] Andrew Zisserman,et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[105] Fabio Viola,et al. The Kinetics Human Action Video Dataset , 2017, ArXiv.
[106] Pieter Abbeel,et al. Third-Person Imitation Learning , 2017, ICLR.
[107] Dan Klein,et al. Modular Multitask Reinforcement Learning with Policy Sketches , 2016, ICML.
[108] Apostol Natsev,et al. YouTube-8M: A Large-Scale Video Classification Benchmark , 2016, ArXiv.
[109] Frank Hutter,et al. SGDR: Stochastic Gradient Descent with Warm Restarts , 2016, ICLR.
[110] Kenneth O. Stanley,et al. Open-Ended Evolution: Perspectives from the OEE Workshop in York , 2016, Artificial Life.
[111] Katja Hofmann,et al. The Malmo Platform for Artificial Intelligence Experimentation , 2016, IJCAI.
[112] Wojciech Zaremba,et al. Improved Techniques for Training GANs , 2016, NIPS.
[113] Javier Snaider,et al. Conversational Contextual Cues: The Case of Personalization and History for Response Ranking , 2016, ArXiv.
[114] Sergey Levine,et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation , 2015, ICLR.
[115] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[116] Russell K. Standish,et al. Open-Ended Artificial Evolution , 2002, Int. J. Comput. Intell. Appl..
[117] Max Jaderberg,et al. Open-Ended Learning Leads to Generally Capable Agents , 2021, ArXiv.
[118] David Howard,et al. A Review of Physics Simulators for Robotic Applications , 2021, IEEE Access.
[119] Juan Carlos Niebles,et al. RubiksNet: Learnable 3D-Shift for Efficient Video Action Recognition , 2020, ECCV.
[120] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[121] Ilya Sutskever,et al. Language Models are Unsupervised Multitask Learners , 2019 .
[122] Alec Radford,et al. Improving Language Understanding by Generative Pre-Training , 2018 .
[123] William B. Langdon,et al. Pfeiffer - A Distributed Open-ended Evolutionary System , 2005 .
[124] Sonia Chernova,et al. Recent Advances in Robot Learning from Demonstration , 2020, Annu. Rev. Control. Robotics Auton. Syst..