MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
Anima Anandkumar | Linxi (Jim) Fan | Yuke Zhu | De-An Huang | Ajay Mandlekar | Guanzhi Wang | Yunfan Jiang | Yuncong Yang | Haoyi Zhu | Andrew Tang
[1] S. Gu,et al. Large Language Models are Zero-Shot Reasoners , 2022, ArXiv.
[2] Pierre-Luc Bacon,et al. The Primacy Bias in Deep Reinforcement Learning , 2022, ICML.
[3] Andrew M. Dai,et al. PaLM: Scaling Language Modeling with Pathways , 2022, J. Mach. Learn. Res.
[4] S. Levine,et al. Do As I Can, Not As I Say: Grounding Language in Robotic Affordances , 2022, CoRL.
[5] Vikash Kumar,et al. R3M: A Universal Visual Representation for Robot Manipulation , 2022, ArXiv.
[6] Li Fei-Fei,et al. MetaMorph: Learning Universal Controllers with Transformers , 2022, ICLR.
[7] Amy Zhang,et al. Online Decision Transformer , 2022, ICML.
[8] A. Torralba,et al. Pre-Trained Language Models for Interactive Decision-Making , 2022, NeurIPS.
[9] S. Gu,et al. Can Wikipedia Help Offline Reinforcement Learning? , 2022, ArXiv.
[10] Ross B. Girshick,et al. Masked Autoencoders Are Scalable Vision Learners , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[11] James M. Rehg,et al. Ego4D: Around the World in 3,000 Hours of Egocentric Video , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[12] Danijar Hafner. Benchmarking the Spectrum of Agent Capabilities , 2021, ICLR.
[13] Jeff Clune,et al. Evolving Multimodal Robot Behavior via Many Stepping Stones with the Combinatorial Multiobjective Evolutionary Algorithm , 2018, Evolutionary Computation.
[14] Jenia Jitsev,et al. LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs , 2021, ArXiv.
[15] Alexey Skrynnik,et al. NeurIPS 2021 Competition IGLU: Interactive Grounded Language Understanding in a Collaborative Environment , 2021, ArXiv.
[16] Peng Gao,et al. CLIP-Adapter: Better Vision-Language Models with Feature Adapters , 2021, Int. J. Comput. Vis..
[17] Dmytro Okhonko,et al. VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding , 2021, EMNLP.
[18] Dieter Fox,et al. CLIPort: What and Where Pathways for Robotic Manipulation , 2021, CoRL.
[19] S. Savarese,et al. Learning Language-Conditioned Robot Behavior from Offline Data and Crowd-Sourced Annotation , 2021, CoRL.
[20] Michael S. Bernstein,et al. On the Opportunities and Risks of Foundation Models , 2021, ArXiv.
[21] Silvio Savarese,et al. BEHAVIOR: Benchmark for Everyday Household Activities in Virtual, Interactive, and Ecological Environments , 2021, CoRL.
[22] Pieter Abbeel,et al. The MineRL BASALT Competition on Learning from Human Feedback , 2021, ArXiv.
[23] Jeff Clune,et al. Multi-task curriculum learning in a complex, visual, hard-exploration domain: Minecraft , 2021, ArXiv.
[24] Bhargava Urala Kota,et al. DocFormer: End-to-End Transformer for Document Understanding , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[25] Li Fei-Fei,et al. SECANT: Self-Expert Cloning for Zero-Shot Generalization of Visual Policies , 2021, ICML.
[26] Jonathan Tompson,et al. XIRL: Cross-embodiment Inverse Reinforcement Learning , 2021, CoRL.
[27] Sergey Levine,et al. Offline Reinforcement Learning as One Big Sequence Modeling Problem , 2021, NeurIPS.
[28] Pieter Abbeel,et al. Decision Transformer: Reinforcement Learning via Sequence Modeling , 2021, NeurIPS.
[29] Doina Precup,et al. AndroidEnv: A Reinforcement Learning Platform for Android , 2021, ArXiv.
[30] Shih-Fu Chang,et al. VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text , 2021, NeurIPS.
[31] Nan Duan,et al. CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval , 2021, Neurocomputing.
[32] Chelsea Finn,et al. Learning Generalizable Robotic Reward Functions from "In-The-Wild" Human Videos , 2021, Robotics: Science and Systems.
[33] Ilya Sutskever,et al. Learning Transferable Visual Models From Natural Language Supervision , 2021, ICML.
[34] Charles Foster,et al. The Pile: An 800GB Dataset of Diverse Text for Language Modeling , 2020, ArXiv.
[35] Cha Zhang,et al. LayoutLMv2: Multi-modal Pre-training for Visually-rich Document Understanding , 2020, ACL.
[36] Rasmus Berg Palm,et al. EvoCraft: A New Challenge for Open-Endedness , 2020, EvoApplications.
[37] Lyne P. Tchapmi,et al. iGibson 1.0: A Simulation Environment for Interactive Tasks in Large Realistic Scenes , 2020, 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[38] S. Gelly,et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale , 2020, ICLR.
[39] Rami Ben-Ari,et al. Noise Estimation Using Density Estimation for Self-Supervised Multimodal Learning , 2020, AAAI.
[40] Max Jaderberg,et al. Open-Ended Learning Leads to Generally Capable Agents , 2021, ArXiv.
[41] Felix Hill,et al. Imitating Interactive Intelligence , 2020, ArXiv.
[42] Yejin Choi,et al. RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models , 2020, Findings of EMNLP.
[43] Jeannette Bohg,et al. Concept2Robot: Learning manipulation concepts from instructions and human demonstrations , 2020, Robotics: Science and Systems.
[44] Andrew Zisserman,et al. Self-Supervised MultiModal Versatile Networks , 2020, NeurIPS.
[45] Edward Grefenstette,et al. The NetHack Learning Environment , 2020, NeurIPS.
[46] Yi Yang,et al. ActBERT: Learning Global-Local Video-Text Representations , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[47] Mark Chen,et al. Language Models are Few-Shot Learners , 2020, NeurIPS.
[48] Justin Fu,et al. D4RL: Datasets for Deep Data-Driven Reinforcement Learning , 2020, ArXiv.
[49] Jingkang Wang,et al. BabyAI++: Towards Grounded-Language Learning beyond Memorization , 2020, ArXiv.
[50] Joel Lehman,et al. Enhanced POET: Open-Ended Reinforcement Learning through Unbounded Invention of Learning Challenges and their Solutions , 2020, ICML.
[51] Geoffrey E. Hinton,et al. A Simple Framework for Contrastive Learning of Visual Representations , 2020, ICML.
[52] Noam Shazeer,et al. GLU Variants Improve Transformer , 2020, ArXiv.
[53] Furu Wei,et al. LayoutLM: Pre-training of Text and Layout for Document Image Understanding , 2019, KDD.
[54] Andrew Zisserman,et al. End-to-End Learning of Visual Representations From Uncurated Instructional Videos , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[55] Pieter Abbeel,et al. AVID: Learning Multi-Stage Tasks via Pixel-Level Translation of Human Videos , 2019, Robotics: Science and Systems.
[56] Luke Zettlemoyer,et al. ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[57] Ross B. Girshick,et al. Momentum Contrast for Unsupervised Visual Representation Learning , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[58] Juan Carlos Niebles,et al. RubiksNet: Learnable 3D-Shift for Efficient Video Action Recognition , 2020, ECCV.
[59] Jakub W. Pachocki,et al. Dota 2 with Large Scale Deep Reinforcement Learning , 2019, ArXiv.
[60] Natalia Gimelshein,et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.
[61] S. Levine,et al. RoboNet: Large-Scale Multi-Robot Learning , 2019, CoRL.
[62] Rémi Louf,et al. HuggingFace's Transformers: State-of-the-art Natural Language Processing , 2019, ArXiv.
[63] Cordelia Schmid,et al. Learning Video Representations using Contrastive Bidirectional Transformer , 2019 .
[64] M. Shoeybi,et al. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism , 2019, ArXiv.
[65] Ruslan Salakhutdinov,et al. MineRL: A Large-Scale Dataset of Minecraft Demonstrations , 2019, IJCAI.
[66] Ivan Laptev,et al. HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[67] Peter Stone,et al. Recent Advances in Imitation Learning from Observation , 2019, IJCAI.
[68] Katja Hofmann,et al. The MineRL Competition on Sample Efficient Reinforcement Learning using Human Priors , 2019, ArXiv.
[69] Matthew Henderson,et al. A Repository of Conversational Datasets , 2019, Proceedings of the First Workshop on NLP for Conversational AI.
[70] Jitendra Malik,et al. Habitat: A Platform for Embodied AI Research , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[71] Julian Togelius,et al. Obstacle Tower: A Generalization Challenge in Vision, Control, and Planning , 2019, IJCAI.
[72] Kenneth O. Stanley,et al. Go-Explore: a New Approach for Hard-Exploration Problems , 2019, ArXiv.
[73] Rui Wang,et al. Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions , 2019, ArXiv.
[74] Gunhee Kim,et al. Abstractive Summarization of Reddit Posts with Multi-level Memory Networks , 2018, NAACL.
[75] Thien Huu Nguyen,et al. BabyAI: A Platform to Study the Sample Efficiency of Grounded Language Learning , 2018, ICLR.
[76] Yannick Schroecker,et al. Imitating Latent Policies from Observation , 2018, ICML.
[77] Ilya Sutskever,et al. Language Models are Unsupervised Multitask Learners , 2019 .
[78] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[79] Oriol Vinyals,et al. Representation Learning with Contrastive Predictive Coding , 2018, ArXiv.
[80] Satinder Singh,et al. Self-Imitation Learning , 2018, ICML.
[81] Sanja Fidler,et al. VirtualHome: Simulating Household Activities Via Programs , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[82] Peter Stone,et al. Behavioral Cloning from Observation , 2018, IJCAI.
[83] Ray Kurzweil,et al. Learning Semantic Textual Similarity from Conversations , 2018, Rep4NLP@ACL.
[84] Chen Sun,et al. Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification , 2017, ECCV.
[85] Hao Wu,et al. Mixed Precision Training , 2017, ICLR.
[86] Alec Radford,et al. Improving Language Understanding by Generative Pre-Training , 2018 .
[87] Ali Farhadi,et al. AI2-THOR: An Interactive 3D Environment for Visual AI , 2017, ArXiv.
[88] Demis Hassabis,et al. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm , 2017, ArXiv.
[89] Benno Stein,et al. TL;DR: Mining Reddit to Learn Automatic Summarization , 2017, NFiS@EMNLP.
[90] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[91] Percy Liang,et al. World of Bits: An Open-Domain Platform for Web-Based Agents , 2017, ICML.
[92] Sepp Hochreiter,et al. GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium , 2017, NIPS.
[93] Susanne Westphal,et al. The “Something Something” Video Database for Learning and Evaluating Visual Common Sense , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[94] Andrew Zisserman,et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[95] Fabio Viola,et al. The Kinetics Human Action Video Dataset , 2017, ArXiv.
[96] Pieter Abbeel,et al. Third-Person Imitation Learning , 2017, ICLR.
[97] Frank Hutter,et al. SGDR: Stochastic Gradient Descent with Warm Restarts , 2016, ICLR.
[98] Apostol Natsev,et al. YouTube-8M: A Large-Scale Video Classification Benchmark , 2016, ArXiv.
[99] Kenneth O. Stanley,et al. Open-Ended Evolution: Perspectives from the OEE Workshop in York , 2016, Artificial Life.
[100] Katja Hofmann,et al. The Malmo Platform for Artificial Intelligence Experimentation , 2016, IJCAI.
[101] Wojciech Zaremba,et al. Improved Techniques for Training GANs , 2016, NIPS.
[102] Javier Snaider,et al. Conversational Contextual Cues: The Case of Personalization and History for Response Ranking , 2016, ArXiv.
[103] Sergey Levine,et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation , 2015, ICLR.
[104] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[105] William B. Langdon,et al. Pfeiffer - A Distributed Open-ended Evolutionary System , 2005 .
[106] Russell K. Standish,et al. Open-Ended Artificial Evolution , 2002, Int. J. Comput. Intell. Appl..