A Generalist Agent
Sergio Gomez Colmenarejo | Jost Tobias Springenberg | Ashley D. Edwards | Yutian Chen | Oriol Vinyals | Yury Sulsky | R. Hadsell | N. Heess | Gabriel Barth-Maron | N. D. Freitas | Ali Razavi | Jackie Kay | Tom Eccles | Jake Bruce | Alexander Novikov | Konrad Zolna | Emilio Parisotto | S. Reed | Mai Gimenez | Mahyar Bordbar
[1] Jost Tobias Springenberg,et al. How to Spend Your Robot Time: Bridging Kickstarting and Offline Reinforcement Learning for Vision-based Robotic Manipulation , 2022, 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[2] Oriol Vinyals,et al. Flamingo: a Visual Language Model for Few-Shot Learning , 2022, ArXiv.
[3] Andrew M. Dai,et al. PaLM: Scaling Language Modeling with Pathways , 2022, J. Mach. Learn. Res..
[4] S. Levine,et al. Do As I Can, Not As I Say: Grounding Language in Robotic Affordances , 2022, CoRL.
[5] Lisa Anne Hendricks,et al. Training Compute-Optimal Large Language Models , 2022, ArXiv.
[6] Jacob Menick,et al. Teaching language models to support answers with verified quotes , 2022, ArXiv.
[7] A. Gupta,et al. The Unsurprising Effectiveness of Pre-Trained Vision Models for Control , 2022, ICML.
[8] Ryan J. Lowe,et al. Training language models to follow instructions with human feedback , 2022, NeurIPS.
[9] Amy Zhang,et al. Online Decision Transformer , 2022, ICML.
[10] Cherepanov,et al. Competition-level code generation with AlphaCode , 2022, Science.
[11] A. Torralba,et al. Pre-Trained Language Models for Interactive Decision-Making , 2022, NeurIPS.
[12] S. Gu,et al. Can Wikipedia Help Offline Reinforcement Learning? , 2022, ArXiv.
[13] Renelito Delos Santos,et al. LaMDA: Language Models for Dialog Applications , 2022, ArXiv.
[14] P. Abbeel,et al. Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents , 2022, ICML.
[15] Diego de Las Casas,et al. Improving language models by retrieving from trillions of tokens , 2021, ICML.
[16] S. Gu,et al. Generalized Decision Transformer for Offline Hindsight Information Matching , 2021, ICLR.
[17] Alexander M. Rush,et al. Multitask Prompted Training Enables Zero-Shot Task Generalization , 2021, ICLR.
[18] Quoc V. Le,et al. Finetuned Language Models Are Zero-Shot Learners , 2021, ICLR.
[19] Olivier J. Hénaff,et al. Perceiver IO: A General Architecture for Structured Inputs & Outputs , 2021, ICLR.
[20] WebGPT: Browser-assisted question-answering with human feedback , 2021, ArXiv.
[21] Po-Sen Huang,et al. Scaling Language Models: Methods, Analysis & Insights from Training Gopher , 2021, ArXiv.
[22] Po-Sen Huang,et al. Ethical and social risks of harm from Language Models , 2021, ArXiv.
[23] Nando de Freitas,et al. Shaking the foundations: delusions in sequence models for interaction and control , 2021, ArXiv.
[24] Raia Hadsell,et al. Beyond Pick-and-Place: Tackling Robotic Stacking of Diverse Shapes , 2021, CoRL.
[25] David J. Fleet,et al. Pix2seq: A Language Modeling Framework for Object Detection , 2021, ICLR.
[26] Adams Wei Yu,et al. SimVLM: Simple Visual Language Model Pretraining with Weak Supervision , 2021, ICLR.
[27] Michael S. Bernstein,et al. On the Opportunities and Risks of Foundation Models , 2021, ArXiv.
[28] Oriol Vinyals,et al. Highly accurate protein structure prediction with AlphaFold , 2021, Nature.
[29] Wojciech Zaremba,et al. Evaluating Large Language Models Trained on Code , 2021, ArXiv.
[30] Oriol Vinyals,et al. Multimodal Few-Shot Learning with Frozen Language Models , 2021, NeurIPS.
[31] Sergey Levine,et al. Offline Reinforcement Learning as One Big Sequence Modeling Problem , 2021, NeurIPS.
[32] Pieter Abbeel,et al. Decision Transformer: Reinforcement Learning via Sequence Modeling , 2021, NeurIPS.
[33] Ivo Danihelka,et al. Muesli: Combining Improvements in Policy Optimization , 2021, ICML.
[34] Chelsea Finn,et al. Learning Generalizable Robotic Reward Functions from "In-The-Wild" Human Videos , 2021, Robotics: Science and Systems.
[35] Tom Everitt,et al. Alignment of Language Agents , 2021, ArXiv.
[36] Quoc V. Le,et al. Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision , 2021, ICML.
[37] S. Gelly,et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale , 2020, ICLR.
[38] Tim Rocktäschel,et al. My Body is a Cage: the Role of Morphology in Graph-Based Incompatible Control , 2020, ICLR.
[39] Misha Denil,et al. Offline Learning from Demonstrations and Unlabeled Experience , 2020, ArXiv.
[40] Wenlong Huang,et al. One Policy to Control Them All: Shared Modular Policies for Agent-Agnostic Control , 2020, ICML.
[41] Gabriel Synnaeve,et al. Massively Multilingual ASR: 50 Languages, 1 Model, 1 Billion Parameters , 2020, INTERSPEECH.
[42] Nando de Freitas,et al. Critic Regularized Regression , 2020, NeurIPS.
[43] Yuval Tassa,et al. dm_control: Software and Tasks for Continuous Control , 2020, Softw. Impacts.
[44] Mark Chen,et al. Language Models are Few-Shot Learners , 2020, NeurIPS.
[45] Justin Fu,et al. D4RL: Datasets for Deep Data-Driven Reinforcement Learning , 2020, ArXiv.
[46] Ville Hautamäki,et al. Benchmarking End-to-End Behavioural Cloning on Video Games , 2020, 2020 IEEE Conference on Games (CoG).
[47] Alec Radford,et al. Scaling Laws for Neural Language Models , 2020, ArXiv.
[48] J. Schulman,et al. Leveraging Procedural Generation to Benchmark Reinforcement Learning , 2019, ICML.
[49] Demis Hassabis,et al. Mastering Atari, Go, chess and shogi by planning with a learned model , 2019, Nature.
[50] Ross B. Girshick,et al. Momentum Contrast for Unsupervised Visual Representation Learning , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[51] Oleg O. Sushkov,et al. Scaling data-driven robotics with reward sketching and batch reinforcement learning , 2019, Robotics: Science and Systems.
[52] H. Francis Song,et al. V-MPO: On-Policy Maximum a Posteriori Policy Optimization for Discrete and Continuous Control , 2019, ICLR.
[53] Misha Denil,et al. Task-Relevant Adversarial Imitation Learning , 2019, CoRL.
[54] Sergio Gomez Colmenarejo,et al. RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning , 2020 .
[55] S. Levine,et al. Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning , 2019, CoRL.
[56] Stuart Russell. Human Compatible: Artificial Intelligence and the Problem of Control , 2019 .
[57] Lav R. Varshney,et al. CTRL: A Conditional Transformer Language Model for Controllable Generation , 2019, ArXiv.
[58] Ali Farhadi,et al. OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[59] Orhan Firat,et al. Massively Multilingual Neural Machine Translation , 2019, NAACL.
[60] Yiming Yang,et al. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context , 2019, ACL.
[61] Inioluwa Deborah Raji,et al. Model Cards for Model Reporting , 2018, FAT.
[62] Thien Huu Nguyen,et al. BabyAI: A Platform to Study the Sample Efficiency of Grounded Language Learning , 2018, ICLR.
[63] Rémi Munos,et al. Recurrent Experience Replay in Distributed Reinforcement Learning , 2018, ICLR.
[64] Wojciech Czarnecki,et al. Multi-task Deep Reinforcement Learning with PopArt , 2018, AAAI.
[65] Frank Hutter,et al. Decoupled Weight Decay Regularization , 2017, ICLR.
[66] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[67] Tao Chen,et al. Hardware Conditioned Policies for Multi-Robot Transfer Learning , 2018, NeurIPS.
[68] Taku Kudo,et al. SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing , 2018, EMNLP.
[69] Radu Soricut,et al. Conceptual Captions: A Cleaned, Hypernymed, Image Alt-text Dataset For Automatic Image Captioning , 2018, ACL.
[70] Kaiming He,et al. Group Normalization , 2018, ECCV.
[71] Jürgen Schmidhuber,et al. One Big Net For Everything , 2018, ArXiv.
[72] Yuval Tassa,et al. Maximum a Posteriori Policy Optimisation , 2018, ICLR.
[73] Matthew W. Hoffman,et al. Distributed Distributional Deterministic Policy Gradients , 2018, ICLR.
[74] Shane Legg,et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures , 2018, ICML.
[75] Razvan Pascanu,et al. Imagination-Augmented Agents for Deep Reinforcement Learning , 2017, NIPS.
[76] Lukasz Kaiser,et al. One Model To Learn Them All , 2017, ArXiv.
[77] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[78] Sergey Levine,et al. Learning modular neural network policies for multi-task and multi-robot transfer , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[79] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[80] Kevin Gimpel,et al. Gaussian Error Linear Units (GELUs) , 2016 .
[81] John Schulman,et al. Concrete Problems in AI Safety , 2016, ArXiv.
[82] Kilian Q. Weinberger,et al. Deep Networks with Stochastic Depth , 2016, ECCV.
[83] Jian Sun,et al. Identity Mappings in Deep Residual Networks , 2016, ECCV.
[84] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[85] Nando de Freitas,et al. Neural Programmer-Interpreters , 2015, ICLR.
[86] Margaret Mitchell,et al. VQA: Visual Question Answering , 2015, International Journal of Computer Vision.
[87] Xinlei Chen,et al. Microsoft COCO Captions: Data Collection and Evaluation Server , 2015, ArXiv.
[88] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[89] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[90] Nitish Srivastava,et al. Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..
[91] Marc G. Bellemare,et al. The Arcade Learning Environment: An Evaluation Platform for General Agents , 2012, J. Artif. Intell. Res..
[92] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[93] J. Hawkins,et al. On Intelligence , 2004 .
[94] P. Bach-y-Rita,et al. Sensory substitution and the human–machine interface , 2003, Trends in Cognitive Sciences.
[95] N. Whitman. A bitter lesson. , 1999, Academic medicine : journal of the Association of American Medical Colleges.
[96] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[97] V. Mountcastle,et al. An organizing principle for cerebral function : the unit module and the distributed system , 1978 .