AlphaBlock: Embodied Finetuning for Vision-Language Reasoning in Robot Manipulation
Jianlong Fu | Ruihua Song | Jiange Yang | Limin Wang | Bei Liu | Chuhao Jin | Wenhui Tan
[1] Andy Zeng,et al. TidyBot: Personalized Robot Assistance with Large Language Models , 2023, ArXiv.
[2] M. Pavone,et al. Text2Motion: From Natural Language Instructions to Feasible Plans , 2023, ArXiv.
[3] Mehdi S. M. Sajjadi,et al. PaLM-E: An Embodied Multimodal Language Model , 2023, ICML.
[4] Peter R. Florence,et al. Grounded Decoding: Guiding Text Generation with Grounded Models for Robot Control , 2023, ArXiv.
[5] Naman Goyal,et al. LLaMA: Open and Efficient Foundation Language Models , 2023, ArXiv.
[6] Sjoerd van Steenkiste,et al. Scaling Vision Transformers to 22 Billion Parameters , 2023, ICML.
[7] S. Savarese,et al. BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models , 2023, ICML.
[8] S. Levine,et al. RT-1: Robotics Transformer for Real-World Control at Scale , 2022, Robotics: Science and Systems.
[9] Ledell Yu Wu,et al. EVA: Exploring the Limits of Masked Visual Representation Learning at Scale , 2022, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Peter R. Florence,et al. Interactive Language: Talking to Robots in Real Time , 2022, IEEE Robotics and Automation Letters.
[11] Jessica Borja-Diaz,et al. Grounding Language with Visual Affordances over Unstructured Data , 2022, 2023 IEEE International Conference on Robotics and Automation (ICRA).
[12] Peter R. Florence,et al. Code as Policies: Language Model Programs for Embodied Control , 2022, 2023 IEEE International Conference on Robotics and Automation (ICRA).
[13] D. Fox,et al. Perceiver-Actor: A Multi-Task Transformer for Robotic Manipulation , 2022, CoRL.
[14] Ricardo Garcia Pinel,et al. Instruction-driven history-aware policies for robotic manipulations , 2022, CoRL.
[15] Peter R. Florence,et al. Inner Monologue: Embodied Reasoning through Planning with Language Models , 2022, CoRL.
[16] S. Levine,et al. Multimodal Masked Autoencoders Learn Transferable Representations , 2022, ArXiv.
[17] Xi Victoria Lin,et al. OPT: Open Pre-trained Transformer Language Models , 2022, ArXiv.
[18] Oier Mees,et al. What Matters in Language Conditioned Robotic Imitation Learning Over Unstructured Data , 2022, IEEE Robotics and Automation Letters.
[19] Andrew M. Dai,et al. PaLM: Scaling Language Modeling with Pathways , 2022, J. Mach. Learn. Res.
[20] S. Levine,et al. Do As I Can, Not As I Say: Grounding Language in Robotic Affordances , 2022, CoRL.
[21] Ryan J. Lowe,et al. Training language models to follow instructions with human feedback , 2022, NeurIPS.
[22] Sergey Levine,et al. BC-Z: Zero-Shot Task Generalization with Robotic Imitation Learning , 2022, CoRL.
[23] Dale Schuurmans,et al. Chain of Thought Prompting Elicits Reasoning in Large Language Models , 2022, NeurIPS.
[24] W. Burgard,et al. CALVIN: A Benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks , 2021, IEEE Robotics and Automation Letters.
[25] Lihui Wang,et al. Towards proactive human–robot collaboration: A foreseeable cognitive manufacturing paradigm , 2021, Journal of Manufacturing Systems.
[26] Ilya Sutskever,et al. Learning Transferable Visual Models From Natural Language Supervision , 2021, ICML.
[27] Vikram Srinivasan,et al. Spatial Reasoning from Natural Language Instructions for Robot Manipulation , 2020, 2021 IEEE International Conference on Robotics and Automation (ICRA).
[28] S. Gelly,et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale , 2020, ICLR.
[29] Qiang Ni,et al. Cognitive computing and wireless communications on the edge for healthcare service robots , 2020, Comput. Commun.
[30] Natalia Gimelshein,et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.
[31] Danica Kragic,et al. Trends and challenges in robot manipulation , 2019, Science.
[32] Frank Hutter,et al. Decoupled Weight Decay Regularization , 2017, ICLR.
[33] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[34] Sanem Sariel,et al. Cognition-Enabled Robot Manipulation in Human Environments: Requirements, Recent Work, and Open Problems , 2017, IEEE Robotics & Automation Magazine.
[35] Razvan Pascanu,et al. Overcoming catastrophic forgetting in neural networks , 2016, Proceedings of the National Academy of Sciences.
[36] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[37] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[38] Ashish Kapoor,et al. ChatGPT for Robotics: Design Principles and Model Abilities , 2023, IEEE Access.
[39] Erran L. Li,et al. Self-Play and Self-Describe: Policy Adaptation with Vision-Language Foundation Models , 2022, ArXiv.
[40] P. Abbeel,et al. Instruction-Following Agents with Jointly Pre-Trained Vision-Language Models , 2022, ArXiv.