Vision-Language Models as Success Detectors
Yuqing Du | Ksenia Konyushkova | Misha Denil | A. Raju | Jessica Landon | Felix Hill | Nando de Freitas | Serkan Cabi
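The title describes using vision-language models (VLMs) to judge whether an agent has achieved its goal. As a hedged illustration only, and not the detector trained in the paper itself, the sketch below scores a single frame against natural-language descriptions of success and failure with a frozen CLIP model [24], in the zero-shot spirit of [11] and [18]; the checkpoint name, prompt wording, placeholder frame, and threshold are assumptions made for the example.

```python
# Minimal sketch: zero-shot "success detection" with a frozen CLIP model [24].
# This is NOT the detector proposed in the paper; it only illustrates the
# general idea of asking a VLM whether a frame matches a success description.
# The checkpoint name, prompts, and threshold below are illustrative assumptions.

import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
model.eval()

def success_probability(frame: Image.Image, task: str) -> float:
    """Score how strongly `frame` matches a success description of `task`."""
    prompts = [
        f"a photo where the agent has successfully completed the task: {task}",
        f"a photo where the agent has failed the task: {task}",
    ]
    inputs = processor(text=prompts, images=frame, return_tensors="pt", padding=True)
    with torch.no_grad():
        logits = model(**inputs).logits_per_image  # shape (1, 2): one image vs. two prompts
    return logits.softmax(dim=-1)[0, 0].item()  # probability mass on the "success" prompt

if __name__ == "__main__":
    # Blank placeholder frame; in practice this would be the last frame of an episode.
    frame = Image.new("RGB", (224, 224), color="gray")
    p = success_probability(frame, "stack the red block on the blue block")
    print(f"success probability: {p:.3f}")  # e.g. threshold at 0.5 for a binary success signal
```

In practice such a score would be computed on the final frames of an episode and thresholded to produce a sparse success signal for evaluation or reward.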
[1] Jing Yu Koh, et al. Grounding Language Models to Images for Multimodal Generation, 2023, ArXiv.
[2] S. Savarese, et al. BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models, 2023, ArXiv.
[3] Jimmy Ba, et al. Mastering Diverse Domains through World Models, 2023, ArXiv.
[4] Tamara von Glehn, et al. Improving Multimodal Interactive Agents with Reinforcement Learning from Human Feedback, 2022, ArXiv.
[5] S. Savarese, et al. Plug-and-Play VQA: Zero-shot VQA by Conjoining Large Pretrained Models with Zero Training, 2022, EMNLP.
[6] Yecheng Jason Ma, et al. VIP: Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training, 2022, ICLR.
[7] Anima Anandkumar, et al. MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge, 2022, NeurIPS.
[8] Tamara von Glehn, et al. Evaluating Multimodal Interactive Agents, 2022, ArXiv.
[9] Sergio Gomez Colmenarejo, et al. A Generalist Agent, 2022, Trans. Mach. Learn. Res.
[10] Oriol Vinyals, et al. Flamingo: a Visual Language Model for Few-Shot Learning, 2022, NeurIPS.
[11] Abhinav Gupta, et al. Can Foundation Models Perform Zero-Shot Task Specification For Robot Manipulation?, 2022, L4DC.
[12] Tom B. Brown, et al. Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback, 2022, ArXiv.
[13] Jacob Menick, et al. Teaching language models to support answers with verified quotes, 2022, ArXiv.
[14] Qun Liu, et al. Enabling Multimodal Generation on CLIP via Vision-Language Knowledge Distillation, 2022, FINDINGS.
[15] Ryan J. Lowe, et al. Training language models to follow instructions with human feedback, 2022, NeurIPS.
[16] Xiaowei Hu, et al. Scaling Up Vision-Language Pretraining for Image Captioning, 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[17] James M. Rehg, et al. Ego4D: Around the World in 3,000 Hours of Egocentric Video, 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Trevor Darrell, et al. Zero-Shot Reward Specification via Grounded Natural Language, 2022, ICML.
[19] Jeff Wu, et al. WebGPT: Browser-assisted question-answering with human feedback, 2021, ArXiv.
[20] Tamara von Glehn, et al. Creating Multimodal Interactive Agents with Imitation and Self-Supervised Learning, 2021, ArXiv.
[21] Dario Amodei, et al. A General Language Assistant as a Laboratory for Alignment, 2021, ArXiv.
[22] Pieter Abbeel, et al. PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training, 2021, ICML.
[23] Chelsea Finn, et al. Learning Generalizable Robotic Reward Functions from "In-The-Wild" Human Videos, 2021, Robotics: Science and Systems.
[24] Ilya Sutskever, et al. Learning Transferable Visual Models From Natural Language Supervision, 2021, ICML.
[25] Quoc V. Le, et al. Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision, 2021, ICML.
[26] Felix Hill, et al. Imitating Interactive Intelligence, 2020, ArXiv.
[27] Ryan J. Lowe, et al. Learning to summarize from human feedback, 2020, NeurIPS.
[28] Yuval Tassa, et al. dm_control: Software and Tasks for Continuous Control, 2020, Softw. Impacts.
[29] Xilin Chen, et al. UniViLM: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation, 2020, ArXiv.
[30] S. Levine, et al. Learning Human Objectives by Evaluating Hypothetical Behavior, 2019, ICML.
[31] Oleg O. Sushkov, et al. Scaling data-driven robotics with reward sketching and batch reinforcement learning, 2019, Robotics: Science and Systems.
[32] Sergey Levine, et al. End-to-End Robotic Reinforcement Learning without Reward Engineering, 2019, Robotics: Science and Systems.
[33] Prabhat Nagarajan, et al. Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations, 2019, ICML.
[34] Shane Legg, et al. Reward learning from human preferences and demonstrations in Atari, 2018, NeurIPS.
[35] Nando de Freitas, et al. Reinforcement and Imitation Learning for Diverse Visuomotor Skills, 2018, Robotics: Science and Systems.
[36] Sergey Levine, et al. Learning Robust Rewards with Adversarial Inverse Reinforcement Learning, 2017, ICLR.
[37] Shie Mannor, et al. End-to-End Differentiable Adversarial Imitation Learning, 2017, ICML.
[38] Anca D. Dragan, et al. Active Preference-Based Learning of Reward Functions, 2017, Robotics: Science and Systems.
[39] Yuval Tassa, et al. Learning human behaviors from motion capture by adversarial imitation, 2017, ArXiv.
[40] Shane Legg, et al. Deep Reinforcement Learning from Human Preferences, 2017, NIPS.
[41] Stefano Ermon, et al. InfoGAIL: Interpretable Imitation Learning from Visual Demonstrations, 2017, NIPS.
[42] Guan Wang, et al. Interactive Learning from Policy-Dependent Human Feedback, 2017, ICML.
[43] Yash Goyal, et al. Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering, 2016, International Journal of Computer Vision.
[44] Stefano Ermon, et al. Generative Adversarial Imitation Learning, 2016, NIPS.
[45] Sergey Levine, et al. Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, 2016, ICML.
[46] Tao Mei, et al. MSR-VTT: A Large Video Description Dataset for Bridging Video and Language, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[47] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[48] Michèle Sebag, et al. Programming by Feedback, 2014, ICML.
[49] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, ArXiv.
[50] Michèle Sebag, et al. APRIL: Active Preference-learning based Reinforcement Learning, 2012, ECML/PKDD.
[51] P. Stone, et al. TAMER: Training an Agent Manually via Evaluative Reinforcement, 2008, 2008 7th IEEE International Conference on Development and Learning.
[52] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.
[53] Andrew Y. Ng, et al. Algorithms for Inverse Reinforcement Learning, 2000, ICML.
[54] R. A. Bradley, et al. Rank Analysis of Incomplete Block Designs: I. The Method of Paired Comparisons, 1952, Biometrika.