Paper Information - Learning Transferable Reward for Query Object Localization with Policy Adaptation

Learning Transferable Reward for Query Object Localization with Policy Adaptation

We propose a reinforcement learning based approach to query object localization, for which an agent is trained to localize objects of interest specified by a small exemplary set. We learn a transferable reward signal formulated using the exemplary set by ordinal metric learning. Our proposed method enables test-time policy adaptation to new environments where the reward signals are not readily available, and outperforms fine-tuning approaches that are limited to annotated images. In addition, the transferable reward allows repurposing the trained agent from one specific class to another class. Experiments on corrupted MNIST, CU-Birds, and COCO datasets demonstrate the effectiveness of our approach.

Dimitris N. Metaxas | Martin Renqiang Min | Shaobo Han | Tingfeng Li
