Open Scene Understanding: Grounded Situation Recognition Meets Segment Anything for Helping People with Visual Impairments
R. Stiefelhagen | R. Liu | Kailun Yang | Kunyu Peng | Jiaming Zhang | Yufan Chen | Ke Cao | Junwei Zheng
[1] Haiying Xia,et al. A dataset for the visually impaired walk on the road , 2023, Displays.
[2] Seungkyu Lee,et al. Faster Segment Anything: Towards Lightweight SAM for Mobile Applications , 2023, ArXiv.
[3] Tao Yu,et al. Fast Segment Anything , 2023, ArXiv.
[4] Ross B. Girshick,et al. Segment Anything , 2023, 2023 IEEE/CVF International Conference on Computer Vision (ICCV).
[5] Jun-Juan Zhu,et al. Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection , 2023, ECCV.
[6] Guoxin Li,et al. Sensing and Navigation of Wearable Assistance Cognitive Systems for the Visually Impaired , 2023, IEEE Transactions on Cognitive and Developmental Systems.
[7] R. Stiefelhagen,et al. MateRobot: Material Recognition in Wearable Robotics for People with Visual Impairments , 2023, ArXiv.
[8] Yan Zhang,et al. "I am the follower, also the boss": Exploring Different Levels of Autonomy and Machine Forms of Guiding Robots for the Visually Impaired , 2023, CHI.
[9] S. Savarese,et al. BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models , 2023, ICML.
[10] Yi Wang,et al. Learning Open-Vocabulary Semantic Segmentation Models From Natural Language Supervision , 2023, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[11] D. Gurari,et al. Salient Object Detection for Images Taken by People With Vision Impairments , 2023, ArXiv.
[12] Wujie Zhou,et al. MTANet: Multitask-Aware Network With Hierarchical Multimodal Fusion for RGB-T Urban Scene Understanding , 2023, IEEE Transactions on Intelligent Vehicles.
[13] Weidi Xie,et al. Open-vocabulary Semantic Segmentation with Frozen Vision-Language Models , 2022, BMVC.
[14] Bichen Wu,et al. Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP , 2022, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Kuan-Ching Li,et al. EOS: An efficient obstacle segmentation for blind guiding , 2023, Future Gener. Comput. Syst.
[16] Weidong Min,et al. Traffic Sign Recognition Based on Semantic Scene Understanding and Structural Traffic Sign Location , 2022, IEEE Transactions on Intelligent Transportation Systems.
[17] A. Hauptmann,et al. GSRFormer: Grounded Situation Recognition Transformer with Alternate Semantic Attention Refinement , 2022, ACM Multimedia.
[18] D. Gurari,et al. VizWiz-FewShot: Locating Objects in Images Taken by People With Visual Impairments , 2022, ECCV.
[19] Jianlong Fu,et al. TinyViT: Fast Pretraining Distillation for Small Vision Transformers , 2022, ECCV.
[20] Chia-Wen Lin,et al. Unsupervised Foggy Scene Understanding via Self Spatial-Temporal Label Diffusion , 2022, IEEE Transactions on Image Processing.
[21] Suha Kwak,et al. Collaborative Transformers for Grounded Situation Recognition , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[22] Miaojing Shi,et al. Learning to Prompt for Open-Vocabulary Object Detection with Vision-Language Model , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[23] J. A. D. Santos,et al. Conditional Reconstruction for Open-Set Semantic Segmentation , 2022, 2022 IEEE International Conference on Image Processing (ICIP).
[24] R. Stiefelhagen,et al. TransKD: Transformer Knowledge Distillation for Efficient Semantic Segmentation , 2022, IEEE Transactions on Intelligent Transportation Systems.
[25] Shalini De Mello,et al. GroupViT: Semantic Segmentation Emerges from Text Supervision , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[26] S. Hoi,et al. BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation , 2022, ICML.
[27] Svenja Uhlemeyer,et al. Towards Unsupervised Open World Semantic Segmentation , 2022, UAI.
[28] M. Kawanabe,et al. ScanQA: 3D Question Answering for Spatial Scene Understanding , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[29] Tat-Seng Chua,et al. Rethinking the Two-Stage Framework for Grounded Situation Recognition , 2021, AAAI.
[30] Dengxin Dai,et al. Both Style and Fog Matter: Cumulative Domain Adaptation for Semantic Foggy Scene Understanding , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[31] Suha Kwak,et al. Grounded Situation Recognition with Transformers , 2021, BMVC.
[32] Zhengcai Cao,et al. Rapid Detection of Blind Roads and Crosswalks by Using a Lightweight Semantic Segmentation Network , 2021, IEEE Transactions on Intelligent Transportation Systems.
[33] Chen Zhao,et al. A dataset for the recognition of obstacles on blind sidewalk , 2021, Universal Access in the Information Society.
[34] Alexander G. Schwing,et al. Per-Pixel Classification is Not All You Need for Semantic Segmentation , 2021, NeurIPS.
[35] Lu Zhang,et al. Dynamic Crosswalk Scene Understanding for the Visually Impaired , 2021, IEEE Transactions on Neural Systems and Rehabilitation Engineering.
[36] Rainer Stiefelhagen,et al. Trans4Trans: Efficient Transformer for Transparent Object Segmentation to Help Visually Impaired People Navigate in the Real World , 2021, 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW).
[37] Rainer Stiefelhagen,et al. HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor , 2021, 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW).
[38] Rainer Stiefelhagen,et al. MASS: Multi-Attentional Semantic Segmentation of LiDAR Data for Dense Top-View Understanding , 2021, IEEE Transactions on Intelligent Transportation Systems.
[39] Tien-Ying Kuo,et al. Egocentric-View Fingertip Detection for Air Writing Based on Convolutional Neural Networks , 2021, Sensors.
[40] Yun Liu,et al. P2T: Pyramid Pooling Transformer for Scene Understanding , 2021, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[41] Anima Anandkumar,et al. SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers , 2021, NeurIPS.
[42] Cordelia Schmid,et al. Segmenter: Transformer for Semantic Segmentation , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[43] Luc Van Gool,et al. ACDC: The Adverse Conditions Dataset with Correspondences for Semantic Driving Scene Understanding , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[44] Zhenhua Chai,et al. Rethinking BiSeNet For Real-time Semantic Segmentation , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[45] Didier Stricker,et al. A Comparison of Single and Multi-View IR image-based AR Glasses Pose Estimation Approaches , 2021, 2021 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW).
[46] Ilya Sutskever,et al. Learning Transferable Visual Models From Natural Language Supervision , 2021, ICML.
[47] Xiang Li,et al. Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[48] Tatsuo Arai,et al. A Wearable Navigation Device for Visually Impaired People Based on the Real-Time Semantic Visual SLAM System , 2021, Sensors.
[49] Mohammad Mahmudul Alam,et al. Unified learning approach for egocentric hand gesture recognition and fingertip detection , 2021, Pattern Recognit.
[50] Tao Xiang,et al. Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[51] I-Hsuan Hsieh,et al. Outdoor walking guide for the visually-impaired people based on semantic segmentation and depth map , 2020, 2020 International Conference on Pervasive Artificial Intelligence (ICPAI).
[52] S. Gelly,et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale , 2020, ICLR.
[53] Xuming He,et al. Shape-aware Semi-supervised 3D Semantic Segmentation for Medical Images , 2020, MICCAI.
[54] Mark Chen,et al. Language Models are Few-Shot Learners , 2020, NeurIPS.
[55] Nicolas Usunier,et al. End-to-End Object Detection with Transformers , 2020, ECCV.
[56] Ali Farhadi,et al. Grounded Situation Recognition , 2020, ECCV.
[57] Yingda Xia,et al. Synthesize then Compare: Detecting Failures and Anomalies for Semantic Segmentation , 2020, ECCV.
[58] Tian Sheuan Chang,et al. Semantic Segmentation of Intracranial Hemorrhages in Head CT Scans , 2019, 2019 IEEE 10th International Conference on Software Engineering and Service Science (ICSESS).
[59] Shiguo Lian,et al. Deep Learning Based Wearable Assistive System for Visually Impaired People , 2019, 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW).
[60] Alexander H. Liu,et al. Towards Scene Understanding: Unsupervised Monocular Depth Estimation With Semantic-Aware Representation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[61] Edward K. Wong,et al. Cross-Safe: A Computer Vision-Based Approach to Make All Intersection-Related Pedestrian Signals Accessible for the Visually Impaired , 2019, Advances in Intelligent Systems and Computing.
[62] Jiaya Jia,et al. Situation Recognition with Graph Neural Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[63] Stephen Gould,et al. SPICE: Semantic Propositional Image Caption Evaluation , 2016, ECCV.
[64] Kevin Gimpel,et al. Gaussian Error Linear Units (GELUs) , 2016, arXiv:1606.08415.
[65] Ali Farhadi,et al. Situation Recognition: Visual Semantic Role Labeling for Image Understanding , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[66] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[67] Patrick Pérez,et al. The Semantic Paintbrush: Interactive 3D Mapping and Recognition in Large Outdoor Spaces , 2015, CHI.
[68] C. Lawrence Zitnick,et al. CIDEr: Consensus-based image description evaluation , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[69] Geoffrey E. Hinton,et al. Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.
[70] Fei-Fei Li,et al. ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[71] D. Kahneman. Maps of Bounded Rationality: Psychology for Behavioral Economics , 2003.
[72] John B. Lowe,et al. The Berkeley FrameNet Project , 1998, ACL.
[73] Wenbin Zou,et al. Real-Time Passable Area Segmentation With Consumer RGB-D Cameras for the Visually Impaired , 2023, IEEE Transactions on Instrumentation and Measurement.
[74] Stephen Lin,et al. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[75] Kuan-Wen Chen,et al. V-Eye: A Vision-Based Navigation System for the Visually Impaired , 2021, IEEE Transactions on Multimedia.