Paper information - When and why vision-language models behave like bags-of-words, and what to do about it? - 字舞流文

When and why vision-language models behave like bags-of-words, and what to do about it?

James Y. Zou | Dan Jurafsky | Federico Bianchi | Pratyusha Kalluri | Mert Yuksekgonul