Visually grounded models of spoken language: A survey of datasets, architectures and evaluation techniques

Grzegorz Chrupała