Dual-path Convolutional Image-Text Embeddings with Instance Loss

Matching images and sentences demands a fine understanding of both modalities. In this article, we propose a new system to discriminatively embed the image and text to a shared visual-textual space...