Learning multi-view visual correspondences with self-supervision