Deep metric learning for visual servoing: when pose and image meet in latent space