Improving Vision-Language Cross-Lingual Transfer with Scheduled Unfreezing