Feature Adaptation of Pre-Trained Language Models across Languages and Domains with Robust Self-Training