Cross-Lingual Text Classification with Multilingual Distillation and Zero-Shot-Aware Training