Feature Aggregation in Zero-Shot Cross-Lingual Transfer Using Multilingual BERT