Universal-KD: Attention-based Output-Grounded Intermediate Layer Knowledge Distillation