Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning