Attribute Structured Knowledge Distillation

Knowledge distillation aims to transfer knowledge from a cumbersome teacher network to a compact student network. Most previous knowledge distillation methods focus on mimicking the teacher's output for each instance or on transferring inter-instance relations from the teacher to the student, while neglecting structured relations among attributes within an instance. In this paper, we propose Attribute Structured Knowledge Distillation (ASKD), a novel method that transfers attribute-level structured relations. It models two types of structured relations, inter-region structure and inter-class structure, which capture informative knowledge from the teacher model. The inter-region structure captures relations among local features, while the inter-class structure captures relations across classes. Transferring such attribute-level structure from the teacher to the student directs the student toward the more informative knowledge of the teacher. Extensive experiments on public datasets, including CIFAR-100 and TinyImageNet, demonstrate the superiority of our method over state-of-the-art methods for both similar-architecture and different-architecture teacher-student pairs.
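The abstract describes the two structures only at a high level. As a rough illustration rather than the paper's actual formulation, the sketch below shows one plausible way to build inter-region and inter-class relation matrices and match them between teacher and student in PyTorch; the function names, the cosine-style similarities, the assumption that teacher and student feature maps share spatial size, and the equal loss weighting are all assumptions made here for clarity.

```python
import torch
import torch.nn.functional as F

def inter_region_structure(feat):
    # feat: (B, C, H, W) feature map; treat each spatial position as a region
    B, C, H, W = feat.shape
    regions = feat.view(B, C, H * W)                    # (B, C, R)
    regions = F.normalize(regions, dim=1)               # unit-norm each region vector
    # pairwise cosine similarity between regions: (B, R, R)
    return torch.bmm(regions.transpose(1, 2), regions)

def inter_class_structure(logits):
    # logits: (B, K) class scores; relate classes through the batch dimension
    probs = F.softmax(logits, dim=1)                     # (B, K)
    probs = F.normalize(probs, dim=0)                    # unit-norm each class column
    return probs.t() @ probs                             # (K, K) cross-class correlation

def askd_loss(feat_s, feat_t, logits_s, logits_t):
    # Match both structure matrices between student and teacher.
    # Region matrices are (B, R, R), so differing channel counts are fine,
    # but spatial sizes of feat_s and feat_t are assumed equal here.
    loss_region = F.mse_loss(inter_region_structure(feat_s),
                             inter_region_structure(feat_t).detach())
    loss_class = F.mse_loss(inter_class_structure(logits_s),
                            inter_class_structure(logits_t).detach())
    return loss_region + loss_class
```

In practice such a loss would typically be added to the standard cross-entropy (and possibly logit-distillation) objective with tunable weights; the weighting and the choice of feature layers are left unspecified in this sketch.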