Experiment I: category prediction In this experiment, we randomly select 2K samples from 7 categories (boat, bus, f1car, tank, train, ufo, and van) and feed them to a pretrained CNN, specifically AlexNet. We then extract the fc7 and pool5 representations of the selected samples and use the t-SNE algorithm to reduce their dimensionality to 2D. In addition, 20K images are randomly selected from all 7 categories, and the network is fine-tuned on this data for object categorization; the same feature-extraction and t-SNE procedure is then applied to the fine-tuned (FT) network. Fig. 1 depicts the results. The results in Fig. 1 show that the fc7 representation discriminates object-level categories remarkably well: the categories become mutually linearly separable after fine-tuning. In contrast, the pool5 representation carries little discriminative information between object categories compared to fc7. This result is in alignment with Bakry et al. [1]. Fig. 1 also demonstrates the effect of fine-tuning on the feature spaces: the distributions of samples from different categories become very compact and concentrated after fine-tuning. Notice, however, that fine-tuning does not add discriminative power to the pool5 representation.
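
The following is a minimal sketch of the feature-extraction step described above, assuming a torchvision AlexNet and scikit-learn's t-SNE (the original experiments may have used a different framework); the layer indices map pool5 to the end of `model.features` and fc7 to the second fully connected layer in `model.classifier`.

```python
import torch
import torchvision.models as models
from sklearn.manifold import TSNE

# Assumed: torchvision's pretrained AlexNet; the paper's model is analogous,
# with model.features ending at pool5 and the second FC layer giving fc7.
model = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
model.eval()

def extract_features(images):
    """Return (pool5, fc7) activations for a batch of preprocessed images."""
    with torch.no_grad():
        pool5 = model.features(images)                 # N x 256 x 6 x 6
        flat = torch.flatten(model.avgpool(pool5), 1)  # N x 9216
        fc7 = model.classifier[:5](flat)               # N x 4096 (2nd FC layer)
    return pool5.flatten(1), fc7

# images: a tensor of the 2K preprocessed samples (preparation not shown)
# pool5_feats, fc7_feats = extract_features(images)
# fc7_2d = TSNE(n_components=2, perplexity=30).fit_transform(fc7_feats.numpy())

# For the fine-tuning step, the final classifier layer would be replaced to
# output the 7 categories before training on the 20K images, e.g.:
# model.classifier[6] = torch.nn.Linear(4096, 7)
```

The same `extract_features` call can be reused on the fine-tuned network to produce the FT embeddings compared in Fig. 1.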