Learning to Remember Beauty Products

This paper develops a deep learning model for the beauty product image retrieval problem. The proposed model has two main components: an encoder and a memory. The encoder extracts features from a deep convolutional neural network at multiple scales and aggregates them into feature embeddings. Using an attention mechanism and a data augmentation method, it learns to focus on foreground objects and ignore the background in images, so that it can extract more relevant features. The memory holds representative states of all database images, and it can be updated during training. Based on the memory, we introduce a distance loss that regularizes the embedding vectors from the encoder to be more discriminative. Our model is fully end-to-end and requires no manual feature aggregation or post-processing. Experimental results on the Perfect-500K dataset demonstrate the effectiveness of the proposed model, which achieves high retrieval accuracy.
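The abstract does not give the form of the memory or of the distance loss, so the following is only a minimal NumPy sketch of one way such a component could look, not the authors' formulation. It assumes one memory slot per database image, a margin-based loss that pulls each embedding toward its own slot and away from the nearest other slot, and a moving-average memory update. The names `memory_distance_loss` and `update_memory` and the `margin` and `momentum` parameters are hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row of x to (approximately) unit L2 norm."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def memory_distance_loss(emb, memory, idx, margin=0.2):
    """Hypothetical margin loss against a memory of database-image states.

    emb:    (B, D) encoder embeddings for a batch
    memory: (N, D) one unit-norm slot per database image (assumption)
    idx:    (B,)   database index of each batch item
    """
    emb = l2_normalize(emb)
    # Squared Euclidean distance from every embedding to every slot: (B, N).
    dist = ((emb[:, None, :] - memory[None, :, :]) ** 2).sum(axis=-1)
    rows = np.arange(len(idx))
    pos = dist[rows, idx]              # distance to the item's own slot
    dist_neg = dist.copy()
    dist_neg[rows, idx] = np.inf       # exclude the own slot
    neg = dist_neg.min(axis=1)         # distance to the nearest other slot
    # Hinge: own slot should be closer than any other slot by `margin`.
    return np.maximum(0.0, pos - neg + margin).mean()

def update_memory(memory, emb, idx, momentum=0.9):
    """Moving-average update of the touched slots (assumed update rule).

    Assumes the indices in idx are unique within the batch.
    """
    emb = l2_normalize(emb)
    new_memory = memory.copy()
    blended = momentum * memory[idx] + (1.0 - momentum) * emb
    new_memory[idx] = l2_normalize(blended)
    return new_memory

# Toy usage with random data standing in for encoder outputs.
rng = np.random.default_rng(0)
memory = l2_normalize(rng.normal(size=(5, 8)))   # 5 database images, D = 8
emb = rng.normal(size=(2, 8))                    # batch of 2 embeddings
idx = np.array([1, 3])                           # their database indices
loss = memory_distance_loss(emb, memory, idx)
memory = update_memory(memory, emb, idx)
```

In an actual training loop the loss would be computed in an autodiff framework so that its gradient reaches the encoder, and the memory update would run outside the gradient path; this sketch only illustrates the arithmetic.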
