Automated cleaning of identity label noise in a large face dataset with quality control
For face recognition, several very large-scale datasets have become publicly available in recent years. They are usually collected from the Internet using search engines and therefore contain many faces with wrong identity (ID) labels (outliers). Additionally, the face images in these datasets vary in quality because they were captured in uncontrolled conditions. The authors propose a novel approach for cleaning ID label errors that handles face images of different qualities. The face ID labels cleaned by their method can train better models for low-quality face recognition, since more low-quality images are correctly labelled for training a deep model. In their low-to-high-quality face verification experiments, the deep model trained on their cleaned version of the MS-Celeb-1M.v1 face dataset outperforms the same model trained on the same dataset cleaned by the semantic bootstrapping method. They also apply their ID label cleaning method to a subset of the cross-age celebrity dataset (CACD), where their quality-based cleaning delivers higher precision and recall than a previous method in detecting ID label errors.
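The precision and recall figures mentioned above use the standard definitions for a detection task: precision is the fraction of flagged images that really are mislabelled, and recall is the fraction of truly mislabelled images that were flagged. The following is only a minimal sketch of that computation, not the authors' evaluation code; the function name `precision_recall` and the image identifiers are hypothetical and chosen purely for illustration.

```python
def precision_recall(flagged, true_errors):
    """Precision and recall of a label-error detector.

    flagged:     IDs of images the cleaning method marked as mislabelled
    true_errors: IDs of images that are actually mislabelled (ground truth)
    """
    flagged = set(flagged)
    true_errors = set(true_errors)
    true_positives = len(flagged & true_errors)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(true_errors) if true_errors else 0.0
    return precision, recall


# Hypothetical toy data: 4 images flagged, 3 images truly mislabelled, 2 in common.
flagged = {"img_03", "img_07", "img_11", "img_20"}
true_errors = {"img_03", "img_07", "img_15"}

p, r = precision_recall(flagged, true_errors)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.50 recall=0.67
```

In this toy case, two of the four flagged images are genuine label errors (precision 0.50), and two of the three genuine errors were found (recall 0.67).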