Robust Principal Component Analysis Based on Discriminant Information

Recently, several robust principal component analysis (RPCA) models have been proposed to enhance the robustness of PCA by using robust norms as their loss functions. An important limitation of these models, however, is that they cannot discriminate outliers from correct samples. To solve this problem, we propose an RPCA method based on discriminant information (RPCA-DI). RPCA-DI decomposes robust PCA into two steps: the identification of outliers and the processing of outliers. To identify outliers, a sample representation model based on entropy regularization is constructed to analyze the membership of each sample in the principal component space (PC) and in its orthogonal complement (OC). The discriminative information of a sample is then extracted by measuring the difference between the information it retains on the PC and on the OC. In this way, correct samples can be distinguished from outliers when outliers are handled, which is more reasonable for robust learning than previous approaches. In the noise processing step, in addition to the level of noise, the resistance of each sample point to noise is also considered to prevent overfitting, thereby improving the generalization performance of RPCA-DI. Finally, an iterative algorithm is designed to solve the corresponding model. Comparisons with several state-of-the-art RPCA methods on artificial datasets, UCI datasets, and face databases verify the effectiveness of RPCA-DI.
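To make the idea of entropy-regularized membership concrete, the following is a minimal sketch under stated assumptions, not the RPCA-DI model itself. It assumes a single, already estimated unit direction `w` spanning the PC, and it assumes the membership `p` of a sample minimizes `p * residual + (1 - p) * retained + gamma * (p ln p + (1 - p) ln(1 - p))`, whose closed-form solution is a logistic function of the energy gap. The function name `pc_membership` and the parameter `gamma` are hypothetical and chosen for illustration only.

```python
import math


def pc_membership(x, w, gamma=1.0):
    """Soft membership of sample x in the span of unit vector w.

    Hypothetical helper: an illustrative entropy-regularized assignment,
    not the model proposed in the paper.
    """
    # Energy of x retained on the principal direction (the PC).
    proj = sum(xi * wi for xi, wi in zip(x, w))
    retained = proj * proj
    # Energy of x left in the orthogonal complement (the OC).
    total = sum(xi * xi for xi in x)
    residual = total - retained
    # Closed-form minimizer of the assumed objective:
    # p = sigmoid((retained - residual) / gamma).
    z = (retained - residual) / gamma
    # Numerically stable logistic function for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


if __name__ == "__main__":
    w = (1.0, 0.0)  # assumed principal direction
    for sample in [(3.0, 0.1), (1.0, 1.0), (0.2, 4.0)]:
        print(sample, pc_membership(sample, w))
```

Under these assumptions, a sample lying mostly along `w` receives a membership close to 1, a sample lying mostly in the orthogonal complement receives a membership close to 0, and a larger `gamma` pushes all memberships toward 0.5.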