Algorithm for Bi-directional Reduction of Feature Data Based on Principal Component Analysis and Immune Clustering

This paper presents and implements an effective data analysis approach that compresses data in two directions: it reduces the dimension of the data by removing the correlation among variables, and it removes duplicated or nearly identical samples. In the first step of the algorithm, principal component analysis (PCA) is used to reduce the dimension of the data. Next, a modified immune clustering method, inspired by the clonal selection operation and the immune network hypothesis of the vertebrate immune system, removes unrepresentative samples; it follows the methodology of clustering but differs from standard clustering approaches. A novel modification to the original algorithm is the redefinition of affinity based on a similarity measurement of the principal component core. Together with other refinements, such as normalizing the antigen data and directly deleting duplicated data, these improvements make the new algorithm more efficient and more widely applicable than the original one. Simulation experiments on data from the Tennessee Eastman plant, which are widely used in the process control field, demonstrate the effectiveness of the algorithm.
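The two directions of reduction described above can be illustrated with a minimal Python sketch. It is not the modified immune clustering algorithm itself: the clonal selection and network dynamics are replaced by a simple greedy suppression step, and affinity is assumed to be Euclidean distance in principal component score space. The function names (pca_scores, suppress_similar, reduce_both_ways) and the parameters var_kept and threshold are hypothetical and chosen only for illustration.

```python
import numpy as np


def pca_scores(X, var_kept=0.95):
    """Project standardized rows of X onto the leading principal components."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # guard against constant columns
    Z = (X - X.mean(axis=0)) / std  # normalization step (assumed: z-score)
    # SVD of the standardized data; rows of Vt are the principal directions
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    # smallest number of components whose cumulative variance reaches var_kept
    k = int(np.searchsorted(np.cumsum(explained), var_kept) + 1)
    k = min(k, len(s))
    return Z @ Vt[:k].T


def suppress_similar(scores, threshold):
    """Greedy stand-in for network suppression (an assumption, not the
    paper's method): keep a sample only if it lies farther than
    `threshold` from every sample kept so far."""
    kept = []
    for i, row in enumerate(scores):
        if all(np.linalg.norm(row - scores[j]) > threshold for j in kept):
            kept.append(i)
    return np.array(kept, dtype=int)


def reduce_both_ways(X, var_kept=0.95, threshold=0.5):
    """Reduce columns with PCA, then reduce rows by removing exact and
    near duplicates in principal component score space."""
    X = np.unique(np.asarray(X, dtype=float), axis=0)  # drop exact duplicates
    scores = pca_scores(X, var_kept)
    kept = suppress_similar(scores, threshold)
    return scores[kept]
```

In this sketch, var_kept controls how many principal components survive (column reduction) and threshold controls how aggressively similar samples are discarded (row reduction). Both values would need to be tuned for a real data set.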