A MATLAB toolbox for Principal Component Analysis and unsupervised exploration of data structure

Abstract

Principal Component Analysis is a multivariate method that projects data onto a reduced hyperspace defined by orthogonal principal components, which are linear combinations of the original variables. In this way, the dimensionality of the data can be reduced and noise can be excluded from subsequent analysis, which greatly facilitates data interpretation. For these reasons, Principal Component Analysis is now the most common chemometric strategy for unsupervised exploratory data analysis. This paper describes the PCA toolbox for MATLAB, a collection of modules for performing Principal Component Analysis, as well as Cluster Analysis and Multidimensional Scaling, two other well-known multivariate methods for unsupervised data exploration. The toolbox is freely available on the Internet and includes a graphical user interface (GUI), which allows calculations to be carried out in an easy-to-use graphical environment. It aims to be useful for both beginners and advanced users. The use of the toolbox is illustrated with a practical example.
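
For readers who prefer a formula, the projection summarised above can be written in the usual scores-and-loadings form. The notation is assumed here for illustration and is not taken from the toolbox: X is the n × p data matrix, assumed to be column-centred (and optionally scaled), T is the n × k matrix of scores, P is the p × k matrix of loadings with orthonormal columns, and E is the n × p matrix of residuals.

```latex
\mathbf{X} = \mathbf{T}\mathbf{P}^{\mathsf{T}} + \mathbf{E},
\qquad
\mathbf{P}^{\mathsf{T}}\mathbf{P} = \mathbf{I}_k,
\qquad
\mathbf{T} = \mathbf{X}\mathbf{P}
```

Each column of T is a linear combination of the original variables, with weights given by the corresponding column of P. Retaining k < p components reduces the dimensionality of the data, and the variation that is left out (ideally, mostly noise) is collected in E.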