Feature Extraction and Selection

The computational cost of a classification algorithm can be kept low by reducing the number of features it considers. We can either select the most informative features or extract a new, smaller set of features, each a (linear) combination of the original ones. Principal component analysis (PCA) implements the second option: it diagonalizes the covariance matrix of the data to find the directions in feature space along which the variance is greatest. Linear discriminant analysis (LDA) also reduces the dimensionality of a problem, but it uses the class labels to find the directions that best separate the classes.
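
The contrast between the two methods can be illustrated with a minimal NumPy sketch. The function names pca_directions and fisher_lda_direction, the synthetic two-class data, and the restriction of LDA to the two-class Fisher discriminant are all assumptions made for this example, not part of the text above. The data are built so that the direction of greatest variance (the first axis) is not the direction that separates the classes (the second axis).

```python
import numpy as np


def pca_directions(X, n_components):
    """Top principal directions (as columns) and the variance along each.

    Hypothetical helper: diagonalizes the sample covariance matrix.
    """
    Xc = X - X.mean(axis=0)                  # center the data
    cov = np.cov(Xc, rowvar=False)           # features are columns
    eigvals, eigvecs = np.linalg.eigh(cov)   # symmetric matrix, ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order], eigvals[order]


def fisher_lda_direction(X, y):
    """Two-class Fisher discriminant direction, w proportional to Sw^{-1} (m1 - m0).

    Hypothetical helper: assumes labels are 0 and 1.
    """
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter: summed scatter of each class around its own mean.
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    w = np.linalg.solve(Sw, m1 - m0)
    return w / np.linalg.norm(w)


# Synthetic data (assumed for illustration): both classes are widely spread
# along the first feature, but their means differ only in the second feature.
rng = np.random.default_rng(0)
n = 500
X0 = rng.normal(size=(n, 2)) * [3.0, 0.5] + [0.0, -1.0]
X1 = rng.normal(size=(n, 2)) * [3.0, 0.5] + [0.0, 1.0]
X = np.vstack([X0, X1])
y = np.concatenate([np.zeros(n, dtype=int), np.ones(n, dtype=int)])

W, variances = pca_directions(X, n_components=1)
pca_dir = W[:, 0]
lda_dir = fisher_lda_direction(X, y)

print("PCA direction:", pca_dir)   # dominated by the first feature
print("LDA direction:", lda_dir)   # dominated by the second feature

# Project onto a single extracted feature in each case.
z_pca = (X - X.mean(axis=0)) @ pca_dir
z_lda = X @ lda_dir
```

In this construction the PCA projection z_pca keeps most of the variance but mixes the two classes, while the LDA projection z_lda keeps less variance but leaves the classes well separated. The sign of an eigenvector returned by the eigendecomposition is arbitrary, so only the orientation of the PCA direction is meaningful.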