We develop a transform-invariant PCA (TIPCA) technique which aims to accurately characterize the intrinsic structures of the human face that are invariant to the in-plane transformations of the training images. Specially, TIPCA alternately aligns the image ensemble and creates the optimal eigenspace, with the objective to minimize the mean square error between the aligned images and their reconstructions. The learning from the FERET facial image ensemble of 1,196 subjects validates the mutual promotion between image alignment and eigenspace representation, which eventually leads to the optimized coding and recognition performance that surpasses the handcrafted alignment based on facial landmarks. Experimental results also suggest that state-of-the-art invariant descriptors, such as local binary pattern (LBP), histogram of oriented gradient (HOG), and Gabor energy filter (GEF), and classification methods, such as sparse representation based classification (SRC) and support vector machine (SVM), can benefit from using the TIPCA-aligned faces, instead of the manually eye-aligned faces that are widely regarded as the ground-truth alignment. Favorable accuracies against the state-of-the-art results on face coding and face recognition are reported.