Normalization procedures and factorial representations for classification of correlation-aligned images: a comparative study.

We have addressed the problem of optimizing multivariate statistical analysis (MSA) procedures for identifying homogeneous sets of electron micrographs of biological macromolecules, with a view to averaging over consistent sets of images. Using pre-aligned images of negatively stained protein molecules, known a priori to fall into two subtly different classes, we compared how the capacity to discriminate between the classes was affected by the normalization procedure and by the choice of factorial representation. Specifically, the images were analyzed both after being scaled to constant minimum and maximum (CMM) values and after being given constant image mean and variance (CMV). The factorial representations compared were correspondence analysis (CA) and the principal components (PC) formalism. When used with PC, CMM normalization gave rise to spurious inter-image fluctuations that were more pronounced than the genuine difference between the two kinds of images; even with CA, CMV proved to be the more satisfactory method of normalization. With CMV normalization, CA and PC yielded qualitatively similar results, although PC performed slightly better according to a quantitative measure of inter-set discrimination. Even in the best case, however, the two classes of images, as mapped in factorial space, were not fully resolved. The implications of this observation are discussed with regard to potential ambiguities of image classification in practice.
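The two normalization schemes compared above can be illustrated with a minimal sketch. It assumes the images are held as a NumPy array of shape (n_images, height, width) and that no image is constant (a constant image would cause a division by zero). The function names, the target values (0 to 1 for CMM, mean 0 and variance 1 for CMV), and the SVD-based principal-components step are illustrative assumptions, not the procedure used in the study; correspondence analysis is not shown.

```python
import numpy as np


def normalize_cmm(images, lo=0.0, hi=1.0):
    """CMM sketch: rescale each image so its minimum is lo and its maximum is hi."""
    x = np.asarray(images, dtype=float)
    flat = x.reshape(len(x), -1)
    mn = flat.min(axis=1, keepdims=True)
    mx = flat.max(axis=1, keepdims=True)
    # Assumes mx > mn for every image (no constant images).
    out = lo + (flat - mn) * (hi - lo) / (mx - mn)
    return out.reshape(x.shape)


def normalize_cmv(images, mean=0.0, var=1.0):
    """CMV sketch: shift and scale each image to a fixed mean and variance."""
    x = np.asarray(images, dtype=float)
    flat = x.reshape(len(x), -1)
    mu = flat.mean(axis=1, keepdims=True)
    sd = flat.std(axis=1, keepdims=True)
    # Assumes sd > 0 for every image (no constant images).
    out = mean + (flat - mu) * np.sqrt(var) / sd
    return out.reshape(x.shape)


def pc_coordinates(images, n_factors=2):
    """Project images onto the leading principal components (hypothetical helper)."""
    flat = np.asarray(images, dtype=float).reshape(len(images), -1)
    # Center each pixel over the image set, then take the SVD.
    centered = flat - flat.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    # Rows are images; columns are coordinates on the leading factors.
    return u[:, :n_factors] * s[:n_factors]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    stack = rng.normal(size=(6, 8, 8))  # synthetic stand-in for aligned images
    coords = pc_coordinates(normalize_cmv(stack))
    print(coords.shape)
```

Because CMM fixes each image's scale from only its two extreme pixels, a single noisy pixel can change the scaling of a whole image, whereas CMV uses every pixel. This is one plausible reading of why CMM produced spurious inter-image fluctuations, offered here as an interpretation and not as a finding stated in the abstract.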
