Factor Analysis and Ordinal Data

Researchers in the behavioral sciences have for some time used the related procedures of principal components and factor analysis in an attempt to validate tests or other measurement systems. Many articles have been written recommending when to use principal components, principal axis factoring, maximum-likelihood estimation, etc. In addition, the data analyst has learned to choose among recommendations concerning the use of rotations to achieve simple structure (varimax, oblique, promax, etc.). However, until recently, researchers in many fields have not known that they should also attend to the type of correlation/covariance matrix analyzed. The near-standard use of a Pearson correlation matrix is no doubt due in part to the availability of computer programs that use a Pearson matrix by default. Although the literature has for some time suggested that it is incorrect to treat nominal and ordinal data as interval or ratio (Anderson, 1961; Armstrong, 1981; Stevens, 1946, 1951), researchers are apparently failing to heed the warnings when computing correlations. Several excellent articles have been published in recent years which explain the scale problem in great detail and suggest alternative procedures (Gaito, 1980; Marcus-Roberts & Roberts, 1987; Mislevy, 1986; Muthen, 1983, 1984, 1988; Muthen & Kaplan, 1985). Yet, if a failure to mention the type of matrix analyzed is any indication, it remains common practice to factor analyze a Pearson correlation matrix when the investigator is attempting to establish the validity of a Likert scale. The purpose of this paper is to present examples demonstrating the results of different approaches to model specification rather than to consider the scaling problem in detail.
The scaling problems have been discussed elsewhere (Joreskog & Sorbom, 1986, 1988; Marcus-Roberts et al., 1987; Mislevy, 1986; Muthen, 1983, 1984, 1988; Muthen et al., 1985) and the interested reader is referred to these excellent articles for complete detail. It does need to be mentioned here that ordinal variables do not have a metric scale, and that correlations computed from them tend to be attenuated because of the severe restriction on range. However, an ordinal variable z can be thought of as a crude representation of an unobserved continuous variable z*. The correlation between the unobserved variables z*_1 and z*_2 that underlie two ordinal variables, estimated from the observed ordinal data, is known as the polychoric correlation coefficient; it is an estimate of the unobserved relationship between the two variables. The Monte Carlo studies of Joreskog and Sorbom (1986) and data presented by Muthen and Kaplan (1985) suggest that polychoric correlations should be the coefficients of choice when deciding which type of matrix to analyze. According to their work, it would be just as inappropriate to analyze a Pearson matrix computed from ordinal data as to use a correlated-samples t-test when the sampling design dictates an independent-samples test. In addition, the excellent article by Muthen and Kaplan gives the data analyst guidance when dealing with non-normal Likert variables. Their evidence suggests that in the presence of strong skewness and/or kurtosis, as is often the case with Likert items, it may be more appropriate to treat the observed responses as coarse categorizations of underlying z* variables. Joreskog and Sorbom (1988) have presented the results of confirmatory factor analyses of data collected on Swedish school children, computed under four different conditions: (a) normal theory (GLS) of a Pearson correlation matrix, (b) normal theory (GLS) of a polychoric matrix, (c) non-normal theory using a Pearson matrix, and (d) non-normal theory using a polychoric matrix. They indicated that only weighted least squares with polychoric correlations returns asymptotically correct results.
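The attenuation argument and the z* interpretation described above can be illustrated with a small simulation. The Python sketch below is not taken from any of the cited studies; it is a minimal illustration under stated assumptions. It draws two continuous variables with a known correlation, coarsens them into four ordered categories with skewed cut points, and compares the Pearson coefficient of the ordinal scores with a two-step polychoric estimate (thresholds from the marginal proportions, then a one-parameter maximum-likelihood search over the correlation). The function names (pearson, bvn_cdf, thresholds, golden_max, polychoric, demo), the sample size, the true correlation of 0.6, the cut points, and the search interval are all hypothetical choices made for this illustration, and the golden-section search assumes the log-likelihood has a single peak on that interval.

```python
import math
import random
from collections import Counter
from statistics import NormalDist

PHI = NormalDist()   # standard normal distribution
BIG = 10.0           # finite stand-in for the +/- infinity outer thresholds (assumption)

def pearson(x, y):
    """Ordinary product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def bvn_cdf(a, b, rho, steps=64):
    """P(X <= a, Y <= b) for a standard bivariate normal, via the identity
    Phi2(a, b; rho) = Phi(a) * Phi(b) + integral from 0 to rho of the
    bivariate density at (a, b), evaluated with Simpson's rule."""
    def dens(r):
        q = 1.0 - r * r
        return math.exp(-(a * a - 2 * r * a * b + b * b) / (2 * q)) / (2 * math.pi * math.sqrt(q))
    w = rho / steps
    total = dens(0.0) + dens(rho)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * dens(i * w)
    return PHI.cdf(a) * PHI.cdf(b) + total * w / 3

def thresholds(z):
    """Step 1: estimate thresholds from cumulative marginal proportions."""
    counts = Counter(z)
    cats = sorted(counts)
    cum, taus = 0, [-BIG]
    for c in cats[:-1]:
        cum += counts[c]
        taus.append(PHI.inv_cdf(cum / len(z)))
    taus.append(BIG)
    return cats, taus

def golden_max(f, lo, hi, tol=1e-4):
    """Golden-section search for the maximizer of a single-peaked function."""
    g = (math.sqrt(5) - 1) / 2
    c, d = hi - g * (hi - lo), lo + g * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc > fd:
            hi, d, fd = d, c, fc
            c = hi - g * (hi - lo)
            fc = f(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + g * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2

def polychoric(z1, z2):
    """Step 2: maximize the multinomial log-likelihood over rho, thresholds fixed."""
    cats1, t1 = thresholds(z1)
    cats2, t2 = thresholds(z2)
    table = Counter(zip(z1, z2))
    def loglik(rho):
        F = [[bvn_cdf(a, b, rho) for b in t2] for a in t1]
        ll = 0.0
        for i, c1 in enumerate(cats1):
            for j, c2 in enumerate(cats2):
                n = table.get((c1, c2), 0)
                if n:
                    # probability of the rectangle bounded by adjacent thresholds
                    p = F[i + 1][j + 1] - F[i][j + 1] - F[i + 1][j] + F[i][j]
                    ll += n * math.log(max(p, 1e-12))
        return ll
    return golden_max(loglik, -0.95, 0.95)

def demo(n=5000, rho=0.6, cuts=(0.5, 1.0, 1.5), seed=1):
    """Simulate z* pairs, coarsen them to ordinal z, and compare the coefficients."""
    rng = random.Random(seed)
    zs1, zs2 = [], []
    for _ in range(n):
        u, v = rng.gauss(0, 1), rng.gauss(0, 1)
        zs1.append(u)
        zs2.append(rho * u + math.sqrt(1 - rho * rho) * v)
    z1 = [sum(x > c for c in cuts) for x in zs1]   # category = number of cuts exceeded
    z2 = [sum(x > c for c in cuts) for x in zs2]
    return pearson(zs1, zs2), pearson(z1, z2), polychoric(z1, z2)

if __name__ == "__main__":
    r_cont, r_ord, r_poly = demo()
    print(f"Pearson on z*: {r_cont:.3f}  Pearson on ordinal z: {r_ord:.3f}  polychoric: {r_poly:.3f}")
```

In a run of this kind the Pearson coefficient of the coarsened scores should fall noticeably below the correlation of the underlying variables, while the polychoric estimate should lie close to it. The sketch handles a single pair of items only; software used for the analyses discussed in this paper estimates a full polychoric matrix together with its asymptotic covariance matrix, which this illustration does not attempt.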
Weighted least squares using product-moment correlations based on normal scores is biased, while the standard errors of the GLS estimates are wrong because the formula is incorrect. For the illustrations in this paper we have chosen to analyze the data matrix presented by Joreskog et al. …