Assessment of Reliability when Test Items are not Essentially τ-Equivalent

Reliability estimation was a major topic in 20th-century psychometrics, so it is surprising that reliability analysis in practice is usually limited to computing coefficient α and retest coefficients. It is well known that coefficient α is an accurate measure of reliability only if the test items are essentially τ-equivalent; otherwise, it is a lower bound for reliability. This paper describes some alternative methods that do not require such strict assumptions. Probably the most interesting among them are Jöreskog’s maximum likelihood (ML) analysis of congeneric measures and Jackson and Agunwamba’s greatest lower bound for reliability. The strengths and weaknesses of these methods, and their potential for use in psychometric practice, are critically discussed. The procedures and their properties are illustrated on several simulated and real data sets (the Big Five Questionnaire standardisation and a national final high-school examination). The results show how adopting an incorrect measurement model can cause severe underestimation of the reliability coefficient.
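
The lower-bound property of coefficient α can be checked numerically. The following is a minimal sketch, not one of the procedures analysed in the paper: it assumes a one-factor model with uncorrelated errors and works directly with population covariances rather than sample data. The function names, loadings, and error variances are hypothetical values chosen only for illustration.

```python
# Illustrative sketch: population-level comparison of coefficient alpha with
# the reliability of the unweighted sum score under a one-factor model.
# All names and numeric values below are assumptions made for this example.

def congeneric_cov(loadings, error_vars):
    """Covariance matrix implied by a one-factor model with unit factor
    variance: sigma_ij = l_i * l_j, plus the error variance on the diagonal."""
    k = len(loadings)
    return [
        [loadings[i] * loadings[j] + (error_vars[i] if i == j else 0.0)
         for j in range(k)]
        for i in range(k)
    ]

def cronbach_alpha(cov):
    """Coefficient alpha computed from an item covariance matrix."""
    k = len(cov)
    total_var = sum(sum(row) for row in cov)    # variance of the sum score
    item_var = sum(cov[i][i] for i in range(k)) # sum of item variances
    return k / (k - 1) * (1.0 - item_var / total_var)

def sum_score_reliability(loadings, error_vars):
    """True-score variance of the sum score divided by its total variance."""
    true_var = sum(loadings) ** 2
    return true_var / (true_var + sum(error_vars))

error_vars = [0.3, 0.5, 0.6, 0.4]

# Essentially tau-equivalent items: equal loadings, unequal error variances.
tau_loadings = [0.7, 0.7, 0.7, 0.7]
alpha_tau = cronbach_alpha(congeneric_cov(tau_loadings, error_vars))
rel_tau = sum_score_reliability(tau_loadings, error_vars)

# Congeneric items: loadings differ across items.
con_loadings = [0.9, 0.7, 0.4, 0.2]
alpha_con = cronbach_alpha(congeneric_cov(con_loadings, error_vars))
rel_con = sum_score_reliability(con_loadings, error_vars)

print(f"tau-equivalent: alpha = {alpha_tau:.4f}, reliability = {rel_tau:.4f}")
print(f"congeneric:     alpha = {alpha_con:.4f}, reliability = {rel_con:.4f}")
```

With equal loadings the two quantities coincide; with unequal loadings α falls below the reliability of the sum score, and the gap widens as the loadings become more heterogeneous.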