Moderating effects of subgroups in linear models

SUMMARY

Possibilities for moderating effects of a subgrouping variable on strength or direction of an association have been much discussed by social scientists but have not been given satisfactory statistical formulations. The results concern directed measures of association in linear models containing just three variables.

Some key words: Analysis of covariance; Analysis of variance; CG-distribution; Conditional independence; Graphical chain model; Parallel regressions; Yule-Simpson paradox.

1. INTRODUCTION

Linear models are commonly used as a framework to estimate and test how a continuous response variable depends on potential influencing variables. This paper is concerned with the situation in which two influences are random variables as well, which may be discrete or continuous. The continuous random variables stand for quantitative properties of observational units, while the discrete variables either capture qualitative properties or represent subgroups of the population to which the observational units belong.

Treating all variables as random variables is usually appropriate for data obtained from observational studies, where one cannot control which levels or values of the influencing variables are observed. The linear model permits an analysis conditional on fixed levels and values of the influences. Frequently, however, in social science applications the association structure among the influences is of interest in itself and, as will turn out, it is important for a correct interpretation of how the response depends on the influences.

Awareness of such problems in interpretation is widespread among social scientists, as is documented by extensive discussions of moderating effects; see, for example, Saunders (1956), Zedeck (1971) and Baron & Kenny (1986). There, a variable is called a moderator if its presence changes the strength or direction of an association.
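
The moderator notion just described can be made concrete with a small numerical sketch. The data, the two subgroups a and b, and the helper function `slope` are invented here purely for illustration and are not taken from the paper. The sketch compares the least-squares slope of a response on one influence within each subgroup with the slope obtained when the subgrouping variable is ignored.

```python
import numpy as np


def slope(x, y):
    """Least-squares slope of y on x in a simple linear regression."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


# Hypothetical data for two subgroups (illustration only).
x_a = np.array([1.0, 2.0, 3.0, 4.0])
y_a = np.array([6.0, 5.0, 4.0, 3.0])    # y = 7 - x within subgroup a
x_b = np.array([5.0, 6.0, 7.0, 8.0])
y_b = np.array([10.0, 9.0, 8.0, 7.0])   # y = 15 - x within subgroup b

# Association within each subgroup.
slope_a = slope(x_a, y_a)
slope_b = slope(x_b, y_b)

# Association when the subgrouping variable is ignored.
slope_pooled = slope(np.concatenate([x_a, x_b]),
                     np.concatenate([y_a, y_b]))

print(f"subgroup a: {slope_a:+.3f}")
print(f"subgroup b: {slope_b:+.3f}")
print(f"pooled:     {slope_pooled:+.3f}")
```

In this constructed example the dependence is negative and equally strong within both subgroups, yet positive once the subgroups are pooled, so the presence of the subgrouping variable changes the direction of the association. The reversal arises because the subgroups differ in their levels of both the influence and the response.
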
However, the methods for identifying moderating effects recommended to social scientists in the above literature, or by Cohen & Cohen (1983, pp. 310-4), have been shown to be seriously deficient (Wermuth, 1988). Thus, there is a need for clarification.

We treat the case of a potential discrete moderator variable and call its categories or levels the subgroups in the linear model. A moderating effect is closely tied to the notion of consistent results on an association (Wermuth, 1987). In the present context, results are said to be weakly consistent if the associations within subgroups coincide in direction, and strongly consistent if the associations within subgroups coincide in both direction and strength. This notion typically depends on the chosen measure of association. In linear models we look at directed measures of