Estimation of Regressions Involving Logarithmic Transformation of Zero Values in the Dependent Variable

In regression analysis, the observed values are often transformed into logarithmic values. However, when some of the original values are zero, their logarithmic values are negative infinity, and thus cause difficulty in estimating the parameters of the regression. The problem arises frequently in empirical studies of the earnings function which use the logarithmic values of earnings as the dependent variable, e.g., [1]. In order to overcome this difficulty, some studies simply discard the corresponding observations. Discarding part of the observations seems to be unappealing, since the available data do not seem to be fully utilized. Thus, others attempt to include these observations by arbitrarily setting the log-value of the dependent variable to be zero, with perhaps a dummy variable to indicate such a treatment for particular observations, e.g., [2]. Neither of these approaches has been based on rigorous argument or reasonable justification. The purpose of this paper is to demonstrate that, for practical purposes, the former procedure appears to be more appropriate than the latter. We shall also examine the relationship between the estimated values of the parameters obtained by these two approaches. Consider a simple function in the form of