Limited Dependent Variables and Discrete Choice Modelling

Limited dependent variable models are regression models in which the dependent variable takes a limited set of values, such as zero and one in binary choice models, or one of a few alternatives in a multinomial model, for example the choice among modes of transportation such as bus, train, or car. Binary choice examples in economics include a woman’s decision to participate in the labor force, or a worker’s decision to join a union. Other examples include whether a consumer defaults on a loan or a credit card, or whether they purchase a house or a car. This qualitative variable is coded as one if the woman participates in the labor force (or the consumer defaults on a loan) and zero if she does not participate (or the consumer does not default on the loan). Applying least squares to a binary choice model is inferior to logit or probit regression, because the fitted probabilities are not constrained to lie between zero and one and the disturbances are heteroskedastic. When the dependent variable is a fraction or proportion, inverse logit regressions are appropriate, as is fractional logit quasi-maximum likelihood. An example of the inverse logit regression is the effect of a beer tax on reducing motor vehicle fatality rates from drunk driving. The fractional logit quasi-maximum likelihood is illustrated using an equation explaining the proportion of participants in a pension plan using firm data. The probit regression is illustrated with an empirical example on fertility, showing that parental preferences for a mixed sibling-sex composition in developed countries have a significant and positive effect on the probability of having an additional child. Multinomial choice models, where the number of choices is more than two, may have a natural ordering, as with bond ratings in finance. Another example is the response to an opinion survey, which could vary from strongly agree to strongly disagree. Alternatively, the choices may not have a natural ordering, as with the choice of occupation or mode of transportation.
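The contrast between least squares and logit for a binary outcome can be sketched with a small simulation. The following is a minimal illustration rather than an example from the chapter: the data-generating process (intercept -0.5, slope 2, one standard normal regressor), the sample size, and the helper names fit_lpm and fit_logit are all assumptions made for this sketch. It fits a linear probability model by least squares and a logit by Newton-Raphson, then counts how many least squares fitted values fall outside the unit interval.

```python
import math
import random

random.seed(0)  # fixed seed so the simulated sample is reproducible


def sigmoid(z):
    """Logistic CDF, mapping an index z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Assumed data-generating process (illustrative only):
# P(y = 1 | x) = sigmoid(-0.5 + 2x), with x drawn from N(0, 1).
n = 5000
xs = [random.gauss(0.0, 1.0) for _ in range(n)]
ys = [1 if random.random() < sigmoid(-0.5 + 2.0 * x) else 0 for x in xs]


def fit_lpm(xs, ys):
    """Least squares of y on a constant and x (linear probability model)."""
    m = len(xs)
    mx = sum(xs) / m
    my = sum(ys) / m
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return my - b * mx, b


def fit_logit(xs, ys, iters=25):
    """Logit maximum likelihood for one regressor via Newton-Raphson."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(a + b * x)
            w = p * (1.0 - p)
            g0 += y - p            # score with respect to the intercept
            g1 += (y - p) * x      # score with respect to the slope
            h00 += w               # elements of the information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information) * delta = score
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


a_lpm, b_lpm = fit_lpm(xs, ys)
a_logit, b_logit = fit_logit(xs, ys)

lpm_fitted = [a_lpm + b_lpm * x for x in xs]
logit_fitted = [sigmoid(a_logit + b_logit * x) for x in xs]
n_outside = sum(1 for p in lpm_fitted if p < 0.0 or p > 1.0)

print("LPM intercept and slope:", a_lpm, b_lpm)
print("Logit intercept and slope:", a_logit, b_logit)
print("LPM fitted values outside [0, 1]:", n_outside, "of", n)
```

Under these assumptions, the logit fitted values are probabilities by construction, whereas some of the least squares fitted values are negative or exceed one. In applied work one would normally use an established estimation routine rather than the hand-written Newton-Raphson loop shown here.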
The censored regression model is motivated by estimating expenditures on cars or the amount of mortgage lending. In this case, the observations are censored because we observe the expenditures on a car (or the mortgage amount) only if the car is bought or the mortgage is approved. By contrast, in studying poverty, we may exclude the rich from our sample. In this case, the sample is truncated and no longer random. Applying least squares to the truncated sample leads to biased and inconsistent estimates. This differs from censoring, where no observations are excluded. In fact, we observe the characteristics of all mortgage applicants, even those whose mortgage is not approved. Selection bias occurs when the sample is not randomly drawn. This is illustrated with a labor force participation equation (the selection equation) and an earnings equation, where earnings are observed only if the worker participates in the labor force; otherwise they are recorded as zero. Extensions to panel data limited dependent variable models are also discussed and empirical examples are given.
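The bias from applying least squares to a truncated sample can be seen in a short simulation. This is a minimal sketch under assumed values, not an example from the chapter: the linear model y = 1 + 2x + e with standard normal x and e, the cutoff of 1.0 that drops high-y observations (the "rich"), and the helper name ols_slope are all hypothetical choices for illustration.

```python
import random

random.seed(42)  # fixed seed so the simulated sample is reproducible


def ols_slope(xs, ys):
    """Least squares slope from regressing y on a constant and x."""
    m = len(xs)
    mx = sum(xs) / m
    my = sum(ys) / m
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Assumed data-generating process (illustrative only): y = 1 + 2x + e
n = 20000
true_slope = 2.0
xs = [random.gauss(0.0, 1.0) for _ in range(n)]
ys = [1.0 + true_slope * x + random.gauss(0.0, 1.0) for x in xs]

# Truncation: observations with y at or above the cutoff are dropped
# entirely, so neither their y nor their x appears in the sample.
cutoff = 1.0
truncated = [(x, y) for x, y in zip(xs, ys) if y < cutoff]
tx = [x for x, _ in truncated]
ty = [y for _, y in truncated]

full_slope = ols_slope(xs, ys)
trunc_slope = ols_slope(tx, ty)

print("Slope on the full sample:", full_slope)
print("Slope on the truncated sample:", trunc_slope)
print("Observations kept after truncation:", len(truncated), "of", n)
```

In this setup the full-sample slope is close to the assumed value of 2, while the slope estimated on the truncated sample is pulled toward zero. The bias does not vanish as the sample grows, which is what inconsistency means here. Truncating on the dependent variable makes the retained disturbances correlated with the regressor.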