Logistic Regression
Logistic regression is the most common method used to model binary response data. A binary response is typically coded 1/0, with 1 generally indicating a success and 0 a failure. What 1 and 0 actually stand for varies widely with the purpose of the study. For example, in a study of the odds of failure in a school setting, 1 may denote fail and 0 not-fail, or pass. The important point is that 1 indicates the foremost subject of interest for which the binary response study is designed.

Modeling a binary response variable using normal linear regression introduces substantial bias into the parameter estimates. The standard linear model assumes that the response and error terms are normally (Gaussian) distributed, that the variance, σ², is constant across observations, and that the observations in the model are independent. When a binary variable is modeled using this method, the first two of these assumptions are violated: a 1/0 response cannot be normally distributed, and its variance changes with its mean.

Just as the normal regression model is based on the Gaussian probability distribution function (pdf), a binary response model is derived from the Bernoulli distribution, which is the special case of the binomial pdf in which the binomial denominator takes the value of 1. The Bernoulli pdf may be expressed as:

$$ f(y_i; \pi_i) = \pi_i^{y_i} (1 - \pi_i)^{1 - y_i} \tag{1} $$
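As a rough illustration of the Bernoulli pdf in equation (1), the following Python sketch evaluates it for a single observation. The function names `bernoulli_pmf`, `bernoulli_variance`, and `log_likelihood` are hypothetical and chosen only for this example; they do not come from the source. The variance helper uses the standard Bernoulli result π(1 − π), which is included here as an assumption to show why the constant-variance assumption of the linear model cannot hold for a 1/0 response.

```python
import math


def bernoulli_pmf(y, pi):
    """Evaluate equation (1): pi**y * (1 - pi)**(1 - y) for y in {0, 1}."""
    if y not in (0, 1):
        raise ValueError("y must be 0 or 1")
    if not 0.0 <= pi <= 1.0:
        raise ValueError("pi must lie in [0, 1]")
    return pi ** y * (1.0 - pi) ** (1 - y)


def bernoulli_variance(pi):
    """Standard Bernoulli variance pi * (1 - pi); it depends on pi."""
    return pi * (1.0 - pi)


def log_likelihood(ys, pis):
    """Sum of log f(y_i; pi_i) over independent observations.

    Assumes every pi_i lies strictly between 0 and 1, so that no
    observation has probability zero.
    """
    return sum(math.log(bernoulli_pmf(y, p)) for y, p in zip(ys, pis))


if __name__ == "__main__":
    # Probability of observing a 1 and a 0 when pi = 0.3.
    print(bernoulli_pmf(1, 0.3), bernoulli_pmf(0, 0.3))
    # The variance differs across values of pi, so it is not constant.
    print([bernoulli_variance(p) for p in (0.1, 0.5, 0.9)])
```

With y = 1 the expression reduces to π, and with y = 0 it reduces to 1 − π, which is all that equation (1) encodes. Summing the logs of these terms over observations gives the log-likelihood that a logistic regression would maximize, under the independence assumption stated above.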