Applied data analysis for social science

Summary on bivariate regression

• In bivariate regression, the OLS method finds the “best” line or curve in a two-dimensional scatter plot
• “Best” is defined by the “a” and “b” that minimize the sum of squared deviations between the line/curve and the observed values of the dependent variable
• Scatter plots and analysis of residuals are tools for diagnosing problems in the regression
• Transformation (by powers) is a general tool that helps mitigate several types of problems, such as
– Curvilinearity
– Heteroscedasticity
– Non-normal distributions of residuals
– Cases with too much influence
• Regression with (power) transformed variables is always curvilinear in the original units. Results are most easily interpreted by means of graphs

Spring 2010 © Erling Berge

Multiple regression: model (1)

• The goal of multiple regression is to find the net impact of one variable, controlling for the impact of all other variables
• Let K = the number of parameters in the model (this means that K-1 is the number of explanatory variables, since one parameter is the constant)
• Then the population model can be written