Linear regression is a broad and well-developed area of statistics. If there is a core to statistical methodology, then linear regression is it. The ubiquity of linear regression methods in statistics and data analytics stems from the ease with which one may fit tractable models that describe the primary features of a process or population. Linear regression is useful not only for description but also for prediction, since the models often provide good approximations of complex relationships. In the field of statistics, hypothesis tests and confidence intervals are routinely used in linear regression analyses. Extending these methods to data science is often unsuccessful because of the prevalence of opportunistically collected data. Most of the time, such data cannot support inferential methods because the quality of the resulting inferences is unknown. We discuss inference herein so that the reader may understand the potential for success and for failure of these methods. The focus, however, is on the aspect of the subject that is most essential and useful for data analytics: the fitted models. The topic of linear regression also provides an avenue to gain experience with R, one of the most popular statistical software packages among data scientists.