Influential data cases when the C_p criterion is used for variable selection in multiple linear regression

The influence of data cases when the C_p criterion is used for variable selection in multiple linear regression analysis is studied in terms of the predictor variables included in the selected model and the predictive power of that model. In particular, the focus is on the importance of identifying and dealing with such so-called selection influential data cases before model selection and fitting are performed. A new selection influence measure based on the C_p criterion is developed to identify selection influential data cases. The success with which this measure identifies such cases is evaluated on two example data sets.
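To make the idea of selection influence concrete, the following minimal sketch shows how deleting a single data case can change the subset chosen by C_p. It is not the influence measure developed in this paper, which is not specified in the abstract. It assumes the standard form of Mallows' criterion, C_p = RSS_p / s^2 - n + 2p, where s^2 is the residual variance estimate from the full model and p counts the intercept, combined with an exhaustive search over subsets. The function names (rss, mallows_cp, best_cp_subset, selection_changes) are hypothetical and chosen for illustration only.

```python
from itertools import combinations

import numpy as np


def rss(X, y, cols):
    """Residual sum of squares of an OLS fit on an intercept plus the columns in cols."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)


def mallows_cp(X, y, cols, sigma2):
    """Assumed standard form: C_p = RSS_p / s^2 - n + 2p, with p including the intercept."""
    cols = tuple(cols)
    return rss(X, y, cols) / sigma2 - len(y) + 2 * (len(cols) + 1)


def best_cp_subset(X, y):
    """Exhaustive search for the subset of predictors with the smallest C_p."""
    n, k = X.shape
    sigma2 = rss(X, y, range(k)) / (n - k - 1)  # full-model variance estimate
    best_cp, best_cols = None, None
    for size in range(k + 1):
        for cols in combinations(range(k), size):
            cp = mallows_cp(X, y, cols, sigma2)
            if best_cp is None or cp < best_cp:
                best_cp, best_cols = cp, cols
    return best_cp, best_cols


def selection_changes(X, y):
    """Return the full-data subset and the cases whose deletion changes the selected subset."""
    _, full_subset = best_cp_subset(X, y)
    changed = []
    for i in range(len(y)):
        keep = np.arange(len(y)) != i  # leave case i out
        _, subset = best_cp_subset(X[keep], y[keep])
        if subset != full_subset:
            changed.append(i)
    return full_subset, changed
```

Cases returned by selection_changes are candidates for being selection influential in this crude leave-one-out sense. The exhaustive search is only practical for a small number of predictors, and a graded measure of influence, such as the one developed in the paper, would convey more than this yes/no check.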