The distribution of certain regression statistics

Abstract: A number of variable selection procedures are in current use in multiple regression analysis. In some of these procedures, a new variable is entered into the regression when its contribution to the explanation of the response is larger than that of any other variable not yet entered. The new entrant is then tested by a standard F-test to see whether its contribution is 'significant'. Because the variable tested is selected as the largest contributor, an F-test of this type cannot be theoretically correct, but the procedure is used because the exact distributions are not known. This raises questions such as 'Is one seriously in error in using the F-tests?' and 'If so, can adjustments be made to the F-test to correct the difficulty?' The purpose of this paper is to discuss some new results which throw light on these questions. (Author)
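
The selection effect described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the paper's method or its results. It covers only the first step of a forward selection and assumes a response unrelated to every candidate, independent standard normal candidates, and illustrative values n = 22 and k = 10. The names `f_to_enter` and `simulate` are hypothetical, and the critical value 4.35 is the tabulated upper 5% point of F with 1 and 20 degrees of freedom.

```python
import random


def f_to_enter(x, y):
    """F statistic (1 and n-2 df) for adding x to an intercept-only model."""
    n = len(y)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    r2 = sxy * sxy / (sxx * syy)  # squared sample correlation
    return r2 * (n - 2) / (1.0 - r2)


def simulate(n=22, k=10, reps=2000, crit=4.35, seed=0):
    """Estimate how often the nominal 5% F-test rejects under the null.

    Returns (rate for one pre-specified candidate,
             rate for the largest of k candidates).
    crit=4.35 is an assumed tabulated value for F(1, 20), matching n=22.
    """
    rng = random.Random(seed)
    single = 0
    best = 0
    for _ in range(reps):
        # Response is pure noise: no candidate truly contributes.
        y = [rng.gauss(0, 1) for _ in range(n)]
        fs = [
            f_to_enter([rng.gauss(0, 1) for _ in range(n)], y)
            for _ in range(k)
        ]
        single += fs[0] > crit    # variable chosen in advance
        best += max(fs) > crit    # variable chosen because it is largest
    return single / reps, best / reps


single_rate, best_rate = simulate()
print("pre-specified variable:", single_rate)
print("largest of k variables:", best_rate)
```

Under these assumptions the pre-specified variable should be rejected at close to the nominal 5% rate, while the largest of ten independent candidates exceeds the same critical value far more often (about 1 - 0.95^10, roughly 0.40, in theory). Correlated candidates and later selection steps would behave differently and are not covered by this sketch.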