DEPTH: A Novel Algorithm for Feature Ranking with Application to Genome-Wide Association Studies

Variable selection is a common problem in regression modelling, with a myriad of applications. This paper proposes a new feature ranking algorithm (DEPTH) for variable selection in parametric regression, based on permutation statistics and stability selection. DEPTH: (i) is applicable to any parametric regression task, (ii) is designed to be run in a parallel environment, and (iii) adapts naturally to the correlation structure of the predictors. When applied to a genome-wide association study of breast cancer, DEPTH found evidence that variants in a pathway of candidate genes are associated with a common subtype of breast cancer, a finding that conventional analyses would not have discovered.
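
To make the two ingredients named above concrete, the following is a minimal sketch of a generic ranking scheme that combines a permutation-based threshold with resampling-based selection frequencies. It is not the DEPTH algorithm itself, whose details are not given in this abstract. The function name `rank_features`, the use of absolute marginal correlation as the association statistic, the half-sample resampling scheme, and all parameter defaults are assumptions made purely for illustration.

```python
import numpy as np


def abs_corr(X, y):
    """Absolute Pearson correlation between each column of X and y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    denom = np.where(denom == 0, 1.0, denom)  # constant columns get a score of 0
    return np.abs(Xc.T @ yc) / denom


def rank_features(X, y, n_resamples=50, n_permutations=20, seed=0):
    """Illustrative ranking (hypothetical, not the paper's procedure).

    For each random half-sample of the data:
      1. compute the observed statistic for every feature;
      2. permute the response several times and record the largest
         statistic seen under permutation, which acts as a null threshold;
      3. mark the features whose observed statistic exceeds that threshold.
    Features are ranked by how often they were marked.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)

    for _ in range(n_resamples):
        idx = rng.choice(n, size=n // 2, replace=False)
        Xs, ys = X[idx], y[idx]
        observed = abs_corr(Xs, ys)
        # Permuting the response breaks any association with the features
        # while leaving the correlation among the features untouched.
        threshold = max(
            abs_corr(Xs, rng.permutation(ys)).max()
            for _ in range(n_permutations)
        )
        counts += observed > threshold

    freq = counts / n_resamples
    order = np.argsort(-freq, kind="stable")  # most frequently selected first
    return order, freq


# Small synthetic example: only feature 0 influences the response.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 20))
y = 3.0 * X[:, 0] + rng.normal(size=200)
order, freq = rank_features(X, y)
print("top-ranked feature:", order[0])
```

Because each resample is processed independently of the others, a loop of this shape could in principle be distributed across workers, which is one way a procedure of this kind might be run in a parallel environment.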