A pseudo knockoff filter for correlated features

In 2015, Barber and Candès introduced a variable selection procedure, the knockoff filter, to control the false discovery rate (FDR), and proved that it achieves exact FDR control. Inspired by the work of Barber and Candès (2015), we propose and analyze a pseudo knockoff filter that inherits some advantages of the original knockoff filter and allows more flexibility in constructing its knockoff matrix. In the numerical examples considered in this paper, our experiments suggest that the pseudo knockoff filter with the half Lasso statistic controls the FDR and offers more power than the original knockoff filter with either the Lasso path or the half Lasso statistic. Although we cannot establish rigorous FDR control for the pseudo knockoff filter, we provide a partial analysis of the pseudo knockoff filter with the half Lasso statistic and establish a uniform bound on the false discovery proportion (FDP) and an expectation inequality.
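For orientation, the selection step that any knockoff-type filter relies on can be sketched in a few lines. The sketch below assumes a vector of feature statistics W is already available (large positive values indicating evidence for a feature) and applies the data-dependent knockoff+ threshold of the original filter of Barber and Candès; it does not construct the knockoff matrix or the half Lasso statistic, and it is not the pseudo knockoff construction itself. The function names and the example values of W are hypothetical and chosen only for illustration.

```python
def knockoff_plus_threshold(W, q):
    """Knockoff+ threshold: the smallest t among the nonzero |W_j| with
    (1 + #{j: W_j <= -t}) / max(1, #{j: W_j >= t}) <= q.
    Returns infinity when no candidate t satisfies the bound."""
    candidates = sorted({abs(w) for w in W if w != 0})
    for t in candidates:
        neg = sum(1 for w in W if w <= -t)   # proxy for the number of false discoveries
        pos = sum(1 for w in W if w >= t)    # number of features selected at level t
        if (1 + neg) / max(pos, 1) <= q:
            return t
    return float("inf")


def select(W, q):
    """Indices of the features whose statistic reaches the threshold."""
    t = knockoff_plus_threshold(W, q)
    return [j for j, w in enumerate(W) if w >= t]


def fdp(selected, true_support):
    """False discovery proportion: false selections / max(1, selections)."""
    false = sum(1 for j in selected if j not in true_support)
    return false / max(len(selected), 1)


# Hypothetical statistics for 8 features; features 0, 1, 2, 3, 5 are taken as true signals.
W = [5.0, 4.0, 3.5, 3.0, -0.5, 2.0, 0.0, 1.0]
chosen = select(W, q=0.25)
print(chosen, fdp(chosen, {0, 1, 2, 3, 5}))
```

The FDR is the expectation of the FDP computed by `fdp` over repeated draws of the data, so a single run such as this one only shows one realization of the FDP.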

[1]  Ran Dai, et al.  The knockoff filter for FDR control in group-sparse and multitask regression , 2016, ICML.

[2]  Alan J. Miller Subset Selection in Regression , 1992 .

[3]  Xiaochun Cao, et al.  False Discovery Rate Control and Statistical Quality Assessment of Annotators in Crowdsourced Ranking , 2016, ICML.

[4]  Y. Benjamini,et al.  Controlling the false discovery rate: a practical and powerful approach to multiple testing , 1995 .

[5]  E. Candès,et al.  A knockoff filter for high-dimensional selective inference , 2016, The Annals of Statistics.

[6]  Lucas Janson,et al.  Familywise error rate control via knockoffs , 2015, 1505.06549.

[7]  Y. Benjamini, et al.  The control of the false discovery rate in multiple testing under dependency , 2001 .

[8]  E. Candès,et al.  Controlling the false discovery rate via knockoffs , 2014, 1404.5609.

[9]  Chiara Sabatti, et al.  Multilayer knockoff filter: Controlled variable selection at multiple resolutions , 2017, The Annals of Applied Statistics.

[10]  N. Meinshausen,et al.  Stability selection , 2008, 0809.2932.

[11]  Alexandra Chouldechova,et al.  False Discovery Rate Control for Sequential Selection Procedures, with Application to the Lasso , 2013 .

[12]  Alan J. Miller Selection of subsets of regression variables , 1984 .

[13]  Junyang Qian,et al.  Communication-Efficient False Discovery Rate Control via Knockoff Aggregation , 2015, 1506.05446.

[14]  Tso-Jung Yen,et al.  Discussion on "Stability Selection" by Meinshausen and Buhlmann , 2010 .

[15]  Larry A. Wasserman,et al.  Stability Approach to Regularization Selection (StARS) for High Dimensional Graphical Models , 2010, NIPS.

[16]  T. Hou,et al.  A prototype knockoff filter for group selection with FDR control , 2017, Information and Inference: A Journal of the IMA.

[17]  R. Tibshirani,et al.  Sequential selection procedures and false discovery rate control , 2013, 1309.5352.

[18]  Robert Tibshirani, et al.  Sparse regression and marginal testing using cluster prototypes , 2015, Biostatistics.

[19]  Susan A. Murphy,et al.  Monographs on statistics and applied probability , 1990 .

[20]  Lucas Janson,et al.  Panning for gold: ‘model‐X’ knockoffs for high dimensional controlled variable selection , 2016, 1610.02351.