Confidence Intervals for Selected Parameters

Practical or scientific considerations often lead to selecting a subset of parameters as ``important.'' Inferences about the selected parameters are frequently based on the same data used to select them. That can make the reported uncertainties deceptively optimistic: confidence intervals that ignore selection generally have less than their nominal coverage probability. Controlling the probability that one or more intervals for selected parameters fail to cover their parameters, the ``simultaneous over the selected'' (SoS) error rate, is crucial in many scientific problems. Intervals that control the SoS error rate can be constructed in ways that take advantage of knowledge of the selection rule. We construct SoS-controlling confidence intervals for the $k$ of $m$ shift parameters deemed most ``important'' because their estimates (from independent estimators) are the largest. The new intervals improve substantially over Šidák intervals when $k$ is small compared to $m$, and approach the standard Bonferroni-corrected intervals when $k \approx m$. Standard, unadjusted confidence intervals for location parameters have the correct coverage probability for $k=1$, $m=2$ if, when the true parameters are zero, the estimators are exchangeable and symmetric.
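The effect of ignoring selection can be illustrated with a small calculation. The sketch below is not the paper's construction; it assumes the simplest special case, in which the $m$ estimators are independent standard normals and all true parameters are zero, and it computes the coverage of the unadjusted interval for the single parameter with the largest estimate ($k=1$). Under those assumptions the coverage is $\Phi(z)^m - \Phi(-z)^m$ with $z$ the usual two-sided normal quantile, which equals the nominal level when $m=2$ and falls below it for larger $m$. The function names are hypothetical, and the Šidák and Bonferroni half-widths are the textbook versions for $m$ simultaneous intervals.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def unadjusted_coverage_of_max(m, alpha=0.05):
    """Coverage of the unadjusted two-sided interval (estimate +/- z) for the
    parameter whose estimate is largest, ASSUMING m independent N(0, 1)
    estimators and all true parameters equal to zero."""
    z = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # P(|max| <= z) = P(max <= z) - P(max < -z) = Phi(z)^m - Phi(-z)^m
    return STD_NORMAL.cdf(z) ** m - STD_NORMAL.cdf(-z) ** m


def sidak_half_width(m, alpha=0.05):
    """Half-width (in sd units) of each of m simultaneous Sidak intervals."""
    per_interval_alpha = 1 - (1 - alpha) ** (1 / m)
    return STD_NORMAL.inv_cdf(1 - per_interval_alpha / 2)


def bonferroni_half_width(m, alpha=0.05):
    """Half-width (in sd units) of each of m Bonferroni-corrected intervals."""
    return STD_NORMAL.inv_cdf(1 - alpha / (2 * m))


if __name__ == "__main__":
    for m in (2, 5, 10, 100):
        print(
            f"m={m:3d}  unadjusted coverage={unadjusted_coverage_of_max(m):.4f}  "
            f"Sidak half-width={sidak_half_width(m):.3f}  "
            f"Bonferroni half-width={bonferroni_half_width(m):.3f}"
        )
```

In this special case the unadjusted interval is exact for $m=2$ but loses coverage quickly as $m$ grows, while the Šidák and Bonferroni intervals protect all $m$ parameters at the cost of extra width even when only one parameter is reported.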
