Using Definitive Screening Designs to Identify Active First- and Second-Order Factor Effects

Definitive screening designs (DSDs) were recently introduced by Jones and Nachtsheim (2011b). Their three-level factors and desirable aliasing structure make them potentially suitable for identifying main effects and second-order terms in a single stage of experimentation. However, as the number of active effects approaches the number of runs, the performance of standard model-selection routines inevitably degrades. In this paper, we characterize the ability of DSDs to correctly identify first- and second-order model terms as a function of the level of sparsity, the number of factors in the design, the signal-to-noise ratio, the model type (unrestricted or following strong heredity), the model-selection technique, and the number of augmented runs. We find that minimum-run-size DSDs can identify active terms with high probability as long as the number of active effects is at most about half the number of runs and the signal-to-noise ratios of the active effects exceed about 2.0. We also find that augmenting minimum-run-size designs with four or more runs substantially increases the number of model terms that can be identified with high probability. Among the model-selection methods investigated, the Lasso and the Gauss–Dantzig selector, each based on AICc, effectively identify active model terms when the models are unrestricted. For models following strong heredity, the SHIM method developed by Choi et al. (2010) performed best among the tested methods designed for the strong-heredity case.
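The aliasing structure mentioned above can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes a 6-factor, 13-run DSD built by folding over a conference matrix and adding a center run, and it uses a Paley-type construction for the conference matrix as one convenient choice. The function names `paley_conference_matrix` and `definitive_screening_design` are hypothetical.

```python
import itertools
import numpy as np

def paley_conference_matrix(q=5):
    """Symmetric conference matrix of order q + 1 (assumes q is a prime with q % 4 == 1)."""
    squares = {(x * x) % q for x in range(1, q)}
    # Quadratic character: 0 at zero, +1 for nonzero squares, -1 otherwise.
    chi = lambda a: 0 if a % q == 0 else (1 if a % q in squares else -1)
    C = np.zeros((q + 1, q + 1), dtype=int)
    C[0, 1:] = 1
    C[1:, 0] = 1
    for i in range(q):
        for j in range(q):
            C[i + 1, j + 1] = chi(i - j)
    return C

def definitive_screening_design(C):
    """Stack C, its foldover -C, and one center run (all zeros)."""
    m = C.shape[1]
    return np.vstack([C, -C, np.zeros((1, m), dtype=int)])

C = paley_conference_matrix(5)
D = definitive_screening_design(C)  # 13 runs, 6 three-level factors

# Build all pure-quadratic and two-factor-interaction columns.
m = D.shape[1]
quadratics = [D[:, j] ** 2 for j in range(m)]
interactions = [D[:, j] * D[:, k] for j, k in itertools.combinations(range(m), 2)]
second_order = np.column_stack(quadratics + interactions)

# Largest absolute cross-product between any main effect and any second-order column.
max_alias = np.abs(D.T @ second_order).max()
print(D.shape, max_alias)
```

Because each run is paired with its mirror image, every cross-product between a main-effect column and a second-order column should cancel, so `max_alias` is expected to be zero. The second-order columns themselves remain partially correlated with one another, which is why model selection becomes harder as the number of active effects grows.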

[1] Terence Tao, et al. The Dantzig selector: Statistical estimation when P is much larger than n, 2005, math/0506081.

[2] Colin L. Mallows, et al. Some Comments on Cp, 2000, Technometrics.

[3] Christopher J. Nachtsheim, et al. Definitive Screening Designs with Added Two-Level Categorical Factors, 2013.

[4] R. Daniel Meyer, et al. An Analysis for Unreplicated Fractional Factorials, 1986.

[5] Fengshan Bai, et al. Constructing Definitive Screening Designs Using Conference Matrices, 2012.

[6] Runchu Zhang, et al. A method for screening active effects in supersaturated designs, 2007.

[7] William Li, et al. Benefits and Fast Construction of Efficient Two-Level Foldover Designs, 2017, Technometrics.

[8] Christopher J. Nachtsheim, et al. Efficient Designs With Minimal Aliasing, 2011, Technometrics.

[9] Angela M. Dean, et al. Screening Strategies in the Presence of Interactions, 2014, Technometrics.

[10] R. Tibshirani. Regression Shrinkage and Selection via the Lasso, 1996.

[11] Dennis K. J. Lin, et al. Forward Selection Error Control in the Analysis of Supersaturated Designs, 1998.

[12] C. L. Mallows. Some comments on C_p, 1973.

[13] Douglas C. Montgomery, et al. Analysis of Supersaturated Designs, 2003.

[14] Ji Zhu, et al. Variable Selection With the Strong Heredity Constraint and Its Oracle Property, 2010.

[15] Clifford M. Hurvich, et al. Regression and time series model selection in small samples, 1989.

[16] Xi Wu, et al. A Strategy of Searching Active Factors in Supersaturated Screening Experiments, 2004.

[17] Christopher J. Nachtsheim, et al. Blocking Schemes for Definitive Screening Designs, 2016, Technometrics.

[18] Dennis K. J. Lin, et al. Data analysis in supersaturated designs, 2002.

[19] Christopher J. Nachtsheim, et al. A Class of Three-Level Designs for Definitive Screening in the Presence of Second-Order Effects, 2011.

[20] Christopher J. Marley, et al. A comparison of design and model selection methods for supersaturated experiments, 2010, Comput. Stat. Data Anal.

[21] Xiang Li, et al. Regularities in data from factorial experiments, 2006, Complex.

[22] A. Miller, et al. Using Folded-Over Nonorthogonal Designs, 2005, Technometrics.

[23] Changbao Wu, et al. Construction of supersaturated designs through partially aliased interactions, 1993.

[24] Runze Li, et al. Analysis Methods for Supersaturated Design: Some Comparisons, 2003, Journal of Data Science.

[25] Dennis K. J. Lin, et al. A Two-Stage Bayesian Model Selection Strategy for Supersaturated Designs, 2002, Technometrics.

[26] R. Tibshirani, et al. Least angle regression, 2004, math/0406456.

[27] Yi Lin, et al. An Efficient Variable Selection Approach for Analyzing Designed Experiments, 2007, Technometrics.

[28] Frederick Kin Hing Phoa, et al. Analysis of Supersaturated Designs via Dantzig Selector, 2009.

[29] H. Chipman, et al. A Bayesian variable-selection approach for analyzing designed experiments with complex aliasing, 1997.

[30] Changbao Wu, et al. Analysis of Designed Experiments with Complex Aliasing, 1992.

[31] Dennis K. J. Lin, et al. A new class of supersaturated designs, 1993.

[32] H. Akaike, et al. Information Theory and an Extension of the Maximum Likelihood Principle, 1973.

[33] Ji Zhu, et al. Comment: Model Selection With Strong and Weak Heredity Constraints, 2014, Technometrics.

[34] C. F. Jeff Wu, et al. Experiments, 2021, Wiley Series in Probability and Statistics.

[35] A. Atkinson. Subset Selection in Regression, 1992.