Power analysis for the bootstrap likelihood ratio test for the number of classes in latent class models

Latent class (LC) analysis is used to gather empirical evidence for the existence of latent subgroups based on the associations among a set of observed discrete variables. One of the tests used to infer the number of underlying subgroups is the bootstrap likelihood ratio test (BLRT). Although power analysis is rarely conducted for this test, it is important to identify, clarify, and specify the design issues that influence statistical inference on the number of latent classes based on the BLRT. This paper proposes a computationally efficient ‘short-cut’ method to evaluate the power of the BLRT and presents a procedure for determining the sample size required to attain a specified power level. Results of our numerical study showed that this short-cut method yields reliable estimates of the power of the BLRT. The numerical study also showed that the sample size required to achieve a specified power level depends on various factors, among which class separation plays a dominant role. In some situations, a sample size of 200 may be enough, while in others 2000 or more subjects are needed to achieve the required power.
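
As a rough illustration of what a simulation-based power estimate for a likelihood ratio test can look like, the sketch below assumes that likelihood ratio statistics have already been simulated under the null model and under the alternative model. The abstract does not spell out the details of the short-cut method, so the approach shown here (one critical value taken from the simulated null distribution, power estimated as the share of alternative-model statistics above it) is an assumption, and the function name `power_from_simulated_lr` and its parameters are hypothetical.

```python
import numpy as np


def power_from_simulated_lr(lr_null, lr_alt, alpha=0.05):
    """Estimate power from two sets of simulated likelihood ratio statistics.

    Assumption (not taken from the abstract): lr_null holds LR values from
    data generated under the null model, lr_alt holds LR values from data
    generated under the alternative model.
    """
    lr_null = np.asarray(lr_null, dtype=float)
    lr_alt = np.asarray(lr_alt, dtype=float)

    # Critical value: the (1 - alpha) quantile of the simulated null distribution.
    critical_value = float(np.quantile(lr_null, 1.0 - alpha))

    # Power: proportion of alternative-model LR values exceeding that value.
    power = float(np.mean(lr_alt > critical_value))
    return critical_value, power
```

In practice the two input arrays would come from fitting the competing latent class models to simulated data sets, which is the expensive step this sketch leaves out.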
