An Empirical Comparison of Information-Theoretic Selection Criteria for Multivariate Behavior Genetic Models

Information theory provides an attractive basis for statistical inference and model selection. However, little is known about the relative performance of different information-theoretic criteria in covariance structure modeling, especially in behavioral genetic contexts. To explore these issues, information-theoretic fit criteria were compared with regard to their ability to discriminate between multivariate behavioral genetic models under various model, distribution, and sample size conditions. Results indicate that performance depends on sample size, model complexity, and distributional specification. The Bayesian Information Criterion (BIC) is more robust to distributional misspecification than Akaike's Information Criterion (AIC) under certain conditions, and outperforms AIC in larger samples and when comparing more complex models. An approximation to the Minimum Description Length criterion (MDL; Rissanen, 1996, IEEE Transactions on Information Theory 42:40–47; Rissanen, 2001, IEEE Transactions on Information Theory 47:1712–1717), involving the empirical Fisher information matrix, exhibits variable performance owing to the difficulty of estimating Fisher information matrices. A relatively new information-theoretic criterion, Draper's Information Criterion (DIC; Draper, 1995), which shares features of the Bayesian and MDL criteria, performs similarly to or better than BIC. These results underscore the need for further research into the theory and computation of information-theoretic criteria.
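For concreteness, the closed-form criteria compared above can be sketched as penalized log-likelihoods. In the minimal sketch below, the DIC form (a BIC-like penalty with a 2π correction to the sample-size term) is one common formulation, and the numeric inputs are purely illustrative, not taken from the study; the Fisher-information-based MDL approximation is not reproduced here.

```python
import math

def aic(loglik, k):
    """Akaike's Information Criterion: -2*lnL + 2k."""
    return -2.0 * loglik + 2.0 * k

def bic(loglik, k, n):
    """Bayesian (Schwarz) Information Criterion: -2*lnL + k*ln(n)."""
    return -2.0 * loglik + k * math.log(n)

def dic(loglik, k, n):
    """Draper's Information Criterion, in a BIC-like form with a
    2*pi correction to the sample-size penalty (one common formulation)."""
    return -2.0 * loglik + k * math.log(n / (2.0 * math.pi))

# Illustrative comparison of two hypothetical fitted models:
# a simpler model (k = 10) with slightly worse fit and a more
# complex model (k = 18) with slightly better fit, at n = 600.
n = 600
models = {"simple": (-1250.0, 10), "complex": (-1244.0, 18)}
for name, (loglik, k) in models.items():
    print(name, aic(loglik, k), bic(loglik, k, n), dic(loglik, k, n))
```

Because ln(n) exceeds 2 for n > 8, BIC penalizes each extra parameter more heavily than AIC at realistic sample sizes, which is consistent with BIC favoring simpler models in larger samples; DIC's 2π correction makes its penalty intermediate between the two.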

[1]  Michael C. Neale,et al.  Methodology for Genetic Studies of Twins and Families , 1992 .

[2]  Adrian E. Raftery,et al.  Bayesian Model Selection in Structural Equation Models , 1992 .

[3]  Clifford M. Hurvich,et al.  Regression and time series model selection in small samples , 1989 .

[4]  Andrew R. Barron,et al.  Minimum complexity density estimation , 1991, IEEE Trans. Inf. Theory.

[5]  Y. Fujikoshi,et al.  Modified AIC and Cp in multivariate linear regression , 1997 .

[6]  D I Boomsma,et al.  The genetic analysis of repeated measures. I. Simplex models , 1987, Behavior genetics.

[7]  David Draper,et al.  Assessment and Propagation of Model Uncertainty , 1995 .

[8]  G. Celeux,et al.  An entropy criterion for assessing the number of clusters in a mixture model , 1996 .

[9]  D. Pauler The Schwarz criterion and related methods for normal linear models , 1998 .

[10]  Clifford M. Hurvich,et al.  The impact of model selection on inference in linear regression , 1990 .

[11]  Jorma Rissanen,et al.  Fisher information and stochastic complexity , 1996, IEEE Trans. Inf. Theory.

[12]  Jorma Rissanen,et al.  The Minimum Description Length Principle in Coding and Modeling , 1998, IEEE Trans. Inf. Theory.

[13]  Jorma Rissanen,et al.  Strong optimality of the normalized ML models as universal codes and information in data , 2001, IEEE Trans. Inf. Theory.

[14]  Peter J. Bickel,et al.  Variable selection in nonparametric regression with categorical covariates , 1992 .

[15]  Sadanori Konishi,et al.  Model evaluation and information criteria in covariance structure analysis , 1999 .

[16]  Y. L. Tong The multivariate normal distribution , 1989 .

[17]  Ping Zhang On the convergence rate of model selection criteria , 1993 .

[18]  D. Duffy,et al.  Genetics of asthma and hay fever in Australian twins. , 1990, The American review of respiratory disease.

[19]  C. Mitchell Dayton,et al.  Model Selection Information Criteria for Non-Nested Latent Class Models , 1997 .

[20]  Bin Yu,et al.  Model Selection and the Principle of Minimum Description Length , 2001 .

[21]  David R. Anderson,et al.  Model selection and inference : a practical information-theoretic approach , 2000 .

[22]  H. Akaike,et al.  Information Theory and an Extension of the Maximum Likelihood Principle , 1973 .

[23]  G. Schwarz Estimating the Dimension of a Model , 1978 .

[24]  N. Sugiura  Further analysis of the data by Akaike's information criterion and the finite corrections , 1978 .

[25]  J. Rissanen  A universal prior for integers and estimation by minimum description length , 1983 .

[26]  A. Azzalini,et al.  The multivariate skew-normal distribution , 1996 .

[27]  Ross Ihaka, Robert Gentleman  R: A language for data analysis and graphics , 1996 .

[28]  I. J. Myung,et al.  Counting probability distributions: Differential geometry and model selection , 2000, Proc. Natl. Acad. Sci. USA.

[29]  Masanori Ichikawa  Empirical assessments of AIC procedure for model selection in factor analysis , 1988 .

[30]  Hagit Messer,et al.  Detection of signals by information theoretic criteria: general asymptotic performance analysis , 2002, IEEE Trans. Signal Process.

[31]  A. Azzalini,et al.  Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t‐distribution , 2003 .

[32]  Thomas M. Cover,et al.  Elements of Information Theory , 2005 .

[33]  A. Pickles,et al.  An Evaluation of Different Approaches for Behavior Genetic Analyses with Psychiatric Symptom Scores , 2000, Behavior genetics.

[34]  J. Rissanen,et al.  Modeling By Shortest Data Description* , 1978, Autom..