AIC identifies optimal representation of longitudinal dietary variables

OBJECTIVES: The Akaike Information Criterion (AIC) is a well-known tool for variable selection in multivariable modeling and for identifying the optimal representation of explanatory variables. However, it has been discussed infrequently in the dental literature. The purpose of this paper is to demonstrate the use of AIC in determining the optimal representation of dietary variables in a longitudinal dental study.

METHODS: The Iowa Fluoride Study enrolled children at birth, and dental examinations were conducted at ages 5, 9, 13, and 17. Decayed or filled surfaces (DFS) trend clusters were created based on age 13 DFS counts and age 13-17 DFS increments. Dietary intake data (water, milk, 100 percent juice, and sugar-sweetened beverages) were collected semiannually using a food frequency questionnaire. Multinomial logistic regression models were fit to predict DFS cluster membership (n=344). The dietary data could be represented in several ways: by averaging across all collected surveys, by averaging over shorter time periods to capture age-specific trends, or by using the individual time points of dietary data.

RESULTS: AIC helped identify the optimal representation. Averaging data for all four dietary variables over the whole period from age 9.0 to 17.0 provided a better representation in the multivariable full model (AIC=745.0) than the other methods assessed in full models (AIC=750.6 for age 9 and 9-13 increment dietary measurements, and AIC=762.3 for age 9, 13, and 17 individual measurements). The results illustrate that AIC can help researchers identify the optimal way to summarize information for inclusion in a statistical model.

CONCLUSIONS: The method presented here can be used by researchers performing statistical modeling in dental research. It offers an alternative to significance-based procedures for assessing the appropriateness of a variable representation, which could potentially lead to improved research in the dental community.
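
The comparison reported in the results can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function names (`aic`, `delta_aic`) and the dictionary labels are hypothetical, and the only data used are the three AIC values quoted in the abstract. It relies on the standard definition AIC = 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximized likelihood; among candidate models fit to the same data, the one with the lowest AIC is preferred.

```python
def aic(log_likelihood, n_params):
    """Standard AIC definition: 2k - 2*ln(L), with ln(L) the maximized log-likelihood."""
    return 2 * n_params - 2 * log_likelihood


# AIC values quoted in the abstract for the three full models.
# The labels are hypothetical shorthand for the dietary representations.
reported_aic = {
    "average over ages 9.0-17.0": 745.0,
    "age 9 plus 9-13 increment": 750.6,
    "ages 9, 13, and 17 individually": 762.3,
}


def delta_aic(aic_by_model):
    """Difference between each model's AIC and the smallest AIC in the set."""
    best = min(aic_by_model.values())
    # Round to one decimal place to match the precision of the reported values.
    return {name: round(value - best, 1) for name, value in aic_by_model.items()}


deltas = delta_aic(reported_aic)
best_model = min(reported_aic, key=reported_aic.get)

print("Preferred representation:", best_model)
for name, delta in sorted(deltas.items(), key=lambda item: item[1]):
    print(f"{name}: AIC={reported_aic[name]:.1f}, delta={delta:.1f}")
```

In practice the log-likelihood and parameter count passed to `aic` would come from each fitted multinomial logistic regression model; here only the already-reported AIC values are ranked, and the differences show how far each alternative representation falls from the preferred one.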
