Testing correct model specification using extreme learning machines

Abstract

Testing the correct model specification hypothesis for artificial neural network (ANN) models of the conditional mean is not standard. The traditional Wald, Lagrange multiplier, and quasi-likelihood ratio statistics weakly converge to functions of Gaussian processes, rather than to convenient chi-squared distributions. Also, their large-sample null distributions are problem dependent, limiting applicability. We overcome this challenge by applying functional regression methods of Cho et al. [8] to extreme learning machines (ELM). The Wald ELM (WELM) test statistic proposed here is easy to compute and has a large-sample standard chi-squared distribution under the null hypothesis of correct specification. We provide associated theory for time-series data and affirm our theory with some Monte Carlo experiments.

[1]  A. Wald Tests of statistical hypotheses concerning several parameters when the number of observations is large , 1943 .

[2]  Jin Seo Cho,et al.  Testing for Regime Switching , 2007 .

[3]  W. Newey,et al.  A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix , 1986 .

[4]  J. Cho,et al.  Testing for Unobserved Heterogeneity in Exponential and Weibull Duration Models , 2009 .

[5]  Halbert White,et al.  Approximate Nonlinear Forecasting Methods , 2006 .

[6]  Prem S. Puri,et al.  On Optimal Asymptotic Tests of Composite Statistical Hypotheses , 1967 .

[7]  Chee Kheong Siew,et al.  Extreme learning machine: Theory and applications , 2006, Neurocomputing.

[8]  Kurt Hornik,et al.  Multilayer feedforward networks are universal approximators , 1989, Neural Networks.

[9]  R. Rao Relations between Weak and Uniform Convergence of Measures with Applications , 1962 .

[10]  E. Candès Ridgelets: estimating with ridge functions , 2003 .

[11]  Qinyu Zhu Extreme Learning Machine , 2013 .

[12]  E. Giné,et al.  Bootstrapping General Empirical Measures , 1990 .

[13]  H. White Asymptotic theory for econometricians , 1985 .

[14]  Jerzy Neyman,et al.  Asymptotically Optimal Tests of Composite Hypotheses for Randomized Experiments with Noncontrolled Predictor Variables , 1965 .

[15]  Halbert White,et al.  Chapter 9 Approximate Nonlinear Forecasting Methods , 2006 .

[16]  C. Granger,et al.  Handbook of Economic Forecasting , 2006 .

[17]  Halbert White,et al.  Estimation, inference, and specification analysis , 1996 .

[18]  J. Cho,et al.  Testing for a Constant Mean Function using Functional Regression , 2009 .

[19]  H. White Some Asymptotic Results for Learning in Single Hidden-Layer Feedforward Network Models , 1989 .

[20]  Halbert White,et al.  Artificial neural networks: an econometric perspective , 1994 .

[21]  M. Bartlett Periodogram analysis and continuous spectra , 1950, Biometrika.

[22]  Maxwell B. Stinchcombe,et al.  Consistent Specification Testing with Nuisance Parameters Present Only under the Alternative , 1998, Econometric Theory.

[23]  H. White,et al.  A Unified Theory of Estimation and Inference for Nonlinear Dynamic Models , 1988 .

[24]  M. Bartlett,et al.  Approximate Confidence Intervals , 1953 .

[25]  H. White Maximum Likelihood Estimation of Misspecified Models , 1982 .

[26]  Bruce E. Hansen,et al.  Inference When a Nuisance Parameter Is Not Identified under the Null Hypothesis , 1996 .

[27]  U. Grenander,et al.  Probability and Statistics: The Harald Cramer Volume , 1960 .

[28]  Kurt Hornik,et al.  Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks , 1990, Neural Networks.

[29]  Michael C. Mozer,et al.  Mathematical Perspectives on Neural Networks , 1996 .

[30]  Halbert White,et al.  Revisiting Tests for Neglected Nonlinearity Using Artificial Neural Networks , 2011, Neural Computation.

[31]  P. Massart,et al.  Invariance principles for absolutely regular empirical processes , 1995 .

[32]  R. Davies Hypothesis testing when a nuisance parameter is present only under the alternative , 1977 .

[33]  Herman J. Bierens,et al.  A consistent conditional moment test of functional form , 1990 .

[34]  Bruce E. Hansen,et al.  Interval forecasts and parameter uncertainty , 2006 .

[35]  H. White,et al.  An additional hidden unit test for neglected nonlinearity in multilayer feedforward networks , 1989, International 1989 Joint Conference on Neural Networks.

[36]  W. Newey,et al.  A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix , 1986 .

[37]  P. McCullagh Tensor Methods in Statistics , 1987 .

[38]  U. Grenander,et al.  Probability and Statistics: The Harald Cramer Volume , 1961 .

[39]  R. Bass,et al.  Review: P. Billingsley, Convergence of probability measures , 1971 .

[40]  I. G. Žhurbenko,et al.  The power of the optimal asymptotic tests of composite statistical hypotheses , 1975, Proceedings of the National Academy of Sciences of the United States of America.

[41]  Tao Xiong,et al.  A combined SVM and LDA approach for classification , 2005, Proceedings. 2005 IEEE International Joint Conference on Neural Networks, 2005..

[42]  Winfried Stute,et al.  Distribution free specification tests of conditional models , 2008 .

[43]  Jin Seo Cho,et al.  Generalized runs tests for the IID hypothesis , 2011 .

[44]  H. White Nonparametric Estimation of Conditional Quantiles Using Neural Networks , 1990 .

[45]  Halbert White,et al.  Connectionist nonparametric regression: Multilayer feedforward networks can learn arbitrary mappings , 1990, Neural Networks.

[46]  M. Bartlett,et al.  Approximate Confidence Intervals. More than One Unknown Parameter , 1953 .

[47]  Clive W. J. Granger,et al.  Testing for neglected nonlinearity in time series models: A comparison of neural network methods and alternative tests , 1993 .