A combined first-principles and data-driven approach to model building

Abstract: We address a central theme of empirical model building: the incorporation of first-principles information in a data-driven model-building process. Modelers can then leverage all available information, constructing regression models from measured data together with theory-driven knowledge of response variable bounds, thermodynamic limitations, boundary conditions, and other aspects of system knowledge. We extend regression constraints beyond relationships among the regression parameters themselves to relationships between combinations of predictors and response variables. Because constraints of this form are stated in terms of predictors and responses, they are more intuitive to write down, and they can reveal relationships between regression parameters that are not directly apparent to the modeler. First, we describe classes of a priori modeling constraints. Next, we propose a semi-infinite programming approach for incorporating these novel constraints. Finally, we detail several application areas and provide extensive computational results.
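
To make the semi-infinite structure concrete, the following is a minimal sketch of how a response-variable bound might be written as a regression constraint. It is an illustration under stated assumptions, not the formulation used in the paper: the linear-in-parameters model, the basis functions $X_j$, the lower bound $z^{\mathrm{lo}}$, and the domain $\Omega$ are all hypothetical symbols introduced here.

```latex
% Assumed notation (not from the abstract):
%   (x_i, z_i), i = 1..N   measured data
%   X_j(x), j = 1..p       chosen basis functions
%   \beta_j                regression parameters
%   z^{lo}                 a known lower bound on the response
%   \Omega                 the predictor domain on which the bound must hold
\begin{align}
  \min_{\beta} \quad
    & \sum_{i=1}^{N} \Bigl( z_i - \sum_{j=1}^{p} \beta_j X_j(x_i) \Bigr)^{2} \\
  \text{s.t.} \quad
    & \sum_{j=1}^{p} \beta_j X_j(x) \;\ge\; z^{\mathrm{lo}}
      \qquad \forall\, x \in \Omega
\end{align}
```

The problem has finitely many variables ($\beta$) but one constraint for every point of $\Omega$, which is what makes it semi-infinite. One standard way to handle such problems, which the paper may or may not adopt, is to enforce the constraint on a finite subset of $\Omega$, solve the resulting finite problem, search $\Omega$ for the most violated point, add that point to the subset, and repeat until no violation remains.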
