ON THE PROBLEM OF FUNCTION APPROXIMATION BY SAMPLE SET PROCESSING FOR AN INCOMPLETELY DETERMINED MODEL

In mathematical statistics, the problem of function approximation is usually posed as follows. One is given a finite system of functions $\{\varphi_i(x)\}$ such that the unknown function to be approximated can be expressed as a linear combination of the functions $\varphi_i(x)$ plus a random error. The problem of estimating the unknown coefficients of this linear combination from a sample set has been well studied for various cases. In many practical problems, however, the system of functions $\{\varphi_i(x)\}$ is known only partially. We shall call such models incompletely determined. The model for predicting solar flares (Reference 4) and some models for medical diagnosis are typical examples. Such models raise problems that do not exist for completely determined models. Among them is the question: should one use all available functions $\varphi_i(x)$ in the model, or discard some of them? If the latter is preferable (and we shall see that it is), what is the criterion of selection? If the complete model can be used, is that always preferable to discarding some functions? We shall show that the use of an incompletely determined model sometimes leads to better results. The influence of the precision of the model and of the experimental data on the selection criterion is also considered. The case with errors in both the dependent and the independent variables is given special attention. The author studied such problems in Reference 5 for a simplified and, in a sense, artificial model of function approximation. In this paper, the study is carried out for the method of least squares, which is often used for function approximation in practice.
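The question of whether to keep or discard basis functions can be explored numerically. The following Python sketch is only an illustration under assumed conditions: the cubic polynomial basis, the coefficient values, the noise level, and the sample size are all hypothetical, and solving the normal equations by Gaussian elimination is one simple way to obtain a least-squares estimate, not necessarily the procedure analyzed in this paper. It fits nested truncations of the basis to one noisy sample set and reports how far each fitted model lies from the noise-free target.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_least_squares(basis, xs, ys):
    """Least-squares coefficients for y ~ sum_i c_i * phi_i(x) (normal equations)."""
    design = [[phi(x) for phi in basis] for x in xs]
    k = len(basis)
    gram = [[sum(row[i] * row[j] for row in design) for j in range(k)]
            for i in range(k)]
    rhs = [sum(row[i] * y for row, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(gram, rhs)


def predict(basis, coeffs, x):
    """Evaluate the linear combination sum_i c_i * phi_i(x)."""
    return sum(c * phi(x) for c, phi in zip(coeffs, basis))


def mean_squared_error(basis, coeffs, xs, target):
    """Mean squared deviation of the fitted model from the noise-free target."""
    return sum((predict(basis, coeffs, x) - target(x)) ** 2 for x in xs) / len(xs)


def compare_models(n_samples=8, noise_sigma=0.5, seed=0):
    """Fit the first k basis functions for each k and return {k: error}."""
    rng = random.Random(seed)
    # Hypothetical basis and coefficients, chosen only for illustration.
    full_basis = [lambda x: 1.0, lambda x: x, lambda x: x ** 2, lambda x: x ** 3]
    true_coeffs = [1.0, 2.0, 0.3, 0.1]

    def target(x):
        return predict(full_basis, true_coeffs, x)

    xs = [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]
    ys = [target(x) + rng.gauss(0.0, noise_sigma) for x in xs]
    grid = [-1.0 + 2.0 * i / 200 for i in range(201)]

    errors = {}
    for k in range(1, len(full_basis) + 1):
        basis = full_basis[:k]  # keep the first k functions, discard the rest
        coeffs = fit_least_squares(basis, xs, ys)
        errors[k] = mean_squared_error(basis, coeffs, grid, target)
    return errors


if __name__ == "__main__":
    for k, err in compare_models().items():
        print(f"functions kept: {k}  error against noise-free target: {err:.4f}")
```

Varying `noise_sigma` and `n_samples` in this sketch gives a rough feel for how the precision and amount of data interact with the number of functions retained; no particular outcome is asserted here.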