Design of a Database-Driven Modeling Based on Variable Selection Using a Random Forest

Many industrial systems are nonlinear and structurally complicated, which makes them difficult to express as mathematical models. As one approach to this problem, the database-driven modeling (DDM) method, a kind of Just-In-Time (JIT) modeling, has been proposed for constructing nonlinear models for control. However, the modeling accuracy of the DDM method deteriorates for complicated systems that include many variables unrelated to the output. This study introduces a variable evaluation and selection method based on the Random Forest (RF) to improve the accuracy of the DDM method. RF can quantify each variable's degree of contribution to prediction as an importance score. The effectiveness of the proposed scheme is verified numerically through a simulation example.
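
The idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn's impurity-based `feature_importances_` as the RF importance measure, uses a distance-weighted average of the k nearest database records as a simplified stand-in for the DDM local model, and runs on a synthetic system invented for illustration. The names `select_variables`, `jit_predict`, `n_keep`, and `k` are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def select_variables(X, y, n_keep, seed=0):
    """Rank input variables by RF importance and keep the top n_keep (assumed rule)."""
    rf = RandomForestRegressor(n_estimators=100, random_state=seed)
    rf.fit(X, y)
    importance = rf.feature_importances_  # impurity-based importance, sums to 1
    keep = np.sort(np.argsort(importance)[::-1][:n_keep])
    return keep, importance


def jit_predict(X_db, y_db, query, k=10):
    """Simplified JIT step: weighted mean of the k nearest database records."""
    scale = X_db.std(axis=0) + 1e-12          # normalize each variable
    dist = np.linalg.norm((X_db - query) / scale, axis=1)
    idx = np.argsort(dist)[:k]                # neighbors chosen at query time
    w = 1.0 / (dist[idx] + 1e-8)              # closer records weigh more
    return float(np.dot(w, y_db[idx]) / w.sum())


# Synthetic nonlinear system (assumption): only x0 and x1 affect the output;
# the remaining eight variables are unrelated to it.
rng = np.random.default_rng(0)
n, d = 600, 10
X = rng.uniform(-1.0, 1.0, size=(n, d))
y = np.sin(np.pi * X[:, 0]) + 2.0 * X[:, 1] ** 2 + 0.05 * rng.normal(size=n)

X_db, y_db = X[:500], y[:500]   # stored database
X_te, y_te = X[500:], y[500:]   # queries used for evaluation

keep, importance = select_variables(X_db, y_db, n_keep=2)


def rmse(cols):
    """Prediction error of the JIT model restricted to the given columns."""
    pred = np.array([jit_predict(X_db[:, cols], y_db, q[cols]) for q in X_te])
    return float(np.sqrt(np.mean((pred - y_te) ** 2)))


rmse_all = rmse(np.arange(d))   # distances computed over all variables
rmse_sel = rmse(keep)           # distances computed over selected variables only
print("selected variables:", keep)
print("RMSE (all variables):", rmse_all)
print("RMSE (selected variables):", rmse_sel)
```

In this sketch the unrelated variables distort the neighbor search, so restricting the distance computation to the variables ranked highest by RF importance is expected to lower the prediction error. The number of variables to keep is fixed by hand here; how the threshold is chosen in the proposed scheme is not stated in the abstract.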