Using Technologies of OLAP and Machine Learning for Validation of the Numerical Models of Convective Clouds

This paper continues the work of [1, 2, 3], which presented a complex information system for organizing the input data for models of convective clouds. In the present work, we use this information system to obtain a statistically significant amount of meteorological data about the state of the atmosphere at the places and times where dangerous convective phenomena were recorded. A corresponding amount of data has been collected on the state of the atmosphere in cases where no dangerous convective phenomena were observed. Features for thunderstorm forecasting are selected using the recursive feature elimination with cross-validation algorithm. Three machine learning methods (Support Vector Machine, Logistic Regression, and Ridge Regression) are used to decide whether a dangerous convective phenomenon occurs under the given atmospheric conditions. OLAP technology is used to develop the concept of a multidimensional database intended for distinguishing the types of phenomena (thunderstorm, heavy rainfall, and light rain).
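To make the idea of a multidimensional database concrete, the following is a minimal sketch of two basic OLAP operations, roll-up and slice, over a small fact table. It is an illustration under stated assumptions, not the schema developed in this work: the dimension names (phenomenon, season, station), the case-count measure, the helper functions `roll_up` and `slice_cube`, and all values are hypothetical and synthetic.

```python
from collections import defaultdict

# Synthetic fact rows, invented purely for illustration (not data from the paper).
# Assumed dimensions: phenomenon type, season, station; measure: number of cases.
DIMENSIONS = ("phenomenon", "season", "station")
FACTS = [
    ("thunderstorm", "summer", "A", 4),
    ("thunderstorm", "spring", "B", 1),
    ("heavy rainfall", "summer", "A", 3),
    ("heavy rainfall", "autumn", "B", 2),
    ("light rain", "spring", "A", 5),
    ("light rain", "summer", "B", 6),
]


def roll_up(facts, keep):
    """Sum the measure over every dimension that is not listed in `keep`."""
    idx = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[i] for i in idx)
        totals[key] += row[-1]  # the measure is the last field of each row
    return dict(totals)


def slice_cube(facts, dimension, value):
    """Keep only the facts whose `dimension` equals `value`."""
    i = DIMENSIONS.index(dimension)
    return [row for row in facts if row[i] == value]


# Roll-up: case counts per phenomenon type, aggregated over season and station.
by_phenomenon = roll_up(FACTS, keep=("phenomenon",))

# Slice then roll-up: the same counts restricted to the summer season.
summer_by_phenomenon = roll_up(
    slice_cube(FACTS, "season", "summer"), keep=("phenomenon",)
)

print(by_phenomenon)
print(summer_by_phenomenon)
```

In a real OLAP system the measures would be atmospheric parameters rather than simple counts, and the aggregation would be performed by the database engine; the sketch only shows how the same facts can be viewed along different dimensions.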