PARALLEL METHOD OF BIG DATA REDUCTION BASED ON STOCHASTIC PROGRAMMING APPROACH

Context. The problem of automating big data reduction in diagnostics and pattern recognition tasks is addressed. The object of the research is the process of big data reduction. The subject of the research is the methods of big data reduction. Objective. The objective is to develop a parallel method of big data reduction based on stochastic computations. Method. A parallel method of big data reduction is proposed. The method is based on a proposed system of criteria that estimates the concentration of control points around local extrema. The concentration estimates are computed from the spatial location of the control points in the current solution set. The criteria system can be used in stochastic search methods to detect excessive concentration of solutions near local optima and, as a consequence, to increase the diversity of the current population and to cover the search space with control points more uniformly during optimization. Results. Software has been developed that implements the proposed parallel method; it selects informative features and reduces big data for the synthesis of recognition models from given data samples. Conclusions. The experiments have confirmed that the proposed parallel method of big data reduction is workable and allow it to be recommended for processing data sets for pattern recognition in practice. Further research may include modifying known feature selection methods and developing new ones based on the proposed system of criteria for estimating control point concentration.
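To make the idea of a concentration estimate concrete, the following is a minimal sketch only. It assumes that each control point is a binary feature mask and that concentration is measured as one minus the mean normalized pairwise Hamming distance. Both choices, and the names `hamming` and `concentration`, are illustrative assumptions and are not the criteria system proposed in the paper.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions where two equal-length feature masks differ."""
    return sum(x != y for x, y in zip(a, b))


def concentration(population):
    """Hypothetical concentration estimate in [0, 1] (an assumption, not the paper's criterion).

    1.0 means all control points coincide; lower values mean the
    points are spread more widely over the search space.
    """
    n_features = len(population[0])
    pairs = list(combinations(population, 2))
    if not pairs or n_features == 0:
        return 1.0
    # Mean pairwise distance, normalized by the mask length.
    mean_dist = sum(hamming(a, b) for a, b in pairs) / (len(pairs) * n_features)
    return 1.0 - mean_dist


# Two toy populations of feature masks (1 = feature selected).
clustered = [(1, 1, 0, 0), (1, 1, 0, 0), (1, 1, 0, 1)]
spread = [(1, 1, 0, 0), (0, 0, 1, 1), (1, 0, 1, 0)]

print(concentration(clustered), concentration(spread))
```

A stochastic search could compare such a value against a threshold and, when it is exceeded, inject new or mutated control points to restore diversity. The threshold and the response are likewise assumptions of this sketch.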