Missing data filling method based on linear interpolation and LightGBM

With the rapid development of integrated energy and the digital transformation of power grids, data plays an increasingly important role in the safe operation of power grids. To increase the value of data applications and ensure their accuracy, this paper proposes a data filling method that combines linear interpolation and LightGBM (Light Gradient Boosting Machine) to address missing values that arise during source network data collection. The method consists of two steps. First, linear interpolation is used to fill short-term missing data. Second, LightGBM is used to fill long-term missing data; because the model's input features may themselves contain gaps, linear interpolation is applied to the independent variables before they are fed to the model. This process yields the missing ratio of the data, which is then used to complete all data filling in order of missing ratio from high to low. Tests on real data show that the method achieves better data filling performance.
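The first step can be illustrated with a minimal sketch. It assumes that a "short-term" gap is a run of at most `max_gap` consecutive missing samples with a valid neighbour on each side; the threshold, the function names, and the use of `None` to mark missing values are illustrative assumptions, not details given in the paper. Gaps that do not qualify are returned untouched so that a separate model such as LightGBM could handle them in the second step.

```python
from typing import List, Optional, Tuple

Series = List[Optional[float]]


def find_gaps(series: Series) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, for each run of None."""
    gaps = []
    i, n = 0, len(series)
    while i < n:
        if series[i] is None:
            j = i
            while j < n and series[j] is None:
                j += 1
            gaps.append((i, j))
            i = j
        else:
            i += 1
    return gaps


def fill_short_gaps(series: Series, max_gap: int = 3) -> Tuple[Series, List[Tuple[int, int]]]:
    """Linearly interpolate gaps of at most max_gap samples.

    Returns the partially filled series and the list of gaps left unfilled
    (too long, or touching the start or end of the series).
    max_gap is a hypothetical threshold chosen for illustration.
    """
    filled = list(series)
    long_gaps = []
    for start, end in find_gaps(series):
        left, right = start - 1, end  # nearest valid neighbours
        bounded = left >= 0 and right < len(series)
        if bounded and end - start <= max_gap:
            y0, y1 = series[left], series[right]
            span = right - left
            for k in range(start, end):
                # Straight line between the two valid neighbours
                filled[k] = y0 + (y1 - y0) * (k - left) / span
        else:
            long_gaps.append((start, end))
    return filled, long_gaps


# Example with made-up load values: one single-sample gap and one five-sample gap
load = [10.0, None, 14.0, None, None, None, None, None, 26.0, 28.0]
filled, long_gaps = fill_short_gaps(load, max_gap=2)
# filled[1] is interpolated between 10.0 and 14.0;
# the five-sample gap at indices 3..7 is reported in long_gaps and left as None.
```

Separating the two gap classes in this way keeps the interpolation step cheap and leaves the longer gaps, where a straight line is a poor approximation, to the learned model.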