Machine Learning Algorithms and Time Series Feature Extraction Library for Electricity Consumption Fraud Detection in Smart Grids
Smart meters read consumption at different time resolutions and can generate large volumes of time series that require special tools for consumption monitoring and fraud detection. The readings usually contain numerous null values and outliers that affect the results of fraud detection. Feature extraction from time series is a challenge, especially when patterns and irregularities have to be identified. Therefore, we propose to implement ten supervised Machine Learning (ML) algorithms together with TSFEL (Time Series Feature Extraction Library), a very recent Python library that automatically extracts over 60 features from time series, covering statistical (such as mean absolute deviation, variance, interquartile range), temporal (such as autocorrelation, mean absolute differences, entropy, peak to peak distance) and spectral (such as FFT mean coefficient, wavelet absolute mean, standard deviation, spectral distance, fundamental frequency) domains. Two algorithms, Multi-Layer Perceptron and Light Gradient Boost, provide very good results in identifying suspicious consumers on a real consumption dataset recorded in China by the utility company State Grid Corporation of China. The performance of the ML algorithms with TSFEL is compared with the case without feature engineering. A data processing methodology that includes several significant stages before model training is also proposed.
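To make the kind of per-consumer features named above more concrete, the following is a minimal sketch that hand-computes a few of them (mean absolute deviation, variance, interquartile range, mean absolute differences, peak to peak distance) using only the Python standard library. It is not the TSFEL implementation and not the authors' pipeline. The function names fill_missing, clip_outliers and extract_features are hypothetical, and the cleaning rules (linear interpolation of null readings, capping values above mean + k standard deviations) are assumptions chosen for illustration, since the abstract does not state which rules are used.

```python
import statistics
from typing import Dict, List, Optional, Sequence


def fill_missing(readings: Sequence[Optional[float]]) -> List[float]:
    """Linearly interpolate None gaps; gaps at the edges copy the nearest reading.

    Assumed cleaning rule for illustration, not taken from the paper.
    """
    known = [i for i, v in enumerate(readings) if v is not None]
    if not known:
        raise ValueError("series has no valid readings")
    filled: List[float] = []
    for i, v in enumerate(readings):
        if v is not None:
            filled.append(float(v))
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled.append(float(readings[right]))
        elif right is None:
            filled.append(float(readings[left]))
        else:
            weight = (i - left) / (right - left)
            filled.append(readings[left] + weight * (readings[right] - readings[left]))
    return filled


def clip_outliers(values: Sequence[float], k: float = 3.0) -> List[float]:
    """Cap readings above mean + k * std (assumed outlier rule, hypothetical k)."""
    upper = statistics.fmean(values) + k * statistics.pstdev(values)
    return [min(v, upper) for v in values]


def extract_features(values: Sequence[float]) -> Dict[str, float]:
    """Compute a handful of statistical and temporal features for one consumer."""
    mean = statistics.fmean(values)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    abs_diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return {
        "mean_abs_deviation": statistics.fmean([abs(v - mean) for v in values]),
        "variance": statistics.pvariance(values),
        "interquartile_range": q3 - q1,
        "mean_abs_diff": statistics.fmean(abs_diffs),
        "peak_to_peak": max(values) - min(values),
    }


if __name__ == "__main__":
    # Toy daily readings (kWh) with null values; not real data.
    raw = [1.0, None, 3.0, None, None, 6.0]
    cleaned = clip_outliers(fill_missing(raw))
    for name, value in extract_features(cleaned).items():
        print(f"{name}: {value:.3f}")
```

In a setup like the one the abstract describes, a feature dictionary of this kind would be produced for each consumer and the resulting table passed to a supervised classifier. TSFEL automates this step across its statistical, temporal and spectral domains, so a hand-written version like this is mainly useful for understanding what the extracted columns mean.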