Novelty Detection for Election Fraud: A Case Study with Agent-Based Simulation Data

In this paper, we propose a robust election simulation model and an independently developed election anomaly detection algorithm that demonstrates the simulation's utility. The simulation generates artificial elections whose properties and trends resemble those of real-world elections, while giving users control over, and full knowledge of, all the important components of the elections. We generate a clean election results dataset without fraud as well as datasets with varying degrees of fraud. We then measure how well the algorithm detects the level of fraud present. The algorithm determines how similar the actual election results are to the results predicted from polling and from a regression model of other regions with similar demographics. We use k-means to partition electoral regions into clusters such that demographic homogeneity is maximized within each cluster. We then use a novelty detection algorithm, implemented as a one-class Support Vector Machine, in which the clean data is provided in the form of polling predictions and regression predictions. The regression predictions are built from the actual data in such a way that the data supervises itself. We show the effectiveness of both the simulation technique and the machine learning model in identifying fraudulent regions.
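The detection pipeline summarized above (k-means clustering, per-cluster regression, one-class SVM trained only on clean predictions) can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic data, the number of clusters, the choice of scikit-learn, and in particular the residual feature fed to the SVM are all assumptions made here for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 300 regions with 3 demographic features each.
n_regions = 300
demographics = rng.normal(size=(n_regions, 3))
true_share = 0.5 + 0.1 * demographics[:, 0] - 0.05 * demographics[:, 1]
polls = true_share + rng.normal(scale=0.02, size=n_regions)    # polling predictions
results = true_share + rng.normal(scale=0.02, size=n_regions)  # reported results

# Inject fraud into 15 regions by inflating the reported vote share.
is_fraud = np.zeros(n_regions, dtype=bool)
is_fraud[rng.choice(n_regions, size=15, replace=False)] = True
results[is_fraud] += 0.25

# Step 1: group regions with similar demographics (4 clusters is an assumption).
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(demographics)

# Step 2: within each cluster, regress reported results on demographics,
# so the prediction for a region is driven by demographically similar regions.
reg_pred = np.empty(n_regions)
for c in np.unique(labels):
    mask = labels == c
    model = LinearRegression().fit(demographics[mask], results[mask])
    reg_pred[mask] = model.predict(demographics[mask])

# Step 3: one-class SVM trained only on "clean" references.
# Assumed feature: the SVM learns how much the two predictors (polls and
# regression) normally disagree, then scores how far each reported result
# sits from the average of those two predictors.
train = (polls - reg_pred).reshape(-1, 1)
test = (results - 0.5 * (polls + reg_pred)).reshape(-1, 1)

scaler = StandardScaler().fit(train)
svm = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(scaler.transform(train))

# OneClassSVM.predict returns -1 for novelties and +1 for inliers.
flagged = svm.predict(scaler.transform(test)) == -1

print("fraudulent regions flagged:", int(flagged[is_fraud].sum()), "of", int(is_fraud.sum()))
print("clean regions flagged:", int(flagged[~is_fraud].sum()), "of", int((~is_fraud).sum()))
```

In this sketch the regression is fitted on the reported results themselves, so fraudulent regions slightly contaminate the predictions for their cluster; the effect stays small only while fraud affects a minority of regions in each cluster.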