In this paper, we propose a robust election simulation model and an independently developed election anomaly detection algorithm that demonstrates the simulation’s utility. The simulation generates artificial elections whose properties and trends resemble those of real-world elections, while giving users control over, and full knowledge of, every important component of the election. We generate a clean election results dataset without fraud as well as datasets with varying degrees of fraud, and then measure how well the algorithm detects the level of fraud present. The algorithm measures how closely actual election results match the results predicted from polling and from a regression model fitted to other regions with similar demographics. We use k-means to partition electoral regions into clusters such that demographic homogeneity is maximized within each cluster. We then apply a novelty detection algorithm, implemented as a one-class Support Vector Machine, in which the clean data is provided in the form of polling predictions and regression predictions. The regression predictions are built from the actual data in such a way that the data supervises itself. We show the effectiveness of both the simulation technique and the machine learning model in identifying fraudulent regions.
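As an illustration only, the pipeline described above could be sketched as follows, assuming NumPy and scikit-learn are available. The toy simulator, the two demographic features, the leave-one-out regression within each cluster, and all names and parameter values (`simulate`, `residual_features`, `shift`, `nu`, the cluster count) are assumptions introduced for this sketch and are not taken from the paper. In particular, fitting the one-class SVM on residuals from a simulated clean election is one plausible reading of how the clean data is supplied, not necessarily the authors' procedure.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression
from sklearn.svm import OneClassSVM


def simulate(seed, n_regions=300, fraud_rate=0.0, shift=0.25):
    """Toy election (hypothetical): vote share is linear in two demographic features."""
    rng = np.random.default_rng(seed)
    demo = rng.normal(size=(n_regions, 2))
    true_share = np.clip(0.5 + 0.08 * demo[:, 0] - 0.05 * demo[:, 1], 0.05, 0.95)
    poll = true_share + rng.normal(scale=0.02, size=n_regions)
    actual = true_share + rng.normal(scale=0.02, size=n_regions)
    # Fraud is modelled as a fixed upward shift in randomly chosen regions.
    tampered = rng.random(n_regions) < fraud_rate
    actual = np.clip(actual + shift * tampered, 0.0, 1.0)
    return demo, poll, actual, tampered


def residual_features(demo, poll, actual, n_clusters=3):
    """Deviation of the actual result from the poll and from a peer regression."""
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(demo)
    reg_pred = np.empty_like(actual)
    for i in range(len(actual)):
        # Peers are the other regions in the same demographic cluster;
        # region i is left out so that it does not predict itself.
        peers = labels == labels[i]
        peers[i] = False
        model = LinearRegression().fit(demo[peers], actual[peers])
        reg_pred[i] = model.predict(demo[i:i + 1])[0]
    return np.column_stack([actual - poll, actual - reg_pred])


# Fit the novelty detector on residuals from a clean simulated election.
demo_c, poll_c, actual_c, _ = simulate(seed=0, fraud_rate=0.0)
detector = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale")
detector.fit(residual_features(demo_c, poll_c, actual_c))

# Score a second simulated election in which about 10% of regions are tampered.
demo_f, poll_f, actual_f, tampered = simulate(seed=1, fraud_rate=0.1)
flagged = detector.predict(residual_features(demo_f, poll_f, actual_f)) == -1

recall = flagged[tampered].mean()        # share of tampered regions flagged
false_alarm = flagged[~tampered].mean()  # share of clean regions flagged
print(f"recall={recall:.2f}, false alarm rate={false_alarm:.2f}")
```

Because the simulator records which regions were tampered with, detection quality can be measured directly, which is the kind of ground truth that real election data does not provide. Note that in this sketch the peer regression is fitted on possibly tampered results, so heavy fraud within a cluster will bias its predictions.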