We have finished implementing the following modules:
Module 2, Phase 1: Introduced basic fraud techniques such as ballot stuffing and faulty voting machines.
Module 3, Phase 1: We have implemented the Benford test, which tells us whether the second digits of the vote counts, aggregated to the county level, follow the second-digit distribution given by Benford’s Law. If the Benford statistic for a specific candidate produces a p-value below 5%, the counts for that candidate are considered untrusted.
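To make the test concrete, here is a minimal sketch of a second-digit Benford check in Python. It is not our module's actual code: the function names are hypothetical, and it assumes the statistic is a Pearson chi-square over the ten possible second digits (9 degrees of freedom), so that a p-value below 5% corresponds to a statistic above roughly 16.919. Counts below 10 have no second digit and are skipped.

```python
import math

# Approximate 5% critical value of the chi-square distribution with
# 9 degrees of freedom (assumption: the test is a Pearson chi-square).
CHI2_CRIT_9DF_5PCT = 16.919


def second_digit_benford_probs():
    """Benford's Law probabilities for the second digit d = 0..9."""
    return [
        sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))
        for d in range(10)
    ]


def second_digit_statistic(counts):
    """Pearson chi-square statistic of observed second digits vs. Benford."""
    # Only counts with at least two digits have a second digit.
    digits = [int(str(c)[1]) for c in counts if c >= 10]
    n = len(digits)
    if n == 0:
        raise ValueError("no counts with a second digit")
    probs = second_digit_benford_probs()
    observed = [digits.count(d) for d in range(10)]
    return sum(
        (observed[d] - n * probs[d]) ** 2 / (n * probs[d]) for d in range(10)
    )


def is_untrusted(counts):
    """True when the statistic exceeds the 5% critical value."""
    return second_digit_statistic(counts) > CHI2_CRIT_9DF_5PCT
```

For example, `is_untrusted(county_counts)` would be called once per candidate, where `county_counts` (a hypothetical name) is the list of that candidate's county-level totals.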
We have encountered a problem: even though our simulated election in one state is not tampered with, the counts for one candidate do not pass the Benford test. The reason is that the kind of complexity that produces counts whose digits follow Benford’s Law arises from processes that are statistical mixtures (e.g., Janvresse and de la Rue (2004)), meaning that random portions of the data come from different statistical distributions. The way we randomly assign votes to each candidate therefore needs to be rethought. There are limits on the extent of the mixing, however. If the number of distinct distributions is large, the result is likely to be well approximated by some simple random process that does not satisfy Benford’s Law. So if we are to believe that Benford’s Law should in general describe the digits in vote counts, we need a behaviorally realistic process that mixes among a small number of distributions.
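One possible shape for such a process is sketched below in Python. It is an illustration of the idea only, not a settled design: the function name, the log-normal county sizes, and the Gaussian vote-share components are all assumptions, and we have not yet verified that counts generated this way pass the Benford test. Each county first picks one of a small number of component distributions at random, then draws the candidate's vote share from that component.

```python
import random


def simulate_mixture_counts(n_counties, components, seed=None):
    """Draw one candidate's county-level counts from a small mixture.

    components: list of (weight, mean_share, sd_share) tuples; each tuple
    is a hypothetical vote-share distribution for one type of county.
    """
    rng = random.Random(seed)
    weights = [w for w, _, _ in components]
    counts = []
    for _ in range(n_counties):
        # Assumption: county electorate sizes are roughly log-normal.
        electorate = int(rng.lognormvariate(9.0, 1.0)) + 1
        # Pick which component distribution this county's share comes from.
        _, mean, sd = rng.choices(components, weights=weights, k=1)[0]
        # Draw the vote share and clamp it to the valid range [0, 1].
        share = min(max(rng.gauss(mean, sd), 0.0), 1.0)
        counts.append(round(electorate * share))
    return counts


# Usage: a two-component mixture (illustrative parameters only).
example_components = [(0.5, 0.30, 0.05), (0.5, 0.60, 0.05)]
example_counts = simulate_mixture_counts(200, example_components, seed=1)
```

Keeping the component list short reflects the limit described above: with many distinct components the mixture is likely to behave like a single simple random process again.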