Our approach rests on two basic methods. The first, the so-called weighted average, averages the results of openly available surveys. The data is not comprehensive because the sources used in our calculations are imperfect: with rare exceptions, the authors of the surveys do not publish their methodology, the wording of the questions, or the raw data. Forecasting with the weighted average is simple: only the answers indicating support for a particular political party are taken into account. For instance, the proportion of undecided voters in a poll may be 50%, but we ignore them in the calculation by assuming that their voting preferences will be distributed in the same proportions as those of decided voters. As a result, we calculate the percentages among decided voters only. Clearly, this assumption may not always hold. A good example is the 2012 Georgian parliamentary elections, when a large majority of the voters who were undecided in the pre-election polls voted for the opposition coalition.
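The rescaling among decided voters can be illustrated with a short sketch. The party names and numbers below are hypothetical and are not taken from any real poll; the function name decided_shares is likewise an assumption made for illustration.

```python
def decided_shares(poll):
    """Rescale party support so that it sums to 100% among decided voters.

    `poll` maps party name -> share of ALL respondents (in percent).
    Undecided respondents are simply absent from the mapping.
    """
    decided_total = sum(poll.values())
    if decided_total <= 0:
        raise ValueError("poll contains no decided voters")
    return {party: 100.0 * share / decided_total for party, share in poll.items()}


# Hypothetical poll: 50% of respondents are undecided,
# so the party answers add up to only 50%.
raw_poll = {"Party A": 30.0, "Party B": 15.0, "Party C": 5.0}
shares = decided_shares(raw_poll)
print(shares)  # {'Party A': 60.0, 'Party B': 30.0, 'Party C': 10.0}
```

In this example a party named by 30% of all respondents is credited with 60% among decided voters, which is exactly the proportionality assumption described above.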
Another feature of this approach is the timing of the poll. When calculating the weighted average, recently conducted surveys receive more weight and older ones less. We assume that the closer a survey is to voting day, the more accurately it reflects the public mood.
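A recency-weighted average might look like the sketch below. The text does not specify the exact weighting scheme, so the exponential decay with a 14-day half-life is purely an assumption for illustration, as are the dates, party names, and function names.

```python
from datetime import date


def recency_weight(poll_date, election_day, half_life_days=14.0):
    """Weight that halves for every `half_life_days` between poll and election.

    The exponential form and the 14-day default are assumptions, not the
    scheme used in the forecast itself.
    """
    age_days = (election_day - poll_date).days
    return 0.5 ** (age_days / half_life_days)


def weighted_average(polls, election_day, half_life_days=14.0):
    """Average party shares over polls, giving newer polls more weight.

    `polls` is a list of (poll_date, {party: share among decided voters}).
    """
    totals, weight_sums = {}, {}
    for poll_date, shares in polls:
        w = recency_weight(poll_date, election_day, half_life_days)
        for party, share in shares.items():
            totals[party] = totals.get(party, 0.0) + w * share
            weight_sums[party] = weight_sums.get(party, 0.0) + w
    return {party: totals[party] / weight_sums[party] for party in totals}


# Hypothetical polls taken 28 and 14 days before a hypothetical election day.
election_day = date(2020, 10, 31)
polls = [
    (date(2020, 10, 3), {"Party A": 60.0, "Party B": 40.0}),   # weight 0.25
    (date(2020, 10, 17), {"Party A": 54.0, "Party B": 46.0}),  # weight 0.50
]
average = weighted_average(polls, election_day)
print(average)
```

Here the newer poll counts twice as much as the older one, so the average for Party A comes out at about 56% rather than the unweighted 57%.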
It should be noted that poll averages are widely used for election forecasting. For example, the well-respected electoral portal RealClearPolitics publishes forecasts of presidential elections and generic ballots using the averaged results of public opinion polls conducted in the United States on a regular basis.
The second method we apply to election prediction is statistical modeling. We use a so-called state-space statistical model that was used by James Savage for the 2016 US presidential election. The key idea behind state-space modeling is that polls only partially reflect party preferences, while the true value lies in a variable that is unobserved (latent). On the one hand, we try to recover the latent value of the variable from the noisy observations made at a specific point in time. On the other hand, we aggregate the information from those observations over time. State-space models lend themselves naturally to Bayesian estimation. This implies the existence of a prior probability, i.e. an initial condition, which is then updated by subsequent observations.
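To make the idea of a latent state updated by noisy polls concrete, here is a minimal sketch of a Gaussian local-level filter (a simple Kalman filter). It is a deliberately simplified stand-in and not the model referenced above: the random-walk assumption for the latent support, the variances, and the example numbers are all hypothetical.

```python
def filter_local_level(observations, obs_var, state_var, prior_mean, prior_var):
    """Track a latent support level from noisy poll readings.

    Assumed model (illustrative only):
      latent_t = latent_{t-1} + noise with variance `state_var`
      poll_t   = latent_t     + noise with variance `obs_var`
    `observations` may contain None for periods without a poll.
    Returns a list of (posterior mean, posterior variance) per period.
    """
    mean, var = prior_mean, prior_var
    filtered = []
    for y in observations:
        # Predict: the latent value drifts, so uncertainty grows.
        var = var + state_var
        if y is not None:
            # Update: move toward the poll in proportion to its reliability.
            gain = var / (var + obs_var)
            mean = mean + gain * (y - mean)
            var = (1.0 - gain) * var
        filtered.append((mean, var))
    return filtered


# Hypothetical weekly readings for one party (percent); week 2 has no poll.
readings = [50.0, None, 52.0]
filtered = filter_local_level(
    readings, obs_var=4.0, state_var=1.0, prior_mean=50.0, prior_var=4.0
)
for week, (m, v) in enumerate(filtered, start=1):
    print(f"week {week}: mean={m:.2f}, variance={v:.2f}")
```

The prior plays the role of the initial condition, each poll pulls the estimate toward itself without being trusted completely, and the uncertainty widens in weeks without a poll.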
Our forecasts are based on 2,000 simulations, and the results should be interpreted through the distribution of those simulations. The model makes it possible to “hold” 2,000 elections using the survey results. Depending on the distribution of the raw data, the algorithm calculates the potential outcome for each simulation. The next phase of the analysis is to examine the distribution of the simulated outcomes. The expected result is the median value. The kurtosis of the histogram gives an indication of the accuracy of the estimate.
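Summarizing the simulations could be sketched as follows. In the actual forecast the simulated outcomes come from the fitted model; here they are drawn from a normal distribution with a hypothetical mean and spread, only to show how the median and the kurtosis are read off 2,000 draws.

```python
import random
import statistics

N_SIMULATIONS = 2000


def excess_kurtosis(values):
    """Fourth standardized moment minus 3 (0 for a normal distribution)."""
    mu = statistics.fmean(values)
    m2 = statistics.fmean((x - mu) ** 2 for x in values)
    m4 = statistics.fmean((x - mu) ** 4 for x in values)
    return m4 / (m2 ** 2) - 3.0


# Stand-in for model output: hypothetical vote share of one party (percent).
rng = random.Random(0)  # fixed seed so the sketch is reproducible
simulated = [rng.gauss(45.0, 3.0) for _ in range(N_SIMULATIONS)]

median_share = statistics.median(simulated)
kurt = excess_kurtosis(simulated)
print(f"median share: {median_share:.1f}%")
print(f"excess kurtosis: {kurt:.2f}")
```

The median serves as the point forecast, while the shape of the histogram, summarized here by the excess kurtosis, shows how tightly the simulated elections cluster around it.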