Research Article | Open Access

# The Convergence Coefficient across Political Systems

**Academic Editor:** J. Pacheco

#### Abstract

Formal work on the electoral model often suggests that parties or candidates should locate themselves at *the electoral mean*. Recent research has found no evidence of such convergence. In order to explain nonconvergence, the stochastic electoral model is extended by including estimates of electoral valence. We introduce the notion of a convergence coefficient, $c$. It has been shown that high values of $c$ imply that there is a significant centrifugal tendency acting on parties. We used electoral surveys to construct a stochastic valence model of the elections in various countries. We find that the convergence coefficient varies across elections in a country, across countries with similar regimes, and across political regimes. In some countries, the centripetal tendency leads parties to converge to the electoral mean. In others the centrifugal tendency dominates and some parties locate far from the electoral mean. In particular, for countries with proportional electoral systems, namely, Israel, Turkey, and Poland, the centrifugal tendency is very high. In the majoritarian polities of the United States and Great Britain, the centrifugal tendency is very low. In anocracies, the autocrat imposes limitations on how far from the origin the opposition parties can move.

#### 1. Introduction

Work on modeling elections has often assumed that the policy space was restricted to one dimension or that there were at most two parties [1, 2]. The extensive formal literature on electoral competition has typically been based on the assumption that parties or candidates adopt positions in order to win and has inferred that parties will converge to the electoral median, under deterministic voting in one dimension, or to the electoral mean in stochastic models.

In this paper we offer a formal stochastic model of elections that emphasizes the importance of the idea of *valence* and use this notion to provide an explanation of why vote maximizing political leaders in some countries will not adopt convergent policy positions at the mean of the electoral distribution. In the standard spatial model, candidate *positions* matter to voters. However, as Stokes [3, 4] has emphasized, the nonpolicy evaluations, or *valences*, of candidates by the electorate are just as important. (See also Clarke et al. [5, 6], Scotto et al. [7], and Clarke et al. [8].)

The main objective of this paper is to examine whether parties locate close to or far from the electoral mean of various countries (the electoral mean is the mean of voters’ ideal policies, dimension by dimension). We use Schofield’s [9] stochastic electoral model as a unifying framework that allows us to compare parties’ positions across different political systems. In this model, parties respond to their partisan constituencies after taking into account the anticipated electoral outcome and the positions of other parties. Voters’ decisions depend on parties’ locations and on each party’s valence, the voters’ *overall common* evaluation of the ability of a party leader to provide good governance. Using this model we examine whether parties converge to the electoral mean in several elections in various countries under different political systems and use convergence, or the lack thereof, to classify political systems.

To examine whether parties converge to the electoral mean in each country in a particular election, we test whether any party has an incentive to stay at or move away from the electoral mean to increase its vote share. In formal voting theory, it is usual to define a “Nash equilibrium” as a vector of party positions with the property that no party may make a unilateral move so as to increase its vote share. We use a variant of this concept, namely a “local Nash equilibrium” (LNE), where we consider only *marginal* moves from the position. One of the standard results in formal theory is the *mean voter theorem*, where the “Nash equilibrium” of a spatial voting game under vote maximization is one where all parties position themselves at the electoral mean. (For variants of the theorem see Enelow and Hinich [10–12].) We call such a vector the *electoral mean*.

To study each party’s best response to the electoral situation it faces, we use the results presented in Schofield [9]. Schofield identifies a *convergence coefficient*, denoted $c$, whose value determines whether or not vote maximizing parties converge to the electoral mean. This coefficient depends on various parameters of the model. In particular it depends on the *competence valences* of the party leaders. Using $j$ to denote the parties, the valence of party $j$, $\lambda_j$, essentially measures the electoral perception of the “quality” of $j$, the voters’ *overall common* evaluation of the ability of $j$’s leader to provide good governance. The valence terms, $\lambda_j$, are assumed to be independent of the party’s positions and can be estimated as the intercept term in the appropriate stochastic model of the voter utility function. As Sanders et al. [13] comment, valence theory is based on the assumption that “voters maximize their utilities by choosing the party that is best able to deliver policy success.” These valence terms measure the *bias* in favor of one or another of the party leaders [14].

The convergence coefficient, $c$, also depends on the weight that voters give to the policy differences they have with the various parties, $\beta$. Lastly, $c$ depends on the variance/covariance matrix of the electoral distribution, $\nabla$. By its construction, $c$ is dimensionless and thus independent of the units of measurement of the various parameters. We use the convergence coefficient to compare results across elections and countries and to classify political systems.

The convergence coefficient is a summary measure that provides an estimate of the centrifugal or centripetal forces acting on the parties. The *Valence Theorem*, presented in Section 2 (see Schofield [9] for the proof of this result), shows that if the policy space is two-dimensional and if $c < 1$, then the sufficient condition for convergence to the mean has been met and the “local Nash Equilibrium” (LNE) is one where all parties locate at the electoral mean. (The set of such local Nash Equilibria contains the set of Nash Equilibria.) On the other hand, if $w$ is the dimension of the policy space and $c \geq w$, then the LNE, if it exists, will be one where at least one party will have an incentive to diverge from the electoral mean in order to maximize its vote share. Thus, the necessary condition for convergence to the mean is that $c < w$.

In essence, a high empirical convergence coefficient of an election is a convenient measure of the electoral incentive of a small, or low valence, party to move away from the electoral mean to its core constituency position. We can interpret a high value of the convergence coefficient as a measure of the *centrifugal* tendency exerted on parties pulling them away from the electoral mean. The convergence coefficient is therefore a convenient, simple, and intuitive way to examine whether parties will have an incentive to locate close to, or far from, the electoral mean. We will show that there is a strong connection between the values of the convergence coefficient and the nature of the political system under which parties operate.

We used preelection polls to study elections in several countries operating under different political regimes. The factor analysis done on preelection surveys showed that for all elections the policy space was two-dimensional, except in Azerbaijan where it was one-dimensional. The positions of voters in this policy space were then estimated and their voting intentions used to estimate party positions. We then ran a multinomial logit (MNL) model for the election using the estimated party and voter positions. The intercepts of the MNL model give the valences of each party/leader. Following Schofield’s [9] formal model, we rank parties according to their valence. Using these MNL estimates we calculate the convergence coefficient of the election and examine whether the party with the lowest valence has an incentive to locate close to or far from the electoral origin.

When comparing the convergence coefficients across countries we observe that in countries with proportional representation the convergence coefficient is high and that in countries with plurality systems or in anocracies it is low. This suggests that we can use the Valence Theorem and its associated convergence result to classify electoral systems.

The convergence coefficients for the 2005 and 2010 elections in the UK were not significantly different from 1, meeting the necessary condition for convergence to the mean. For the 2000, 2004, and 2008 US presidential elections, the convergence coefficient is significantly below 1 in 2000 and 2004, thus meeting the sufficient (and hence the necessary) condition for convergence, and is not significantly different from 1 in 2008, meeting only the necessary condition. We suggest that the centrifugal tendency in majoritarian polities like the United States and the United Kingdom is very low.

In contrast, the convergence coefficient indicates that the centrifugal tendency in Israel, Poland, and Turkey is very high. In these proportional representation systems with highly fragmented polities, the convergence coefficients are significantly greater than 2, failing to meet the necessary condition for convergence to the electoral mean.

In the anocracies of Georgia, Russia, and Azerbaijan, where the President/autocrat dominates and controls who can run in legislative elections, the convergence coefficient is not significantly different from the dimension of the policy space (2 for Georgia and Russia and 1 for Azerbaijan), failing the necessary condition for convergence. While the analyses of Georgia and Azerbaijan show that not all parties converge to the mean, in Russia it is likely that they did. Thus, in Russia opposition parties found it difficult to diverge from the mean. Note that convergence in anocracies may not generate a stable equilibrium, as any change in the valence of the autocrat and the opposition may cause parties to diverge from the mean and may even lead to popular uprisings that bring about changes in the governing parties, such as in Georgia in previous elections or in the Arab revolutions.

We can also classify polities using the *effective vote number* and the *effective seat number.* (Fragmentation can be identified with the *effective number*. That is, let $H_v$ (the Herfindahl index) be the sum of the squares of the relative vote shares and let $env = H_v^{-1}$ be the *effective number of party vote strength*. In the same way we can define *ens* as the effective number of party seat strength using shares of seats. See Laakso and Taagepera [15].) We examine how these two measures of fragmentation relate to the convergence coefficient for the polities we consider. The effective vote or seat numbers give an indication of the difficulty inherent in interparty negotiation over government. These two measures do not, however, address the fundamental aspect of democracy, namely, the electoral preferences for policy. Since convergence involves both preferences, in terms of the electoral covariance matrix, and the effect of the electoral system, we argue that the Valence Theorem and the associated convergence coefficient allow for a more comprehensive way of classifying polities and political systems, precisely because the coefficient is derived from the fundamental characteristics of the electorate. That is, while we can use the effective vote and seat numbers to identify which polities are fragmented, the Valence Theorem and the convergence result can help us understand why parties locate close to or far from the electoral mean and how, under some circumstances, these considerations lead to political fragmentation.
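The effective number described above is simple to compute. The following is a minimal sketch, assuming the Laakso and Taagepera definition (the inverse of the sum of squared relative shares); the function name and the share values are hypothetical and are not taken from the surveys used in this paper.

```python
def effective_number(shares):
    """Effective number of parties: 1 / (sum of squared relative shares).

    `shares` may be raw vote counts, seat counts, or percentages;
    they are normalized to relative shares before squaring.
    """
    total = sum(shares)
    if total <= 0:
        raise ValueError("shares must sum to a positive number")
    relative = [s / total for s in shares]
    herfindahl = sum(r * r for r in relative)  # the Herfindahl index
    return 1.0 / herfindahl


# Hypothetical examples: two equal parties versus a fragmented four-party system.
two_equal = effective_number([50, 50])
fragmented = effective_number([40, 30, 20, 10])
```

Passing vote shares gives the effective vote number and passing seat shares gives the effective seat number; the same function serves both.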

The next section presents Schofield’s [9] stochastic formal model of elections and the implications it has for convergence to the mean. Section 3.1 applies the model to elections in two plurality polities: the United States and the United Kingdom. In Section 3.2 we apply the model to polities using proportional electoral systems, namely, Israel, Turkey, and Poland. Section 3.3 considers the convergence coefficients for three “anocracies”: Azerbaijan, Georgia, and Russia. Comparisons between different fragmentation measures and the convergence coefficient are examined in Section 4. Section 5 concludes the paper. In the appendix we estimate the confidence intervals for the convergence coefficient and determine whether the low valence party has an incentive to deviate from the electoral mean.

#### 2. The Spatial Voting Model with Valence

Recent research on modelling elections has followed earlier work by Stokes [3, 4] and emphasized the notion of valence of political candidates. As Sanders et al. [13] comment, valence theory extends the spatial or Downsian model of elections by considering not just the policy positions of parties but also the parties’ rival attractions in terms of their perceived ability to handle the most serious problems that face the country. Thus, voters maximize their utilities by choosing the party that they think is best able to deliver policy success.

Schofield and Sened [16] have also argued that valence relates to voters’ judgements about positively or negatively evaluated conditions which they associate with particular parties or candidates. These judgements could refer to party leaders’ competence, integrity, moral stance, or “charisma” over issues such as the ability to deal with the economy and politics.

Valence theory has led to a considerable theoretical literature on voting based on the assumption that valence plays an important role in the relationship between party positioning and the votes that parties receive (Ansolabehere and Snyder [17], Groseclose [18], Aragones and Palfrey [19, 20], Schofield [21], Schofield et al. [22], Miller and Schofield [23], Schofield and Miller [24], Peress [25]). Empirical work, based on multinomial logit (MNL) methods, has also shown the importance of electoral judgements in analyses of elections in the United States and the United Kingdom (Clarke et al. [8, 26–28], Schofield [29], Schofield et al. [30, 31], Scotto et al. [7]). These empirical models of elections have a “probabilistic” component. That is, they all assume that “voter utility” is partly “Downsian,” in the sense that it is based on the distance between party positions and voter preferred positions, and partly due to valence. The estimate of a party’s valence is assumed to be subject to a “stochastic error.” In this paper we use the same methodology.

The pure “Downsian” spatial model of voting tends to predict that parties will converge to the center of the electoral distribution [10–12]. However, when valence is included, the prediction is very different. To see this suppose there are two parties, A and B, and both choose the same position at the electoral center, but A has much higher valence than B. This higher valence indicates that voters have a bias towards party A and as a consequence more voters will choose A over B. The question for B is whether it can gain votes by moving away from the center. It should be obvious that the optimal position of both A and B will depend on the various estimated parameters of the model. To answer this question we now present the details of the spatial model.

##### 2.1. The Theoretical Model

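To find the optimal party positions in response to the anticipated electoral outcome we use a Downsian vote model that has a valence component, as presented in Schofield [9]. Let the set of parties be denoted by $P = \{1, \ldots, j, \ldots, p\}$. (We will use candidate, party, and agent interchangeably throughout the paper.) The positions of the $p$ parties in $X \subseteq \mathbb{R}^w$, where $w$ is the dimension of the policy space, are given by the vector

$$\mathbf{z} = (z_1, \ldots, z_j, \ldots, z_p) \in X^p. \tag{1}$$

Denote voter $i$’s ideal policy by $x_i \in X$ and her utility by $u_i(x_i, \mathbf{z}) = (u_{i1}(x_i, z_1), \ldots, u_{ip}(x_i, z_p))$, where

$$u_{ij}(x_i, z_j) = \lambda_j - \beta \|x_i - z_j\|^2 + \epsilon_j = u^*_{ij}(x_i, z_j) + \epsilon_j. \tag{2}$$

Here $u^*_{ij}(x_i, z_j)$ is the observable component of the utility voter $i$ derives from party $j$. The competence valence of candidate $j$ is $\lambda_j$, and the competence valence vector $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_p)$ is such that $\lambda_p \geq \lambda_{p-1} \geq \cdots \geq \lambda_2 \geq \lambda_1$, so that party 1 has the lowest valence. Note that $\lambda_j$ is the same for all voters and provides an estimate of the “quality” of party $j$ or its ability to govern. The term $\|x_i - z_j\|$ is simply the Euclidean distance between voter $i$’s position $x_i$ and candidate $j$’s position $z_j$. The coefficient $\beta$ is the weight given to this policy difference. The error vector $\epsilon = (\epsilon_1, \ldots, \epsilon_j, \ldots, \epsilon_p)$ has a Type I extreme value distribution, where the variance of $\epsilon_j$ is fixed at $\pi^2/6$. Note that $\beta$ has dimension $1/L^2$, where $L$ is whatever unit of measurement is used in $X$.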

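Since voter behavior is modeled by a probability vector, the probability that voter $i$ chooses party $j$ when parties position themselves at $\mathbf{z}$ is

$$\rho_{ij}(\mathbf{z}) = \Pr\left[u_{ij}(x_i, z_j) > u_{il}(x_i, z_l) \text{ for all } l \neq j\right]. \tag{3}$$

Here $\Pr$ stands for the probability operator generated by the distribution assumption on $\epsilon$. Thus, the probability that $i$ votes for $j$ is given by the probability that $u_{ij}(x_i, z_j) > u_{il}(x_i, z_l)$ for all $l \neq j$ in $P$, that is, that $i$ gets a higher utility from $j$ than from any other party.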

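Train [32] showed that when the error vector $\epsilon$ has a Type I extreme value distribution, the probability $\rho_{ij}(\mathbf{z})$ has a Multinomial Logit (MNL) specification and can be estimated. Thus, for each voter $i$ and party $j$, the probability that voter $i$ chooses party $j$ at the vector $\mathbf{z}$ is given by

$$\rho_{ij}(\mathbf{z}) = \frac{\exp[u^*_{ij}(x_i, z_j)]}{\sum_{k=1}^{p} \exp[u^*_{ik}(x_i, z_k)]}. \tag{4}$$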

Voters’ decisions are stochastic in this framework. (See, for example, the models of McKelvey and Patty [14]. Note that there is a problem with the independence of irrelevant alternatives assumption (IIA) which can be avoided using a probit model [33]. However, Quinn et al. [34] have shown that probit and logit models tend to give very similar results. Indeed the results given here for the logit model carry through for probit, though they are less elegant.) Even though parties cannot perfectly anticipate how voters will vote, they can estimate the *expected* vote share of party $j$ as the average of these probabilities over the set $N$ of $n$ voters as follows:

$$V_j(\mathbf{z}) = \frac{1}{n} \sum_{i \in N} \rho_{ij}(\mathbf{z}). \tag{5}$$

We assume a party’s objective is to find the position that maximizes its expected vote share, as desired by “Downsian” opportunists. On the other hand, the party may desire to adopt a position that is preferred by the base of the party supporters, namely, the “guardians” of the party, as suggested by Roemer [35].
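The choice probabilities in (4) and the expected vote shares in (5) can be sketched in a few lines. This is a minimal illustration under the model's assumptions (observable utility equal to valence minus the weight on policy differences times squared Euclidean distance); the function names, voter positions, valences, and weight below are hypothetical and are not estimates from any survey.

```python
import math


def choice_probabilities(x_i, positions, valences, beta):
    """MNL probabilities that a voter with ideal point x_i chooses each party."""
    # Observable utility: valence minus beta times squared Euclidean distance.
    utils = [
        lam - beta * sum((a - b) ** 2 for a, b in zip(x_i, z_j))
        for lam, z_j in zip(valences, positions)
    ]
    top = max(utils)  # subtract the maximum for numerical stability
    weights = [math.exp(u - top) for u in utils]
    total = sum(weights)
    return [wgt / total for wgt in weights]


def expected_vote_shares(voters, positions, valences, beta):
    """Average the voters' choice probabilities party by party."""
    n = len(voters)
    probs = [choice_probabilities(x, positions, valences, beta) for x in voters]
    return [sum(p[j] for p in probs) / n for j in range(len(positions))]


# Hypothetical two-party example with both parties at the origin.
voters = [(-1.0, 0.0), (0.0, 1.0), (1.0, -1.0), (0.0, 0.0)]
positions = [(0.0, 0.0), (0.0, 0.0)]
valences = [0.0, 1.0]  # party 0 has the lower valence
beta = 0.5
shares = expected_vote_shares(voters, positions, valences, beta)
```

Because both hypothetical parties sit at the same position, the distance terms cancel and the shares depend only on the valence difference.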

We assume that parties can estimate how their vote shares would change if they *marginally* move their policy position. The Local Nash Equilibrium (LNE) is that vector $\mathbf{z}$ of party positions such that no party may shift position by a small amount to increase its vote share. More formally, an LNE is a vector $\mathbf{z}^* = (z_1^*, \ldots, z_j^*, \ldots, z_p^*)$ such that each vote share $V_j(\mathbf{z}^*)$ is weakly locally maximized at the position $z_j^*$. To avoid problems with zero eigenvalues we also define a strict LNE (SLNE) to be a vector that *strictly* locally maximizes $V_j(\mathbf{z}^*)$.

Using the estimated MNL coefficients we simulate these models and then relate any vector of party positions, $\mathbf{z}$, to a vector of vote share functions $V(\mathbf{z}) = (V_1(\mathbf{z}), \ldots, V_p(\mathbf{z}))$, predicted by the particular model with $p$ parties. Moreover, we can examine whether in equilibrium parties position themselves at the electoral mean. (The electoral mean or origin is the mean of all voters’ positions, normalized to zero, so that $\frac{1}{n}\sum_{i \in N} x_i = 0$.) We call the vector in which all parties locate at this origin *the electoral mean*.

Given the vector of policy positions $\mathbf{z}$, and since the probability that voter $i$ votes for party $j$ is given by (4), the impact of a *marginal* change in $j$’s position on the probability that $i$ votes for $j$ is then

$$\left.\frac{d\rho_{ij}(\mathbf{z})}{dz_j}\right|_{\mathbf{z}_{-j}} = 2\beta (x_i - z_j)\,\rho_{ij}(1 - \rho_{ij}), \tag{6}$$

where $\mathbf{z}_{-j}$ indicates that we are holding the positions of all parties but $j$ fixed. The effect that $j$’s change in position has on the probability that $i$ votes for $j$ depends on the weight given to the policy differences with parties, $\beta$; on how likely $i$ is to vote for $j$, $\rho_{ij}$, and for any other party, $(1 - \rho_{ij})$; and on how far apart $i$’s ideal policy is from $j$’s, $(x_i - z_j)$.
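The marginal effect in (6) follows from the chain rule. As a short worked sketch, assuming the MNL form in (4) and the quadratic observable utility $u^*_{ij} = \lambda_j - \beta\|x_i - z_j\|^2$ from (2), and noting that only $u^*_{ij}$ depends on $z_j$ when the other parties' positions are held fixed:

$$
\begin{aligned}
\frac{\partial \rho_{ij}}{\partial u^*_{ij}} &= \rho_{ij}(1 - \rho_{ij}), \\
\frac{\partial u^*_{ij}}{\partial z_j} &= -\beta \frac{\partial}{\partial z_j}\|x_i - z_j\|^2 = 2\beta(x_i - z_j), \\
\left.\frac{d\rho_{ij}}{dz_j}\right|_{\mathbf{z}_{-j}} &= \frac{\partial \rho_{ij}}{\partial u^*_{ij}} \cdot \frac{\partial u^*_{ij}}{\partial z_j} = 2\beta(x_i - z_j)\,\rho_{ij}(1 - \rho_{ij}).
\end{aligned}
$$

The first line is the standard derivative of a logit probability with respect to its own utility index.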

From (5), party $j$ adjusts its position to maximize its expected vote share; that is, $j$’s first order condition is

$$\left.\frac{dV_j(\mathbf{z})}{dz_j}\right|_{\mathbf{z}_{-j}} = \frac{1}{n}\sum_{i \in N} \frac{d\rho_{ij}}{dz_j} = \frac{2\beta}{n}\sum_{i \in N} (x_i - z_j)\,\rho_{ij}(1 - \rho_{ij}) = 0, \tag{7}$$

where the third term follows after substituting in (6). The FOC for party $j$ in (7) is satisfied when

$$\sum_{i \in N} (x_i - z_j)\,\rho_{ij}(1 - \rho_{ij}) = 0, \tag{8}$$

so that the *candidate* for party $j$’s vote maximizing policy (see Schofield [36] for the proof) is

$$z_j^C = \sum_{i \in N} \alpha_{ij} x_i, \qquad \alpha_{ij} \equiv \frac{\rho_{ij}(1 - \rho_{ij})}{\sum_{k \in N} \rho_{kj}(1 - \rho_{kj})}, \tag{9}$$

where $\alpha_{ij}$ represents the weight that party $j$ gives to voter $i$ when choosing its candidate vote maximizing policy. This weight depends on how likely $i$ is to vote for $j$, $\rho_{ij}$, and for any other party, $(1 - \rho_{ij})$, relative to all voters. (For example, if all voters are equally likely to vote for $j$, say with probability $v$, then the weight party $j$ gives to voter $i$ in its vote maximizing policy is $\alpha_{ij} = 1/n$; that is, the weight $j$ gives each voter is just the inverse of the population size.) Note that $\alpha_{ij}$ may be nonmonotonic in $\rho_{ij}$. To see this, exclude voter $i$ from the denominator of $\alpha_{ij}$. In the most common case, a voter $i$ who will vote for $j$ for sure receives a lower weight in $j$’s candidate position than an “undecided” voter, who votes for $j$ only with some intermediate probability. Party $j$ then caters to “undecided” voters by giving them a higher weight in its candidate position. In the opposite case, $\alpha_{ij}$ increases in $\rho_{ij}$: if $j$ expects a large enough vote share (excluding voter $i$), it gives a core supporter (a voter who votes for $j$ for sure) a higher weight in its policy position than it gives other voters, as there is no risk of doing so. The weights $\alpha_{ij}$ are endogenously determined in the model.
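To make the weights in (9) concrete, the sketch below computes them for one party from a list of choice probabilities and then forms the weighted mean of voter ideal points. It assumes the weight of a voter is her probability of voting for the party times the probability of voting for any other party, normalized over all voters; the function names and numbers are hypothetical.

```python
def policy_weights(rho_j):
    """Weight of each voter in party j's candidate position.

    rho_j[i] is the probability that voter i votes for party j.
    """
    raw = [r * (1.0 - r) for r in rho_j]
    total = sum(raw)
    if total == 0:
        raise ValueError("weights are undefined when every probability is 0 or 1")
    return [v / total for v in raw]


def candidate_position(voters, rho_j):
    """Weighted mean of the voters' ideal points using the policy weights."""
    alphas = policy_weights(rho_j)
    dims = len(voters[0])
    return tuple(
        sum(a * x[d] for a, x in zip(alphas, voters)) for d in range(dims)
    )


# Hypothetical voters whose ideal points average to the origin.
voters = [(-1.0, 0.0), (0.0, 1.0), (1.0, -1.0)]

# Equal probabilities give every voter the weight 1/n.
equal = policy_weights([0.3, 0.3, 0.3])
z_c = candidate_position(voters, [0.3, 0.3, 0.3])

# A near-certain supporter gets less weight than two undecided voters.
mixed = policy_weights([0.99, 0.5, 0.5])
```

With equal probabilities the candidate position coincides with the mean of the ideal points, which is the situation analyzed next for parties located at a common position.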

Note that since voter $i$’s utility depends on how far $i$ is from party $j$, the probability that $i$ votes for $j$ given in (4) and the expected vote share of the party given in (5) are influenced by the voters’ and parties’ positions in the policy space. That is, in the empirical models estimated below, the positions of voters and parties in the policy space, together with the valence estimates, influence voters’ electoral choices.

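Recall that we are interested in finding whether parties converge to or diverge from the electoral mean. Suppose that *all* parties locate at the same position, $z_j = z$ for all $j \in P$. Thus, from (2) we see that

$$u^*_{ik}(x_i, z) - u^*_{ij}(x_i, z) = \lambda_k - \lambda_j, \tag{10}$$

so the probability that $i$ votes for $j$ in (4) is given by

$$\rho_j = \left[1 + \sum_{k \neq j} \exp(\lambda_k - \lambda_j)\right]^{-1}. \tag{11}$$

Clearly, in this case, $\rho_j$ is independent of voter $i$’s ideal point. Thus, from (9), the weight given by $j$ to each voter is also independent of voter $i$’s position and given by

$$\alpha_j = \frac{\rho_j(1 - \rho_j)}{\sum_{k \in N} \rho_j(1 - \rho_j)} = \frac{1}{n}, \tag{12}$$

so that $j$ gives each voter equal weight in its policy position. In this case, from (9), $j$’s candidate position is

$$z_j^C = \frac{1}{n}\sum_{i \in N} x_i, \tag{13}$$

that is, $j$’s candidate position is to locate at the electoral mean, which we have placed at the electoral origin. Let $\mathbf{z}_0 = (0, \ldots, 0)$ be the vector of party positions when all parties are at the electoral mean.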

Moreover, as (11) indicates, when parties locate at the mean $\mathbf{z}_0$, only valence differences between parties matter in voters’ choices. The probability that a generic voter votes for party 1 (the party with the lowest valence) is

$$\rho_1 = \left[1 + \sum_{k=2}^{p} \exp(\lambda_k - \lambda_1)\right]^{-1}. \tag{14}$$

Using this spatial model, Schofield [9] proved a *Valence Theorem* determining whether vote maximizing parties locate at the mean. The theorem showed that the spatial model is characterized by a *convergence coefficient* given by

$$c \equiv c(\lambda, \beta, \sigma^2) = 2\beta\,[1 - 2\rho_1]\,\sigma^2. \tag{15}$$

The convergence coefficient depends on $\beta$, the weight given to policy differences; on $\rho_1$, the probability that a generic voter votes for the lowest valence party at the vector $\mathbf{z}_0$; and on $\sigma^2$, the *electoral variance* given by

$$\sigma^2 \equiv \operatorname{trace}(\nabla), \tag{16}$$

where $\nabla$ is the symmetric $w \times w$ *electoral covariance matrix*. ($\nabla$ is simply a description of the distribution of voter preferred points taken about the electoral mean.)

The convergence coefficient increases in $\beta$ and $\sigma^2$ (and in their product $\beta\sigma^2$) and decreases in $\rho_1$. As (14) indicates, $\rho_1$ decreases if the valence differences between party 1 and the other parties increase, that is, when the difference between $\lambda_1$ and $\lambda_k$, $k \neq 1$, increases.
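The calculation of the convergence coefficient can be sketched directly from its ingredients: the valences, the policy weight, and the electoral covariance matrix. The code below is a minimal illustration with hypothetical inputs, assuming the coefficient equals twice the policy weight, times one minus twice the lowest valence party's vote probability at the mean, times the trace of the covariance matrix.

```python
import math


def rho_lowest(valences):
    """Probability that a generic voter picks the lowest valence party
    when all parties locate at the electoral mean."""
    lam1 = min(valences)
    # The sum runs over all parties; the lowest valence party contributes exp(0) = 1.
    return 1.0 / sum(math.exp(lam - lam1) for lam in valences)


def convergence_coefficient(valences, beta, covariance):
    """c = 2 * beta * (1 - 2 * rho_1) * trace(covariance)."""
    sigma2 = sum(covariance[d][d] for d in range(len(covariance)))
    return 2.0 * beta * (1.0 - 2.0 * rho_lowest(valences)) * sigma2


# Hypothetical three-party election in a two-dimensional policy space.
valences = [0.0, 0.5, 1.0]
beta = 0.8
cov = [[1.0, -0.2], [-0.2, 0.5]]
c = convergence_coefficient(valences, beta, cov)

# With two parties of equal valence, rho_1 = 1/2 and the coefficient is zero.
c_equal = convergence_coefficient([0.0, 0.0], 1.0, cov)
```

Raising the valence gap in the hypothetical inputs lowers the lowest valence party's probability and so raises the coefficient, matching the comparative statics described above.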

The Valence Theorem allows us to characterize polities according to the value of their convergence coefficient. The theorem states that when the *sufficient* condition for convergence to the electoral mean is met, that is, when $c < 1$, the LNE is one where all parties adopt the same position at the mean of the electoral distribution. A *necessary* condition for convergence to the electoral mean is that $c < w$, where $w$ is the dimension of the policy space. If $c \geq w$, then there may exist a nonconvergent LNE. Note that in this case, there may indeed be no LNE. However, there will exist a mixed strategy Nash equilibrium (MNE). In either of these two cases we expect at least one party will *diverge* from the electoral mean.
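The two thresholds in the theorem can be read as a simple three-way rule. The sketch below applies that rule to a point estimate of the coefficient; the function name and labels are hypothetical, and the rule ignores sampling uncertainty, which the paper handles separately through confidence intervals.

```python
def classify_convergence(c, w):
    """Compare a convergence coefficient c with 1 and with the dimension w."""
    if w < 1:
        raise ValueError("the policy space must have at least one dimension")
    if c < 1:
        # Sufficient (and hence necessary) condition for convergence holds.
        return "sufficient condition met"
    if c < w:
        # Necessary condition holds, but convergence is not guaranteed.
        return "necessary condition only"
    # Necessary condition fails: at least one party is expected to diverge.
    return "necessary condition fails"


# Hypothetical coefficients in a two-dimensional policy space.
labels = [classify_convergence(c, 2) for c in (0.4, 1.5, 3.2)]
```

In a one-dimensional policy space the middle category is empty, because the sufficient and necessary thresholds coincide.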

Note that $c$ is dimensionless, because $\beta\sigma^2$ has no dimension. In a sense $\beta\sigma^2$ is a measure of the polarization of the preferences of the electorate. Moreover, $\rho_1$ in (14) is a function of the distribution of beliefs about the competence of party leaders, which is a function of the differences $(\lambda_k - \lambda_1)$.

When some parties have a low valence, the probability that a generic voter votes for party 1 (the party with the lowest valence) when all parties locate at the origin, $\rho_1$ in (14), will tend to be small because the valence differences between party 1 and the other parties are *sufficiently large*. Thus, vote maximizing parties will *not all* converge to the electoral mean. In this case $c$ will be close to $2\beta\sigma^2$. If $\beta\sigma^2$ is large because, for example, the electoral variance is large, then $c$ will be large, suggesting $c > w$. In this case, the low valence party has an incentive to move away from the origin to increase its vote share. This implies the existence of a *centrifugal* force pulling some parties away from the origin.

Thus, for $\beta\sigma^2$ sufficiently large so that $c \geq w$, we expect parties to diverge from the electoral center. Indeed, we expect those parties that exhibit the lowest valence to move further away from the electoral center, implying that the centrifugal force on parties will be significant. Thus, in fragmented polities with a polarized electorate, the nature of the equilibrium tends to maintain this *centrifugal* characteristic.

On the contrary, in a polity where there are no very small or low valence parties, $\rho_1$ will tend to $1/p$ and so $c$ will be small. In a polity with small $\beta\sigma^2$ and with low valence differences, so that $c < 1$, we expect all parties to converge to the center. In this case, we expect this *centripetal* tendency to be maintained.

The convergence coefficient is a way of characterizing the Hessian (the $w$ by $w$ matrix of *second* derivatives of the vote share function) of party 1 with the lowest valence. The Hessian of the vote share function of party 1 is given by the characteristic matrix

$$C_1 = 2\beta(1 - 2\rho_1)\nabla - I. \tag{17}$$

Here $I$ is a $w$ by $w$ identity matrix and the other terms are as before. The eigenvalues of $C_1$ determine whether the vote share function of party 1 will be at a maximum, minimum, or at a saddlepoint at the electoral mean. If $C_1$ shows that party 1 is at a minimum or at a saddlepoint at the mean, then party 1 has an incentive to locate away from the mean to increase its vote share. When all parties are at the mean and $c < 1$, then all eigenvalues of the Hessian of the vote share function of the lowest valence party are negative, indicating that the vote share function is at a maximum. The LNE must then be at the electoral mean.

For an arbitrary dimension, $w$, if $c < w$ in (15), then $\operatorname{trace}(C_1) = c - w < 0$. In the two-dimensional case, if $c < 1$, then $\det(C_1)$ must be positive, implying that both eigenvalues of $C_1$ are negative. It then follows that all $C_j$, $j \in P$, have negative eigenvalues, giving a SLNE and thus an LNE at the electoral mean. (This result follows from the application of the triangle inequality to the determinant. A parallel result can be obtained in more than two dimensions.)
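The eigenvalue check on the characteristic matrix can be sketched for the two-dimensional case without any linear algebra library, since a symmetric 2 by 2 matrix has closed-form eigenvalues. The inputs below are hypothetical, and the function names are illustrative only; the matrix is formed as in (17).

```python
import math


def characteristic_matrix(beta, rho1, cov):
    """C_1 = 2*beta*(1 - 2*rho1)*cov - I for a square covariance matrix."""
    a = 2.0 * beta * (1.0 - 2.0 * rho1)
    w = len(cov)
    return [
        [a * cov[r][s] - (1.0 if r == s else 0.0) for s in range(w)]
        for r in range(w)
    ]


def eigenvalues_sym2(m):
    """Eigenvalues of a symmetric 2x2 matrix, smallest first."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = math.sqrt(max(tr * tr / 4.0 - det, 0.0))
    return (tr / 2.0 - disc, tr / 2.0 + disc)


def stationary_type(eigs):
    """Classify the electoral mean for the lowest valence party."""
    if all(e < 0 for e in eigs):
        return "maximum"
    if all(e > 0 for e in eigs):
        return "minimum"
    if any(e == 0 for e in eigs):
        return "degenerate"
    return "saddlepoint"


# Hypothetical low-coefficient case (c = 0.144): the mean is a local maximum.
low = eigenvalues_sym2(characteristic_matrix(0.3, 0.4, [[0.6, -0.2], [-0.2, 0.6]]))
# Hypothetical high-coefficient case (c = 2.88): the mean is a saddlepoint.
high = eigenvalues_sym2(characteristic_matrix(1.5, 0.1, [[1.0, 0.0], [0.0, 0.2]]))
```

In both hypothetical cases the eigenvalues sum to the coefficient minus the dimension, which is the trace identity used in the argument above.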

The Valence Theorem asserts that if $c > w$ then the party with the lowest valence has an incentive to move away from the electoral mean to increase its vote share. When this is the case, other low valence parties may also find it advantageous to vacate the center. The value of the convergence coefficient, together with the analysis of the Hessians of the low valence parties, allows us to identify which parties have an incentive to move away from the electoral mean. The convergence coefficient then gives an easy and intuitive way to identify whether a low valence party should vacate the electoral mean.

In the next section, we estimate the convergence coefficient of various elections in different countries.

#### 3. MNL Models of the Elections of Various Countries

We use the framework of the spatial model presented in Section 2 as a unifying methodology that allows us to study convergence across elections, countries, and political regimes. The Valence Theorem leads to the convergence coefficient of the election, a summary statistic that determines whether parties converge to or diverge from the electoral mean. Using this formal multinomial logit (MNL) spatial model, we now estimate the convergence coefficient for the elections in various countries. For each MNL estimation we choose a baseline party and normalize its coefficients to zero, then estimate the coefficients of all other parties relative to those of the base party. Using these coefficients we estimate the convergence coefficient and the characteristic matrix of the low valence parties to determine whether these parties converge to or diverge from the electoral mean in each election for each country. (These elections were studied in depth elsewhere. In this paper, we present only the calculations leading to the convergence coefficient and estimate the confidence intervals for the convergence coefficients that were not provided in earlier work.)

We study convergence under three political regimes (plurality, proportional representation, and anocracy) and group countries according to the similarities of their political regimes. Under plurality rule, we examine elections in two Anglo-Saxon countries: the US and the UK; under proportional representation we study Israel, Turkey, and Poland; and under anocracy, Georgia, Russia, and Azerbaijan. Since we use the same unifying methodology for all countries we present the methodology for the first elections in detail then condense the analysis to its basic components for the remaining countries. For each country we give a general description of the analysis and direct the reader to the full analysis of each election in the detailed country paper. We summarize the results across countries in various tables.

##### 3.1. Convergence in Plurality Systems

We begin our analysis by examining the United States and the United Kingdom. Elections in these countries are carried out under plurality rule. We show that the electoral system in these countries produces relatively low convergence coefficients. (Relative to the convergence coefficient of other countries included in this study. In Section 4 we discuss how the values of the convergence coefficient are related to the political systems under which the countries operate.)

###### 3.1.1. The 2000, 2004, and 2008 Elections in the United States

We construct stochastic models of the 2000, 2004, and 2008 US presidential elections using survey data taken from the American National Election Studies (ANES). The factor analysis done on ten survey questions taken from the ANES led us to conclude that voters’ preferences can be represented along the economic ($x$-axis) and social ($y$-axis) dimensions for all three elections. (See Schofield et al. [30, 31] for the list of survey questions, the factor loadings, and the full analysis of the US elections.) Voters located on the left of the economic axis are pro-redistribution. The social axis is determined by attitudes to abortion and gays. We interpreted greater values along this axis to mean more support for certain civil rights. Using the factor loadings we estimated each voter’s position in these two dimensions. Figures 1, 2, and 3 give a smoothing of the estimated voter distribution of the 2000, 2004, and 2008 elections, respectively.

Voters’ ideal points in the *2000 US election* are characterized by the following *electoral covariance matrix*:
The trace of electoral covariance matrix is . Given the negative covariance between these two dimensions, , the correlation between these two factors is .
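The two summary quantities mentioned here, the trace and the correlation between the two factors, follow mechanically from any two-dimensional covariance matrix. The sketch below uses made-up entries purely for illustration; they are not the estimates for the 2000 election.

```python
import math


def trace_and_correlation(cov):
    """Return (trace, correlation) for a 2x2 covariance matrix."""
    var_x, var_y = cov[0][0], cov[1][1]
    sigma2 = var_x + var_y  # electoral variance: sum of the two variances
    corr = cov[0][1] / math.sqrt(var_x * var_y)  # covariance over product of std devs
    return sigma2, corr


# Hypothetical covariance matrix with a negative covariance between dimensions.
sigma2, corr = trace_and_correlation([[1.0, -0.3], [-0.3, 0.9]])
```

A negative off-diagonal entry always yields a negative correlation, since the denominator is positive.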

Using the spatial model presented in Section 2, we estimated the MNL model of the 2000 election. The coefficients for the US 2000 shown in Table 1 are
Bush’s competence valence, , measures the common perception that voters in the sample have on Bush’s ability to govern and represents the nonpolicy component in the voter’s utility function in (2). As seen in Table 1, for the 2000 election Bush has a statistically significant *lower* valence than Gore, the democratic (baseline) candidate. Bush’s negative valence is an indication that voters regarded him as less able to govern than Gore, once policy differences are taken into account.

[Table 1: MNL estimates of the spatial model for the US and UK elections.]

Notes: significance levels are prob < 0.05, prob < 0.01, and prob < 0.001. US: Rep: Republican; Dem: Democrats. UK: Lab: Labour; Con: Conservatives; Lib: Liberal Democrats.

To find the convergence coefficient for this election, we assume that *all* parties locate at the electoral mean so that parties differ only in their valence terms (see Section 2). We can use (14) and the coefficients in (19) to estimate the probability that a typical US voter chooses to vote for the low valence Republican candidate, when both Bush and Gore locate at the origin, ; that is,
We found the estimate for using the MNL valence estimates. Note that since the central estimates of given by the MNL regressions depend on the sample of voters surveyed then so does . Thus, to make inferences from empirical models we need the 95% confidence bounds of . In the introduction of the appendix we derive the methodology used to find the confidence bounds of . The bounds of are calculated in Appendix A.1.
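To make the role of the valence estimates concrete, here is a minimal sketch of the probability calculation. It assumes that (14) has the usual multinomial logit form, so that when every party sits at the same position the spatial terms cancel and the choice probabilities depend on valences alone. The valence values and party labels below are hypothetical, not the Table 1 estimates.

```python
import math


def vote_probabilities_at_mean(valences):
    """MNL choice probabilities when all parties share one position.

    Assumes the standard logit form: probability proportional to exp(valence).
    """
    top = max(valences.values())                  # subtract the max for numerical stability
    weights = {party: math.exp(v - top) for party, v in valences.items()}
    total = sum(weights.values())
    return {party: w / total for party, w in weights.items()}


# Hypothetical valences; the baseline candidate is normalized to zero.
valences = {"Baseline": 0.0, "Challenger": -0.4}
probs = vote_probabilities_at_mean(valences)
lowest = min(valences, key=valences.get)          # lowest valence candidate
rho_low = probs[lowest]
print(f"{lowest}: {rho_low:.3f}")
```

The same function covers contests with more than two parties, since only the dictionary of valences changes.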

The results indicate that in the 2000 election, both candidates found it in their best interest to locate at the electoral mean. To see this, we compute the convergence coefficient using (15) and the electoral covariance matrix in (18) to determine whether the two parties converge to, or diverge from, the electoral mean.

Using (19) and (20) we have that and from (18) the trace is so that, using (15), the convergence coefficient for the 2000 US election is Appendix A.1 shows that is significantly less than 1, implying that meets the sufficient, and thus the necessary, condition for convergence to the electoral mean given in Section 2.
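The calculation in this step can be sketched as follows. We assume here that (15) takes the form c = 2β(1 − 2ρ)σ², where β is the spatial coefficient, ρ is the probability of voting for the lowest valence party at the mean, and σ² is the trace of the electoral covariance matrix; this form is an assumption about Section 2, which is not reproduced in this section. The thresholds 1 and w (the dimension of the policy space) are those used in the text. All numbers are hypothetical.

```python
def convergence_coefficient(beta, rho_low, sigma2):
    """Assumed form of (15): c = 2 * beta * (1 - 2 * rho) * sigma^2."""
    return 2.0 * beta * (1.0 - 2.0 * rho_low) * sigma2


def convergence_status(c, dimension=2):
    """Compare c with the thresholds 1 (sufficient) and `dimension` (necessary)."""
    if c < 1.0:
        return "sufficient condition met"
    if c <= dimension:
        return "necessary condition met, sufficient condition not met"
    return "necessary condition fails"


# Hypothetical inputs, not the estimates reported in Table 2.
c = convergence_coefficient(beta=0.8, rho_low=0.4, sigma2=1.1)
status = convergence_status(c)
print(f"c = {c:.3f}: {status}")
```

Point estimates alone do not settle the matter; the text relies on the confidence bounds in Appendix A.1 to decide which side of each threshold the coefficient falls on.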

To check whether Bush, the low valence candidate, has an incentive to stay at the electoral origin, , that is, whether Bush’s vote share function is at a maximum at , we use the Hessian or characteristic matrix (of *second* order conditions) of Bush’s vote share function using (17) at as follows:
Because the characteristic matrix for Bush is estimated using the MNL coefficients of the 2000 US sample, depends on the sample of voters surveyed. The confidence bounds on in Appendix A.1 suggest that if Bush positions himself at the electoral origin, then with probability exceeding 95%, his vote share function would be at a maximum. We infer that, with probability exceeding 95%, the origin is an LNE for the spatial model for the 2000 US election. The valence differences between Bush and Gore are not large enough to cause either of them to move from the origin. The unique local Nash equilibrium was one where both candidates converge to the electoral origin in order to maximize their vote shares.
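The second order check can be illustrated with a short sketch. It assumes that (17) has the form C = 2β(1 − 2ρ)∇₀ − I, where ∇₀ is the electoral covariance matrix and I the identity; this is our assumption about Section 2, and the inputs are hypothetical. The eigenvalues of a symmetric 2×2 matrix are computed in closed form, and the origin is a local maximum of the vote share function when both are negative.

```python
import math


def characteristic_matrix(beta, rho_low, cov):
    """Assumed form of (17): C = 2 * beta * (1 - 2 * rho) * cov - I, for 2x2 cov."""
    scale = 2.0 * beta * (1.0 - 2.0 * rho_low)
    return [[scale * cov[0][0] - 1.0, scale * cov[0][1]],
            [scale * cov[1][0], scale * cov[1][1] - 1.0]]


def eigenvalues_sym2(m):
    """Eigenvalues (smaller, larger) of a symmetric 2x2 matrix."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    mid = (a + d) / 2.0
    rad = math.hypot((a - d) / 2.0, b)
    return mid - rad, mid + rad


def is_local_maximum(m):
    """True when both eigenvalues are negative (negative definite Hessian)."""
    return eigenvalues_sym2(m)[1] < 0.0


# Hypothetical inputs, used only to illustrate the check.
cov = [[0.60, -0.12], [-0.12, 0.50]]
C = characteristic_matrix(beta=0.8, rho_low=0.4, cov=cov)
eig_small, eig_large = eigenvalues_sym2(C)
print(f"eigenvalues: {eig_small:.4f}, {eig_large:.4f}")
print("local maximum at the origin:", is_local_maximum(C))
```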

All the components needed to derive the convergence coefficient for the 2000 US election and its confidence bounds are summarized in Table 2.

[Table 2: Components of the convergence coefficient and their confidence bounds for the US and UK elections.]

Notes: Conf. Int.: confidence intervals. US: Dem: Democrats; Rep: Republican. UK: LibDem: Liberal Democrats.

Bush faced Kerry as the Democratic candidate in the *2004 US election*. The distribution of voters in 2004 gives the following electoral covariance matrix along the economic and social dimensions:
While the covariance between economic and social axes differs, the trace is similar to that in the 2000 US election.

From Table 1, the MNL estimates of the spatial model for the 2004 US election are Bush has a significantly lower valence () than Kerry (), the baseline candidate.

From (14) the probability that a US voter chooses Bush, the low valence candidate, when both Bush and Kerry are at the electoral origin, , is The confidence bounds for are given in Appendix A.1. Since Bush’s valence, relative to that of his opponent, was similar in the two elections, it is not surprising that the probability of voting Republican is similar in the two elections (compare (20) and (25)). From (15), and , so that the convergence coefficient of the 2004 election is Since is significantly less than 1 (see Appendix A.1), the sufficient condition for convergence given in Section 2 is met. Moreover, from (17) Bush’s characteristic matrix is If Bush positions himself at the electoral origin, then with probability exceeding 95% (see Appendix A.1), his vote share function would be at a maximum. Bush, the low valence candidate, therefore has no incentive to move from the origin, . With probability exceeding 95%, the mean is an LNE for the model of the 2004 US election.

Our analysis suggests that Obama’s victory over McCain in the *2008 US election* was the result of an overall shift in the relative valences of the Democratic and Republican candidates as compared to those of the candidates in the 2000 and 2004 elections. The electoral covariance matrix for the sample in 2008 along the economic and social dimensions is
Relative to the two previous elections the “variance” of the electoral distribution increased, while the covariance between these dimensions decreased.

The MNL estimates of the spatial model given in Table 1 for the 2008 US election are Obama, the baseline candidate, has a significantly higher valence than McCain.

From (14) the probability that a voter chooses McCain, when both candidates are at the origin, , is From (15), , and , so the convergence coefficient is Appendix A.1 shows that is significantly greater than 1 and significantly less than 2. The Valence Theorem then states that the necessary but not the sufficient condition for convergence has been met. To check whether the low valence candidate, McCain, has an incentive to move from the electoral mean, we examine McCain’s characteristic matrix using (17) to get With probability exceeding 95% (see Appendix A.1) McCain’s vote share function is at a maximum when he locates at the origin, and thus has no incentive to move. Thus, with probability exceeding 95%, the electoral origin is an LNE for the spatial model for the 2008 US election.
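The three cases that arise when a confidence interval for the convergence coefficient is compared with the thresholds 1 and 2 can be sketched as a small decision rule. The interval bounds below are hypothetical and serve only to exercise each branch; the method used to obtain the actual bounds is the one derived in the appendix, which this sketch does not attempt to reproduce.

```python
def classify_interval(c_low, c_high, dimension=2):
    """Compare a confidence interval for c with the thresholds 1 and `dimension`."""
    if c_low > c_high:
        raise ValueError("c_low must not exceed c_high")
    if c_high < 1.0:
        return "sufficient condition met"
    if c_low > dimension:
        return "necessary condition fails"
    if c_low > 1.0 and c_high < dimension:
        # The Hessian of the lowest valence party must then be inspected.
        return "necessary condition met, sufficient condition not met"
    return "inconclusive"


# Hypothetical interval bounds.
examples = {
    "low": (0.2, 0.6),
    "middle": (1.1, 1.7),
    "high": (3.0, 5.2),
    "straddling": (0.8, 1.3),
}
verdicts = {name: classify_interval(lo, hi) for name, (lo, hi) in examples.items()}
for name, verdict in verdicts.items():
    print(f"{name}: {verdict}")
```

In the "middle" case the coefficient alone is not decisive, which is why the text turns to the characteristic matrix of the low valence candidate.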

In conclusion, Table 2 illustrates that the convergence coefficient varies across elections in the same country even when there are only two parties. This is to be expected as from (15) the convergence coefficient depends on the “variance” of the electoral distribution, ; on the weight voters give to differences from the parties’ policies, ; and on the probability that a voter chooses the party with the lowest valence, . The electoral distributions of the 2000 and 2004 elections are quite similar, as can be seen by comparing (18) and (23). Voters’ preferences had, however, changed substantially by 2008 (see (28)). The electoral variance along both axes increased relative to 2000 and 2004. While the 2000 and 2004 convergence coefficients are indistinguishable from each other, the 2008 coefficient is significantly different from that in 2000 and 2004. In spite of these differences, candidates in all three elections had no incentive to move from the origin.
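The direction of each of these three effects can be read off a short derivation. The block below assumes that (15) has the form shown on its first line, with β the spatial coefficient, ρ₁ the probability of voting for the lowest valence party at the mean, and σ² the trace of the electoral covariance matrix; these symbol names are our assumption, since Section 2 is not reproduced here.

```latex
% Assumed form of (15); the symbols are our notation for the three components.
\begin{align*}
c &= 2\beta\,(1 - 2\rho_1)\,\sigma^2, \\
\frac{\partial c}{\partial \sigma^2} &= 2\beta\,(1 - 2\rho_1) \ge 0, \qquad
\frac{\partial c}{\partial \beta} = 2\,(1 - 2\rho_1)\,\sigma^2 \ge 0, \qquad
\frac{\partial c}{\partial \rho_1} = -4\beta\,\sigma^2 < 0 .
\end{align*}
% The signs use beta > 0 and rho_1 <= 1/2, which holds because the lowest
% valence party cannot receive more than an equal share when all parties
% locate at the same position.
```

Under this form, a wider electoral distribution or a larger spatial weight raises the coefficient, while a higher probability of voting for the lowest valence party lowers it.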

###### 3.1.2. The 2005 and 2010 Elections in Great Britain

We study the 2005 and 2010 elections in the UK using the British Election Study (BES). (The full analysis of the 2005 and 2010 elections in Great Britain can be found in Schofield et al. [37].) The factor analysis conducted on the questions of the two surveys led us to conclude that the same two dimensions mattered in voter choices in the two elections. The first factor deals with issues on “EU membership,” “Immigrants,” “Asylum seekers,” and “Terrorism.” A voter who feels strongly about nationalism has a high value in the *nationalism* dimension (-axis). Items such as “tax/spend,” “free market,” “international monetary transfer,” “international companies,” and “worry about job loss overseas” have strong influence in the *economic* (-axis) dimension with higher values indicating a promarket attitude. Figures 4 and 5 present the smoothed electoral distribution obtained from these analyses for the 2005 and 2010 elections.

The electoral covariance matrix for the *2005 UK election* is
where .

From Table 1, the MNL estimates of the spatial model for the 2005 UK election are Both the Labour () and the Conservative () parties had a significantly higher valence than the Liberal Democrats (), the baseline party.

From (14), the probability that a voter chooses the Liberal Democratic Party, the lowest valence party, when all parties locate at the origin, , is

Given that and since in (33), from (15) the convergence coefficient, in Table 2, is Appendix A.1 shows that is significantly less than 1 and thus meets the sufficient, and hence the necessary, condition for convergence given in Section 2. From (17) the characteristic matrix of the Liberal Democratic Party is From the 95% confidence bounds in Appendix A.1, we conclude that if the Liberal Democratic Party locates at the origin, it is maximizing its vote share and has no incentive to vacate the center. Thus, with probability exceeding 95%, the origin is an LNE for the 2005 UK election.

The electoral covariance matrix for the *2010 UK election* is
where , lower than in 2005.

From Table 1, the MNL estimates of the spatial model of the 2010 election are Given the great popular discontent with Gordon Brown, the Labour leader, heading into the 2010 election, it is not surprising to find that both Conservatives and Liberal Democrats (the base party) had significantly higher valences than Labour.

From (14) the probability that a voter chooses Labour, when all parties locate at the origin, , is Since and in (38), from (15) the convergence coefficient, in Table 2, is The convergence coefficient is significantly less than 1 (see Appendix A.1), meeting the sufficient and thus necessary condition for convergence. From (17), Labour’s characteristic matrix is If Labour, the low valence party, locates at the origin, then with probability exceeding 95%, its vote share function is at a maximum (see Appendix A.1) giving it no incentive to move from the mean. Thus, with probability exceeding 95%, the electoral origin is an LNE for the 2010 UK election.

The major shift in voters’ preferences between the two elections led to very different electoral outcomes, as evidenced by the electoral covariance matrices in (33) and (38). Voter dissatisfaction with the governing Labour leader led to a dramatic decrease in his competence valence and in the probability of voting Labour. Even though the electoral variance fell in 2010 relative to 2005, the increase in the convergence coefficient meant that this lower variance was more than compensated by the lower probability of voting Labour in 2010. The analysis for the UK elections shows that the convergence coefficient reflects not only changes in the electoral distribution but also changes in voters’ valence preferences, as the convergence coefficient of the 2005 election is substantially lower than the one for the 2010 election.
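The offsetting effects described here can be illustrated by splitting the change in the convergence coefficient into multiplicative factors. The sketch assumes that (15) has the form c = 2β(1 − 2ρ)σ²; the two sets of inputs are hypothetical and are not the BES estimates. They are chosen so that the variance falls while the probability of voting for the lowest valence party also falls.

```python
def convergence_coefficient(beta, rho_low, sigma2):
    """Assumed form of (15): c = 2 * beta * (1 - 2 * rho) * sigma^2."""
    return 2.0 * beta * (1.0 - 2.0 * rho_low) * sigma2


# Hypothetical inputs for an earlier and a later election.
earlier = {"beta": 0.5, "rho_low": 0.32, "sigma2": 1.6}
later = {"beta": 0.5, "rho_low": 0.20, "sigma2": 1.4}

c_earlier = convergence_coefficient(**earlier)
c_later = convergence_coefficient(**later)

# Each factor is the ratio of one component of c between the two elections.
beta_factor = later["beta"] / earlier["beta"]
variance_factor = later["sigma2"] / earlier["sigma2"]
probability_factor = (1 - 2 * later["rho_low"]) / (1 - 2 * earlier["rho_low"])

print(f"c: {c_earlier:.3f} -> {c_later:.3f}")
print(f"variance factor = {variance_factor:.3f}, "
      f"probability factor = {probability_factor:.3f}")
```

A variance factor below one combined with a probability factor above one reproduces the qualitative pattern in the text: the coefficient rises even though the electorate is less dispersed.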

The analysis of these two Anglo-Saxon countries illustrates that even under plurality rule the convergence coefficient varies from election to election and from country to country. The analysis for the 2010 UK election highlights that candidates’ valences matter and that parties understand how their valence affects their electoral prospects and may adjust their positions to increase their votes. This section illustrates that under plurality, the convergence coefficient has low values that generally satisfy the necessary condition for convergence to the mean and is thus below the dimension of the policy space.

##### 3.2. Convergence in Proportional Systems?

We now estimate the convergence coefficients for three parliamentary countries using proportional representation: Israel, Turkey, and Poland. As is well known, these countries are characterized by multiparty elections in which generally no party wins a legislative majority, which leads to coalition governments. This section shows that these countries are characterized by very high convergence coefficients.

###### 3.2.1. The 1996 Election in Israel

In 1996, as in previous elections, approximately nineteen parties attained seats in the Knesset. (These include parties on the left, in the center, and on the right, as well as religious parties. On the left there are Labor, Meretz, Democrat, Communists, and Balad; those in the center include Olim, Third Way, Center, and Shinui; those on the right are Likud, Gesher, Tsomet, and Yisrael. The religious parties are Shas, Yahadut, NRP, Moledet, and Techiya.) These ranged from small parties with 2 seats to moderately large parties such as Likud and Labor, whose seat strengths lie in the range 19 to 44 out of a total of 120 Knesset seats. Since Likud and Labor compete for dominance of the coalition government, these large parties must maximize their seat strength. Moreover, Israel uses a highly proportional electoral system with close correspondence between seat and vote shares. Thus one can consider vote shares as the maximand for these parties.

Schofield et al. [30] performed a factor analysis of the surveys conducted by Arian and Shamir [38] for the 1996 Israeli election. The two dimensions identified by the factor analysis were Security (*x*-axis) and Religion (*y*-axis). “Security” refers to attitudes toward peace initiatives; “religion” to the significance of religious considerations in government policy. A voter on the left of the security axis is interpreted as supporting negotiations with the PLO, while higher values on the religious axis indicate support for the importance of the Jewish faith in Israel. The distribution of voters is shown in Figure 6.

Voter distribution along these two axes gives the following covariance matrix: giving a “variance” of .

Only the seven largest parties are included in the MNL estimation: Likud, Labor, NRP, Moledet, Third Way (TW), and Shas, with Meretz as the base party. From Table 3, the MNL coefficients for the 1996 election in Israel () are The -coefficient and the valence estimates for all parties are significantly nonzero. The two largest parties, Likud and Labor, have significantly higher valences than the smaller parties, with Third Way (TW) having the smallest valence.

From (14), the probability that an Israeli votes for TW, when all parties locate at the mean is

Given that and since from (43), then using (15) we compute the convergence coefficient for Israel, in Table 4, as

The 95% confidence intervals for in Appendix A.2 confirm that the necessary condition is *not* satisfied as is significantly higher than 2, the dimension of the policy space. Moreover, at the electoral mean the vote share function of Third Way is *not* at a maximum since its Hessian from (17)
shows that if TW locates at the mean its vote share function is at a saddlepoint since has one positive (2.453) and one negative eigenvalue. Appendix A.2 confirms that has one negative and one positive eigenvalue at both its lower and upper bounds. Thus, with a high degree of certainty, TW deviates from the mean to maximize its votes and the electoral mean is *not* an LNE for the 1996 Israeli election.
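The reasoning applied to the Hessian at its central estimate and at its confidence bounds can be sketched as a sign check on eigenvalues. The three matrices below are hypothetical stand-ins, not the Appendix A.2 values; the point is only that the same classification is run at each bound and the conclusion is accepted when all three agree.

```python
import math


def eigenvalues_sym2(m):
    """Eigenvalues (smaller, larger) of a symmetric 2x2 matrix."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    mid = (a + d) / 2.0
    rad = math.hypot((a - d) / 2.0, b)
    return mid - rad, mid + rad


def classify_critical_point(hessian, tol=1e-9):
    """Classify the origin by the signs of the Hessian's eigenvalues."""
    low, high = eigenvalues_sym2(hessian)
    if high < -tol:
        return "local maximum"
    if low > tol:
        return "local minimum"
    if low < -tol and high > tol:
        return "saddlepoint"
    return "degenerate"


# Hypothetical Hessians at the lower bound, central estimate, and upper bound.
hessians = {
    "lower bound": [[1.2, 0.3], [0.3, -0.6]],
    "central estimate": [[1.9, 0.5], [0.5, -0.3]],
    "upper bound": [[2.6, 0.7], [0.7, -0.1]],
}
labels = {name: classify_critical_point(h) for name, h in hessians.items()}
stays_at_mean = all(label == "local maximum" for label in labels.values())
for name, label in labels.items():
    print(f"{name}: {label}")
print("party has no incentive to move:", stays_at_mean)
```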

###### 3.2.2. The 1999 and 2002 Elections in Turkey

We used factor analysis of electoral survey data of Veri Arastima for TUSES to study the 1999 and 2002 Turkish elections. (See Schofield et al. [39] for details of the estimation.) The analysis indicates that voters made decisions in a two-dimensional space during the two elections. Voters who support secularism or “Kemalism” are placed on the left of the Religious () axis and those supporting Turkish nationalism () to the north. Figures 7 and 8 give the distribution of voters along these two dimensions surveyed in these two elections.

Minor differences between these two figures include the disappearance of the Virtue Party (FP), which was banned by the Constitutional Court in 2001, and the change of the name of the pro-Kurdish party from HADEP to DEHAP. (For simplicity, the pro-Kurdish party is denoted HADEP in the various figures and tables. Notice that the HADEP position in Figures 7 and 8 is interpreted as secular and nonnationalistic.) The most important change is the emergence of the new Justice and Development Party (AKP) in 2002, essentially substituting for the outlawed Virtue Party.

The parties included in the analysis of the 1999 election are the Democratic Left Party (DSP), the Nationalist Action Party (MHP), the Virtue Party (FP), the Motherland Party (ANAP), the True Path Party (DYP), the Republican People’s Party (CHP), and the People’s Democracy Party (HADEP). A DSP minority government formed, supported by ANAP and DYP. This lasted only about 4 months and was replaced by a DSP-ANAP-MHP coalition, indicating the difficulty of negotiating a coalition compromise across the disparate policy positions of the coalition members.

In *the 1999 election*, the electoral covariance matrix along the Religious () and Nationalism () axes is
with .

Using DYP as the base party, from Table 3, the 1999 MNL coefficients are The -coefficient and the valence estimates of DSP, MHP, and CHP are significantly nonzero. The probability that a Turkish voter chooses FP, the party with the lowest valence in 1999, when all parties locate at the mean, in (14), is

[Table 3: MNL estimates of the spatial model for Israel, Turkey, and Poland.]

Notes: significance levels are prob < 0.05, prob < 0.01, and prob < 0.001. Israel: Lik: Likud; Lab: Labor; NRP: Mafdal; Mo: Moledet; TW: Third Way. Poland: SLD: Democratic Left Alliance; PSL: Polish People's Party; UW: Freedom Union; AWS: Solidarity Election Action; UP: Labor Party; UPR: Union of Political Realism; ROP: Movement for Reconstruction of Poland; SO: Self Defense; PiS: Law and Justice; PO: Civic Platform; LPR: League of Polish Families; DEM: Democratic Party; SDP: Social Democracy of Poland. Turkey: DSP: Democratic Left Party; MHP: Nationalist Action Party; FP: Virtue Party; ANAP: Motherland Party; CHP: Republican People's Party; HADEP: People's Democracy Party; DYP: True Path Party.

[Table 4: Components of the convergence coefficient and their confidence bounds for Israel, Turkey, and Poland.]

Notes: Central Est.: central estimate. Conf. Int.: confidence intervals. Israel: TW: Third Way. Turkey: DYP: True Path Party. Poland: ROP: Movement for Reconstruction of Poland.

Given that and since in (48), then using (15), Turkey’s convergence coefficient in 1999, in Table 4, is
The convergence coefficient is significantly higher than 1 and significantly lower than 2 (see Appendix A.2). From (17) FP’s Hessian at the origin is
When at the electoral origin, FP’s characteristic matrix shows that its vote share function is at a saddlepoint as the eigenvalues of are with minor eigenvector and with major eigenvector . Moreover, as seen in Appendix A.2, the 95% confidence bounds show that at the lower bound of FP has no incentive to move but it does at the upper bound. Since FP wants to move at the central estimate of in (52), it is probable that in general FP wants to move away from the mean to increase its vote share. Moreover, since the convergence coefficient is significantly greater than 1, the sufficient condition for convergence is not met; combined with the saddlepoint at the central estimate, this indicates with a high degree of confidence that the electoral mean is *not* an LNE for Turkey in 1999.
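The minor and major eigenvectors mentioned above can be obtained in closed form for a symmetric 2×2 matrix. The sketch below uses a hypothetical Hessian, not the matrix in (52). At a saddlepoint the major eigenvector, associated with the positive eigenvalue, gives the direction along which a party can raise its vote share by leaving the origin.

```python
import math


def eigen_sym2(m):
    """Return [(minor eigenvalue, unit vector), (major eigenvalue, unit vector)]."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    if abs(b) < 1e-12:
        # Diagonal matrix: the coordinate axes are the eigenvectors.
        pairs = [(a, (1.0, 0.0)), (d, (0.0, 1.0))]
        return sorted(pairs, key=lambda pair: pair[0])
    mid = (a + d) / 2.0
    rad = math.hypot((a - d) / 2.0, b)
    pairs = []
    for lam in (mid - rad, mid + rad):
        vx, vy = b, lam - a                # solves (M - lam * I) v = 0 when b != 0
        norm = math.hypot(vx, vy)
        pairs.append((lam, (vx / norm, vy / norm)))
    return pairs


# Hypothetical Hessian with one negative and one positive eigenvalue.
H = [[0.4, 0.3], [0.3, -0.5]]
(minor_val, minor_vec), (major_val, major_vec) = eigen_sym2(H)
print(f"minor: {minor_val:.4f} along {minor_vec}")
print(f"major: {major_val:.4f} along {major_vec}")
```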

The electoral covariance matrix of the *2002 Turkish election* is
with .

Note that the covariance matrix of 1999 in (48) and that of 2002 in (53) suggest few changes in the distribution of voters between these two elections. Figures 7 and 8 suggest that there were few changes in party positions between these two elections. The basis of support for the AKP may be regarded as similar to that of the banned FP, suggesting that the leader of this party changed the party’s position on the religion axis, adopting a much less radical position. One would think of this as generating political stability in Turkey. Yet, between 1999 and 2002, Turkey experienced two severe economic crises and in 2002, a 10% electoral cut-off rule was instituted. The crises and the cut-off rule changed the political landscape in Turkey. In the 2002 election, seven parties obtained less than 10% of the vote and won no seats. The AKP won 34% of the vote, and due to the cut-off rule, obtained a majority of the seats (363 out of 550).

Our analysis reflects this change in the political landscape. Using DYP as the base party, from Table 3, the 2002 MNL coefficients are The -coefficient and the valences of AKP and CHP are significantly nonzero with ANAP having the lowest valence. The probability of voting ANAP, when parties locate at the mean, in (14), is

Given that and since from (53), then using (15) we find that the 2002 convergence coefficient for Turkey, in Table 4, is
The political changes induced by the cut-off rule led to a higher convergence coefficient in 2002 relative to 1999 (increasing from a low of in (51) to a high in (56)), an indication that a more fractionalized polity emerged from this reform. The convergence coefficient of the 2002 election is significantly above 2, the dimension of the policy space (see Appendix A.2), giving ANAP an incentive to locate far from the mean. ANAP’s characteristic matrix using (17) is
When at the origin, indicates that ANAP is minimizing its vote share since its eigenvalues are both positive (0.090 and 3.850). This together with the 95% confidence bounds in Appendix A.2 implies that there is a high probability that ANAP will vacate the center and that the mean is *not* an LNE for Turkey in 2002.

###### 3.2.3. The 1997 Polish Election

In the election held in Poland in 1997, five parties won seats in the Sejm (lower house). (In this election Poland used an open-list proportional representation electoral system with a threshold of 5% of the nationwide vote for parties and 8% for electoral coalitions. Votes are translated into seats using the D’Hondt method.) These were the left-wing ex-communist Democratic Left Alliance (SLD) and the agrarian Polish People’s Party (PSL), both of which have been the most frequent governing parties in the postcommunist period; the Freedom Union (UW) and the Solidarity Election Action (AWS), both of which grew out of the Solidarity movement; and the Movement for Reconstruction of Poland (ROP). AWS combined various mostly right-wing and Christian groups under one label, while UW was formed from the liberal wing of Solidarity.
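For readers unfamiliar with the seat allocation rule mentioned above, the following is a minimal sketch of the D’Hondt highest averages method with a vote share threshold. It treats the whole country as a single district and applies one threshold to every list, which simplifies the actual Polish rules (separate thresholds for parties and coalitions, allocation within districts). The party labels and vote counts are hypothetical.

```python
def dhondt(votes, seats, threshold=0.0):
    """Allocate `seats` by the D'Hondt method among lists at or above `threshold`.

    `votes` maps list name to vote count; `threshold` is a share of all votes cast.
    """
    total = sum(votes.values())
    eligible = {party: v for party, v in votes.items() if v / total >= threshold}
    allocation = {party: 0 for party in eligible}
    for _ in range(seats):
        # The next seat goes to the list with the highest quotient votes / (seats won + 1).
        winner = max(eligible, key=lambda p: eligible[p] / (allocation[p] + 1))
        allocation[winner] += 1
    return allocation


# Hypothetical vote counts for six lists and a ten-seat chamber.
votes = {"A": 340, "B": 280, "C": 160, "D": 60, "E": 40, "F": 120}
allocation = dhondt(votes, seats=10, threshold=0.05)
print(allocation)
```

In this hypothetical case list E falls below the threshold and is excluded before any quotients are computed, while list D clears the threshold but still wins no seat, which illustrates why thresholds and divisor rules both shape the seat shares of small parties.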

Applying factor analysis to questions from the Polish National Election Survey, we identified an economic and a social values dimension (see [40]). The *economic* dimension is influenced by issues such as privatization versus state ownership of enterprises, fighting unemployment versus keeping inflation and government expenditure under control, proportional versus flat income tax, support versus opposition to state subsidies to agriculture, and state versus individual social responsibility. The separation of church and state versus the influence of church over politics, complete decommunization versus equal rights for former nomenklatura, and abortion rights regardless of situation versus no such rights regardless of situation are the most influential issues in the *social values* dimension. The distribution of voters along these dimensions is seen in Figure 9. (See Schofield et al. [40] for details of the estimation.)

The covariance matrix for the *1997 Polish* () *election* is
with variance .

From Table 3, the MNL coefficients for the 1997 election are The -coefficient and valence estimates for all parties except UP and PSL are significantly nonzero. The probability of voting for UPR, the party with the lowest valence in 1997, when parties locate at the mean, in (14), is

Given that and since from (58), then using (15) the convergence coefficient for Poland, in Table 4, is
Appendix A.2 shows that is significantly greater than 2 and thus fails the necessary condition for convergence to the mean. UPR’s Hessian from (17) is
The trace (= 3.82), the determinant (= 5.80), and the eigenvalues of are positive. The 95% confidence bound of in Appendix A.2 also shows positive eigenvalues at the lower and upper bounds of . Thus, with a high degree of certainty, UPR locates far from the origin to maximize its votes and the electoral mean is *not* an LNE for the 1997 Polish election.
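The link between the trace, the determinant, and the signs of the eigenvalues, and between the trace and the convergence coefficient, can be written out as follows. The block assumes that (15) and (17) have the forms given in its opening comment, with ∇₀ denoting the electoral covariance matrix; these forms and symbols are our assumption about Section 2.

```latex
% Assumed forms, for a policy space of dimension w = 2:
%   c = 2\beta(1 - 2\rho_1)\sigma^2, \qquad C_1 = 2\beta(1 - 2\rho_1)\nabla_0 - I,
%   with \sigma^2 = \operatorname{tr}(\nabla_0).
\begin{align*}
\operatorname{tr}(C_1) &= \lambda_1 + \lambda_2
  = 2\beta(1 - 2\rho_1)\operatorname{tr}(\nabla_0) - 2 = c - 2, \\
\det(C_1) &= \lambda_1 \lambda_2 .
\end{align*}
% Sign patterns for a symmetric 2x2 matrix:
%   det > 0 and tr > 0  =>  both eigenvalues positive (vote share minimized);
%   det > 0 and tr < 0  =>  both eigenvalues negative (vote share maximized);
%   det < 0             =>  eigenvalues of opposite sign (saddlepoint).
% Hence c > 2 forces tr(C_1) > 0, so at least one eigenvalue is positive and
% the origin cannot be a local maximum for the lowest valence party.
```

This is the sense in which a convergence coefficient above the dimension of the policy space violates the necessary condition for convergence.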

Summarizing, in this section we examined three countries that use proportional representation. Their convergence coefficients are significantly higher than 2, the dimension of the policy space, and are also much higher than those of the US and the UK. A high convergence coefficient thus signals a high degree of political fractionalization in these multiparty parliamentary democracies.

##### 3.3. Convergence in Anocracies

We now study elections in Georgia, Russia, and Azerbaijan. In these partial democracies, or anocracies, the President/autocrat holds regular presidential and legislative elections while exerting undue influence on the elections. (The term “partial democracy” has been applied to new democracies lacking the full array of democratic institutions present in western democracies; see [41].) Anocracies lack important democratic institutions such as freedom of the press. Autocrats hold regular elections in an attempt to give their regime legitimacy. The autocrat “buys” legitimacy by rewarding supporters and opposition members with well-paid legislative positions and by giving legislators the ability to influence policies. Opposition parties participate in elections to become known political entities. This allows them to regularly communicate with voters. Their objective is to oust the autocrat either in a future election or through popular uprisings. We assume that opposition parties maximize their vote share even when understanding that there is little chance of ousting the autocrat in the election.

###### 3.3.1. The 2008 Georgian Election

We use the postelection survey conducted by GORBI-GALLUP International from March 19 through April 3, 2008, to build a formal model of the 2008 election in Georgia (see [42]). The factor analysis done on the survey questions determined that there were two dimensions describing voters’ attitudes towards democracy and the west. One dimension is strongly related to the respondents’ attitude toward the US, the EU, and NATO, with larger values in the West (-axis) dimension implying a stronger anti-western attitude. Along the democracy (-axis) dimension, larger values are associated with negative judgements on the current state of democratic institutions in Georgia, coupled with a demand for more democracy. The electoral distribution along these two dimensions is given in Figure 10. The points (S, G, P, N) in Figure 10 represent the estimated positions of the four candidates: Saakashvili (S), Gachechiladze (G), Patarkatsishvili (P), and Natelashvili (N). (See Schofield et al. [39] for details of the estimation.)

The 2008 electoral covariance matrix in the Democracy () and West () axes is with .

From Table 5, the MNL estimates of the 2008 election with Natelashvili as the base candidate are All coefficients are significantly nonzero showing Natelashvili as having the lowest valence.

[Table 5: MNL estimates of the spatial model, including the 2008 Georgian election.]

Notes: significance levels are prob < 0.05 and prob < 0.01.