The Convergence Coefficient across Political Systems
Formal work on the electoral model often suggests that parties or candidates should locate themselves at the electoral mean. Recent research has found no evidence of such convergence. In order to explain nonconvergence, the stochastic electoral model is extended by including estimates of electoral valence. We introduce the notion of a convergence coefficient, $c$. It has been shown that high values of $c$ imply that there is a significant centrifugal tendency acting on parties. We used electoral surveys to construct a stochastic valence model of the elections in various countries. We find that the convergence coefficient varies across elections in a country, across countries with similar regimes, and across political regimes. In some countries, the centripetal tendency leads parties to converge to the electoral mean. In others the centrifugal tendency dominates and some parties locate far from the electoral mean. In particular, for countries with proportional electoral systems, namely, Israel, Turkey, and Poland, the centrifugal tendency is very high. In the majoritarian polities of the United States and Great Britain, the centrifugal tendency is very low. In anocracies, the autocrat imposes limitations on how far from the origin the opposition parties can move.
Work on modeling elections has often assumed that the policy space was restricted to one dimension or that there were at most two parties [1, 2]. The extensive formal literature on electoral competition has typically been based on the assumption that parties or candidates adopt positions in order to win and has inferred that parties will converge to the electoral median, under deterministic voting in one dimension, or to the electoral mean in stochastic models.
In this paper we offer a formal stochastic model of elections that emphasizes the importance of the idea of valence and use this notion to provide an explanation of why vote maximizing political leaders in some countries will not adopt convergent policy positions at the mean of the electoral distribution. In the standard spatial model, candidate positions matter to voters. However, as Stokes [3, 4] has emphasized, the nonpolicy evaluations, or valences, of candidates by the electorate are just as important. (See also Clarke et al. [5, 6], Scotto et al. , and Clarke et al. .)
The main objective of this paper is to examine whether parties locate close to or far from the electoral mean (the electoral mean is the mean of voters’ ideal policies dimension by dimension) of various countries. We use Schofield’s stochastic electoral model as a unifying framework that allows us to compare parties’ positions across different political systems. In this model, parties respond to their partisan constituencies after taking into account the anticipated electoral outcome and the positions of other parties. Voters’ decisions depend on parties’ locations and on each party’s valence, the voters’ overall common evaluation of the ability of a party leader to provide good governance. Using this model we examine whether parties converge to the electoral mean in several elections in various countries under different political systems and use convergence, or the lack thereof, to classify political systems.
To examine whether parties converge to the electoral mean in each country in a particular election, we test whether any party has an incentive to stay at or move away from the electoral mean to increase its vote share. In formal voting theory, it is usual to define a “Nash equilibrium” as a vector of party positions with the property that no party may make a unilateral move so as to increase its vote share. We use a variant of this concept, namely a “local Nash equilibrium” (LNE), where we consider only marginal moves from the position. One of the standard results in formal theory is the mean voter theorem, where the “Nash equilibrium” of a spatial voting game under vote maximization is one where all parties position themselves at the electoral mean. (For variants of the theorem see Enelow and Hinich [10–12].) We call such a vector the electoral mean.
To study each party’s best response to the electoral situation it faces, we use the results presented by Schofield, who identifies a convergence coefficient, denoted $c$, whose value determines whether or not vote maximizing parties converge to the electoral mean. This coefficient depends on various parameters of the model. In particular it depends on the competence valences of the party leaders. Using $j = 1, \ldots, p$ to denote the parties, the valence of party $j$, $\lambda_j$, essentially measures the electoral perception of the “quality” of $j$, the voters’ overall common evaluation of the ability of $j$’s leader to provide good governance. The valence terms, $\lambda_j$, are assumed to be independent of the party’s positions and can be estimated as the intercept term in the appropriate stochastic model of the voter utility function. As Sanders et al. comment, valence theory is based on the assumption that “voters maximize their utilities by choosing the party that is best able to deliver policy success.” These valence terms measure the bias in favor of one or another of the party leaders.
The convergence coefficient, $c$, also depends on the weight that voters give to the policy differences they have with the various parties, $\beta$. Lastly, $c$ depends on the variance/covariance matrix of the electoral distribution, $\nabla_0$. By its construction, $c$ is dimensionless and thus independent of the units of measurement of the various parameters. We use the convergence coefficient to compare results across elections and countries and to classify political systems.
The convergence coefficient is a summary measure that provides an estimate of the centrifugal or centripetal forces acting on the parties. The Valence Theorem, presented in Section 2 (see Schofield for the proof of this result), shows that if the policy space is two-dimensional and if $c < 1$, then the sufficient condition for convergence to the mean has been met and the “local Nash equilibrium” (LNE) is one where all parties locate at the electoral mean. (The set of local Nash equilibria contains the set of Nash equilibria.) On the other hand, if $w$ is the dimension of the policy space and $c \ge w$, then the LNE, if it exists, will be one where at least one party will have an incentive to diverge from the electoral mean in order to maximize its vote share. Thus, the necessary condition for convergence to the mean is that $c < w$.
In essence, a high empirical convergence coefficient of an election is a convenient measure of the electoral incentive of a small, or low valence, party to move away from the electoral mean to its core constituency position. We can interpret a high value of the convergence coefficient as a measure of the centrifugal tendency exerted on parties pulling them away from the electoral mean. The convergence coefficient is therefore a convenient, simple, and intuitive way to examine whether parties will have an incentive to locate close to, or far from, the electoral mean. We will show that there is a strong connection between the values of the convergence coefficient and the nature of the political system under which parties operate.
We used preelection polls to study elections in several countries operating under different political regimes. The factor analysis done on preelection surveys showed that for all elections the policy space was two-dimensional, except in Azerbaijan, where it was one-dimensional. The positions of voters in this policy space were then estimated and their voting intentions used to estimate party positions. We then ran a multinomial logit (MNL) model for the election using the estimated party and voter positions. The intercepts of the MNL model give the valences of each party/leader. Following Schofield’s formal model, we rank parties according to their valence. Using these MNL estimates we calculate the convergence coefficient of the election and examine whether the party with the lowest valence has an incentive to locate close to or far from the electoral origin.
When comparing the convergence coefficients across countries we observe that in countries with proportional representation the convergence coefficient is high and that in countries with plurality systems or in anocracies it is low. This suggests that we can use the Valence Theorem and its associated convergence result to classify electoral systems.
The convergence coefficients for the 2005 and 2010 elections in the UK were not significantly different from 1, meeting the necessary condition for convergence to the mean. For the 2000, 2004, and 2008 US presidential elections, the convergence coefficient is significantly below 1 in 2000 and 2004, thus meeting the sufficient (and hence the necessary) condition for convergence, and is not significantly different from 1 in 2008, meeting only the necessary condition for convergence. We suggest that the centrifugal tendency in majoritarian polities like the United States and the United Kingdom is very low.
In contrast, the convergence coefficient gives an indication that the centrifugal tendency in Israel, Poland, and Turkey is very high. In these proportional representation systems with highly fragmented polities the convergence coefficients are significantly greater than 2 failing to meet the necessary condition for convergence to the electoral mean.
In the anocracies of Georgia, Russia, and Azerbaijan, where the President/autocrat dominates and controls who can run in legislative elections, the convergence coefficient is not significantly different from the dimension of the policy space (2 for Georgia and Russia and 1 for Azerbaijan), failing the necessary condition for convergence. While the analyses of Georgia and Azerbaijan show that not all parties converge to the mean, in Russia it is likely that they did. Thus, in Russia opposition parties found it difficult to diverge from the mean. Note that convergence in anocracies may not generate a stable equilibrium, as any change in the valence of the autocrat and the opposition may cause parties to diverge from the mean and may even lead to popular uprisings that bring about changes in the governing parties, such as in Georgia in previous elections or in the Arab revolutions.
We can also classify polities using the effective vote number and the effective seat number. (Fragmentation can be identified with the effective number. That is, let $H_v$ (the Herfindahl index) be the sum of the squares of the relative vote shares and let $env = H_v^{-1}$ be the effective number of party vote strength. In the same way we can define $ens$ as the effective number of party seat strength using shares of seats. See Laakso and Taagepera.) We examine how these two measures of fragmentation relate to the convergence coefficient for the polities we consider. The effective vote or seat numbers give an indication of the difficulty inherent in interparty negotiation over government. These two measures do not, however, address the fundamental aspect of democracy, namely, the electoral preferences for policy. Since convergence involves both preferences, in terms of the electoral covariance matrix, and the effect of the electoral system, we argue that the Valence Theorem and the associated convergence coefficient allow for a more comprehensive way of classifying polities and political systems precisely because they are derived from the fundamental characteristics of the electorate. That is, while we can use the effective vote and seat numbers to identify which polities are fragmented, the Valence Theorem and the convergence result can help us understand why parties locate close to or far from the electoral mean and how, under some circumstances, these considerations lead to political fragmentation.
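As a minimal sketch of the fragmentation measures just described, the effective number can be computed as the inverse of the Herfindahl index. The function name and the vote shares below are hypothetical and purely illustrative; they are not taken from any election studied in this paper.

```python
def effective_number(shares):
    """Effective number of parties: inverse of the Herfindahl index.

    `shares` may be raw votes, percentages, or seat counts; they are
    rescaled to relative shares before squaring.
    """
    total = sum(shares)
    if total <= 0:
        raise ValueError("shares must sum to a positive number")
    relative = [s / total for s in shares]
    herfindahl = sum(r * r for r in relative)  # sum of squared relative shares
    return 1.0 / herfindahl


# Hypothetical vote shares (illustrative only).
two_equal = effective_number([50, 50])
fragmented = effective_number([30, 25, 20, 15, 10])
```

Applying the same function to seat shares instead of vote shares gives the effective seat number.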
The next section presents Schofield’s stochastic formal model of elections and the implications it has for convergence to the mean. Section 3.1 applies the model to elections in two plurality polities: the United States and the United Kingdom. In Section 3.2 we apply the model to polities using proportional electoral systems, namely, Israel, Turkey, and Poland. Section 3.3 considers the convergence coefficients for three “anocracies”: Azerbaijan, Georgia, and Russia. Comparisons between different fragmentation measures and the convergence coefficient are examined in Section 4. Section 5 concludes the paper. In the appendix we estimate the confidence intervals for the convergence coefficient and determine whether the low valence party has an incentive to deviate from the electoral mean.
2. The Spatial Voting Model with Valence
Recent research on modelling elections has followed earlier work by Stokes [3, 4] and emphasized the notion of valence of political candidates. As Sanders et al.  comment, valence theory extends the spatial or Downsian model of elections by considering not just the policy positions of parties but also the parties’ rival attractions in terms of their perceived ability to handle the most serious problems that face the country. Thus, voters maximize their utilities by choosing the party that they think is best able to deliver policy success.
Schofield and Sened  have also argued that Valence relates to voters’ judgments about positively or negatively evaluated conditions which they associate with particular parties or candidates. These judgements could refer to party leaders’ competence, integrity, moral stance or “charisma” over issues such as the ability to deal with the economy and politics.
Valence theory has led to a considerable theoretical literature on voting based on the assumption that valence plays an important role in the relationship between party positioning and the votes that parties receive (Ansolabehere and Snyder, Groseclose, Aragones and Palfrey [19, 20], Schofield, Schofield et al., Miller and Schofield, Schofield and Miller, Peress). Empirical work, based on multinomial logit (MNL) methods, has also shown the importance of electoral judgements in analyses of elections in the United States and the United Kingdom (Clarke et al. [8, 26–28], Schofield, Schofield et al. [30, 31], Scotto et al.). These empirical models of elections have a “probabilistic” component. That is, they all assume that “voter utility” is partly “Downsian,” in the sense that it is based on the distance between party positions and voter preferred positions, and partly due to valence. The estimate of a party’s valence is assumed to be subject to a “stochastic error.” In this paper we use the same methodology.
The pure “Downsian” spatial model of voting tends to predict that parties will converge to the center of the electoral distribution [10–12]. However, when valence is included, the prediction is very different. To see this suppose there are two parties, A and B, and both choose the same position at the electoral center, but A has much higher valence than B. This higher valence indicates that voters have a bias towards party A and as a consequence more voters will choose A over B. The question for B is whether it can gain votes by moving away from the center. It should be obvious that the optimal position of both A and B will depend on the various estimated parameters of the model. To answer this question we now present the details of the spatial model.
2.1. The Theoretical Model
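To find the optimal party positions given the anticipated electoral outcome, we use a Downsian vote model that has a valence component, as presented in Schofield. Let the set of parties be denoted by $P = \{1, \ldots, p\}$. The positions of the parties (we use candidate, party, and agent interchangeably throughout the paper) in $X \subseteq \mathbb{R}^w$, where $w$ is the dimension of the policy space, are given by the vector
$$z = (z_1, \ldots, z_j, \ldots, z_p) \in X^p.$$
Denote voter $i$’s ideal policy by $x_i \in X$ and her utility by $u_i(x_i, z) = (u_{i1}(x_i, z_1), \ldots, u_{ip}(x_i, z_p))$, where
$$u_{ij}(x_i, z_j) = \lambda_j - \beta \|x_i - z_j\|^2 + \epsilon_j = u^*_{ij}(x_i, z_j) + \epsilon_j. \tag{2}$$
Here $u^*_{ij}(x_i, z_j)$ is the observable component of the utility voter $i$ derives from party $j$. The competence valence of candidate $j$ is $\lambda_j$, and the competence valence vector $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_p)$ is such that
$$\lambda_p \ge \lambda_{p-1} \ge \cdots \ge \lambda_2 \ge \lambda_1,$$
so that party 1 has the lowest valence. Note that $\lambda_j$ is the same for all voters and provides an estimate of the “quality” of party $j$ or its ability to govern. The term $\|x_i - z_j\|$ is simply the Euclidean distance between voter $i$’s position $x_i$ and candidate $j$’s position $z_j$. The coefficient $\beta$ is the weight given to this policy difference. The error vector $\epsilon = (\epsilon_1, \ldots, \epsilon_j, \ldots, \epsilon_p)$ has a Type I extreme value distribution, where the variance of $\epsilon_j$ is fixed at $\pi^2/6$. Note that $\beta$ has dimension $L^{-2}$, where $L$ is whatever unit of measurement is used in $X$.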
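Since voter behavior is modeled by a probability vector, the probability that voter $i$ chooses party $j$ when parties position themselves at $z$ is
$$\rho_{ij}(z) = \Pr\left[u_{ij}(x_i, z_j) > u_{il}(x_i, z_l) \text{ for all } l \neq j\right].$$
Here $\Pr$ stands for the probability operator generated by the distribution assumption on $\epsilon$. Thus, the probability that $i$ votes for $j$ is given by the probability that $u_{ij}(x_i, z_j) > u_{il}(x_i, z_l)$ for all $l \neq j$ in $P$, that is, that $i$ gets a higher utility from $j$ than from any other party.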
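Train showed that when the error vector $\epsilon$ has a Type I extreme value distribution, the probability $\rho_{ij}(z)$ has a multinomial logit (MNL) specification and can be estimated. Thus, for each voter $i$ and party $j$, the probability that voter $i$ chooses party $j$ at the vector $z$ is given by
$$\rho_{ij}(z) = \frac{\exp[u^*_{ij}(x_i, z_j)]}{\sum_{k=1}^{p} \exp[u^*_{ik}(x_i, z_k)]}. \tag{4}$$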
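The MNL probability in (4) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names and all numerical values (voter position, party positions, valences, and the spatial coefficient) are hypothetical and are not estimates from any of the surveys used here.

```python
import math


def observable_utility(x_i, z_j, lam_j, beta):
    """u*_ij = lambda_j - beta * ||x_i - z_j||^2, as in (2)."""
    dist2 = sum((a - b) ** 2 for a, b in zip(x_i, z_j))
    return lam_j - beta * dist2


def choice_probs(x_i, z, lam, beta):
    """MNL probabilities rho_ij(z) for one voter over all parties, as in (4)."""
    u = [observable_utility(x_i, z_j, lam_j, beta) for z_j, lam_j in zip(z, lam)]
    m = max(u)  # subtract the maximum for numerical stability
    e = [math.exp(v - m) for v in u]
    s = sum(e)
    return [v / s for v in e]


# Hypothetical two-party example in a two-dimensional policy space.
probs = choice_probs(
    x_i=(0.2, -0.2),
    z=[(0.0, 0.0), (1.0, 0.0)],
    lam=[0.0, 0.3],
    beta=0.8,
)
```

When two parties share both position and valence, the function returns equal probabilities, as the specification requires.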
Voters’ decisions are stochastic in this framework. (See, for example, the models of McKelvey and Patty. Note that there is a problem with the independence of irrelevant alternatives (IIA) assumption, which can be avoided using a probit model. However, Quinn et al. have shown that probit and logit models tend to give very similar results. Indeed the results given here for the logit model carry through for probit, though they are less elegant.) Even though parties cannot perfectly anticipate how voters will vote, they can estimate the expected vote share of party $j$ as the average of these probabilities as follows:
$$V_j(z) = \frac{1}{n} \sum_{i=1}^{n} \rho_{ij}(z), \tag{5}$$
where $n$ is the number of voters. We assume a party’s objective is to find the position that maximizes its expected vote share, as desired by “Downsian” opportunists. On the other hand, the party may desire to adopt a position that is preferred by the base of the party supporters, namely, the “guardians” of the party, as suggested by Roemer.
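A minimal sketch of the expected vote share, computed as the average of the voters' MNL choice probabilities, is given below. The voter sample, valences, and spatial coefficient are hypothetical, and the function name is an assumption introduced for illustration.

```python
import math


def vote_shares(voters, z, lam, beta):
    """Expected vote shares V_j(z): the average over voters of the MNL probabilities."""
    p = len(z)
    totals = [0.0] * p
    for x in voters:
        # Observable utility of each party for this voter.
        u = [
            lam[j] - beta * sum((a - b) ** 2 for a, b in zip(x, z[j]))
            for j in range(p)
        ]
        m = max(u)
        e = [math.exp(v - m) for v in u]
        s = sum(e)
        for j in range(p):
            totals[j] += e[j] / s
    return [t / len(voters) for t in totals]


# Hypothetical ideal points with mean zero, and both parties at the origin.
voters = [(-1.0, 0.0), (0.0, 1.0), (1.0, -1.0)]
shares = vote_shares(voters, z=[(0.0, 0.0), (0.0, 0.0)], lam=[0.0, 0.5], beta=1.0)
```

With both parties at the same position, the shares depend only on the valence difference, so the lower valence party receives the smaller share.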
We assume that parties can estimate how their vote shares would change if they marginally move their policy position. The local Nash equilibrium (LNE) is that vector of party positions such that no party may shift position by a small amount to increase its vote share. More formally, an LNE is a vector $z^* = (z_1^*, \ldots, z_p^*)$ such that each vote share $V_j$ is weakly locally maximized at the position $z_j^*$. To avoid problems with zero eigenvalues we also define a strict LNE (SLNE) to be a vector that strictly locally maximizes $V_j(z)$.
Using the estimated MNL coefficients we simulate these models and then relate any vector of party positions, $z$, to a vector of vote share functions $V(z) = (V_1(z), \ldots, V_p(z))$, predicted by the particular model with $p$ parties. Moreover, we can examine whether in equilibrium parties position themselves at the electoral mean. (The electoral mean or origin is the mean of all voters’ positions, normalized to zero, so that $\frac{1}{n}\sum_{i=1}^{n} x_i = 0$.) We call this vector the electoral mean.
Given the vector of policy positions $z$, and since the probability that voter $i$ votes for party $j$ is given by (4), the impact of a marginal change in $j$’s position on the probability that $i$ votes for $j$ is then
$$\left.\frac{d\rho_{ij}(z)}{dz_j}\right|_{z_{-j}} = 2\beta (x_i - z_j)\, \rho_{ij} (1 - \rho_{ij}), \tag{6}$$
where $z_{-j}$ indicates that we are holding the positions of all parties but $j$ fixed. The effect that $j$’s change in position has on the probability that $i$ votes for $j$ depends on the weight given to the policy differences with parties, $\beta$; on how likely $i$ is to vote for $j$, $\rho_{ij}$, and for any other party, $(1 - \rho_{ij})$; and on how far apart $i$’s ideal policy is from $j$’s, $(x_i - z_j)$.
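The marginal effect in (6) can be checked numerically against a finite difference of the MNL probability. The sketch below uses a one-dimensional policy space for brevity; all parameter values and function names are hypothetical.

```python
import math


def rho(x, z, lam, beta, j):
    """MNL probability that a voter at x chooses party j (one-dimensional space)."""
    u = [lam[k] - beta * (x - z[k]) ** 2 for k in range(len(z))]
    m = max(u)
    e = [math.exp(v - m) for v in u]
    return e[j] / sum(e)


def analytic_gradient(x, z, lam, beta, j):
    """Right-hand side of (6): 2*beta*(x - z_j)*rho_ij*(1 - rho_ij)."""
    r = rho(x, z, lam, beta, j)
    return 2.0 * beta * (x - z[j]) * r * (1.0 - r)


def numeric_gradient(x, z, lam, beta, j, h=1e-6):
    """Central finite difference in z_j, holding the other positions fixed."""
    up, down = list(z), list(z)
    up[j] += h
    down[j] -= h
    return (rho(x, up, lam, beta, j) - rho(x, down, lam, beta, j)) / (2.0 * h)


# Hypothetical values: voter at 0.7, parties at 0.0 and 0.5.
x, z, lam, beta = 0.7, [0.0, 0.5], [0.0, 0.4], 0.9
g_analytic = analytic_gradient(x, z, lam, beta, j=0)
g_numeric = numeric_gradient(x, z, lam, beta, j=0)
```

Because the voter lies to the right of party 0, the gradient is positive: moving toward the voter raises the probability of receiving her vote.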
From (5), party $j$ adjusts its position to maximize its expected vote share; that is, $j$’s first order condition is
$$\left.\frac{dV_j(z)}{dz_j}\right|_{z_{-j}} = \frac{1}{n}\sum_{i=1}^{n} \frac{d\rho_{ij}}{dz_j} = \frac{2\beta}{n} \sum_{i=1}^{n} \rho_{ij}(1-\rho_{ij})(x_i - z_j) = 0, \tag{7}$$
where the third term follows after substituting in (6). The FOC for party $j$ in (7) is satisfied when
$$\sum_{i=1}^{n} \rho_{ij}(1-\rho_{ij})(x_i - z_j) = 0, \tag{8}$$
so that the vote maximizing policy of the candidate for party $j$ (see Schofield for the proof) is
$$z_j^C = \sum_{i=1}^{n} \alpha_{ij} x_i, \quad \text{where } \alpha_{ij} \equiv \frac{\rho_{ij}(1-\rho_{ij})}{\sum_{k=1}^{n} \rho_{kj}(1-\rho_{kj})}. \tag{9}$$
Here $\alpha_{ij}$ represents the weight that party $j$ gives to voter $i$ when choosing its candidate’s vote maximizing policy. This weight depends on how likely $i$ is to vote for $j$, $\rho_{ij}$, and for any other party, $(1-\rho_{ij})$, relative to all voters. (For example, if all voters are equally likely to vote for $j$, say with probability $v$, then the weight party $j$ gives to voter $i$ in its vote maximizing policy is $1/n$; that is, the weight $j$ gives each voter is just the inverse of the population size.) Note that $\alpha_{ij}$ may be nonmonotonic in $\rho_{ij}$. To see this, exclude voter $i$ from the denominator of $\alpha_{ij}$. In the most common case, a voter $i$ who will for sure vote for $j$ receives a lower weight in $j$’s candidate position than a voter who will vote for $j$ only with an intermediate probability (an “undecided” voter). Party $j$ then caters to “undecided” voters by giving them a higher weight in its policy position. In the other case, $\alpha_{ij}$ increases in $\rho_{ij}$: if $j$ expects a large enough vote share (excluding voter $i$), it gives a core supporter (a voter who votes for sure for $j$) a higher weight in its policy position than it gives other voters, as there is no risk of doing so. The weights $\alpha_{ij}$ are endogenously determined in the model.
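The weights and the weighted-mean position described above can be sketched as follows. The function evaluates the right-hand side of (9) at a given vector of party positions; it does not search for a fixed point. The voter sample and parameter values are hypothetical, as is the function name.

```python
import math


def weights_and_position(voters, z, lam, beta, j):
    """Return (alpha_j, z_j_C): voter weights and weighted-mean position for party j.

    alpha_ij = rho_ij*(1 - rho_ij) / sum_k rho_kj*(1 - rho_kj), evaluated at z.
    """
    rho_j = []
    for x in voters:
        u = [
            lam[k] - beta * sum((a - b) ** 2 for a, b in zip(x, z[k]))
            for k in range(len(z))
        ]
        m = max(u)
        e = [math.exp(v - m) for v in u]
        rho_j.append(e[j] / sum(e))
    f = [r * (1.0 - r) for r in rho_j]
    total = sum(f)
    alpha = [v / total for v in f]
    dim = len(voters[0])
    z_c = tuple(sum(a * x[d] for a, x in zip(alpha, voters)) for d in range(dim))
    return alpha, z_c


# Hypothetical sample with mean zero; both parties share one position.
voters = [(-1.0, 0.0), (0.0, 1.0), (1.0, -1.0)]
alpha, z_c = weights_and_position(
    voters, z=[(0.3, 0.3), (0.3, 0.3)], lam=[0.0, 0.5], beta=1.0, j=0
)
```

When all parties share a position, every voter receives the same weight $1/n$ and the weighted mean coincides with the electoral mean.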
Note that since voter $i$’s utility depends on how far $x_i$ is from party $j$’s position $z_j$, the probability that $i$ votes for $j$ given in (4) and the expected vote share of the party given in (5) are influenced by the voters’ and parties’ positions in the policy space. That is, in the empirical models estimated below, the positions of voters and parties in the policy space, together with the valence estimates, influence voters’ electoral choices.
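Recall that we are interested in finding whether parties converge to or diverge from the electoral mean. Suppose that all parties locate at the same position, $z_j = z$ for all $j \in P$. Thus, from (2) we see that
$$u^*_{ij}(x_i, z) = \lambda_j - \beta\|x_i - z\|^2, \tag{10}$$
so the probability that $i$ votes for $j$ in (4) is given by
$$\rho_{ij}(z) = \rho_j = \left[\sum_{k=1}^{p} \exp(\lambda_k - \lambda_j)\right]^{-1}. \tag{11}$$
Clearly, in this case, $\rho_j$ is independent of voter $i$’s ideal point. Thus, from (9), the weight given by $j$ to each voter is also independent of voter $i$’s position and given by
$$\alpha_{ij} = \frac{\rho_j(1-\rho_j)}{n\,\rho_j(1-\rho_j)} = \frac{1}{n}, \tag{12}$$
so that $j$ gives each voter equal weight in its policy position. In this case, from (9), $j$’s candidate position is
$$z_j^C = \frac{1}{n}\sum_{i=1}^{n} x_i; \tag{13}$$
that is, $j$’s candidate position is to locate at the electoral mean, which we have placed at the electoral origin. Let $z_0 = (0, \ldots, 0)$ be the vector of party positions when all parties are at the electoral mean.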
Moreover, as (11) indicates, when parties locate at the mean, $z_0$, only valence differences between parties matter in voters’ choices. The probability that a generic voter votes for party 1 (the party with the lowest valence) is
$$\rho_1 = \left[1 + \sum_{k=2}^{p} \exp(\lambda_k - \lambda_1)\right]^{-1}. \tag{14}$$
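A minimal sketch of the choice probabilities at the joint mean, where only valence differences matter, is given below. The valence values are hypothetical and the function name is an assumption made for illustration.

```python
import math


def probs_at_mean(lam):
    """rho_j at the joint mean: [sum_k exp(lam_k - lam_j)]^{-1} for each party j."""
    return [1.0 / sum(math.exp(lk - lj) for lk in lam) for lj in lam]


# Hypothetical valences, ordered so that party 1 (index 0) has the lowest.
lam = [-0.4, 0.0, 0.6]
rho = probs_at_mean(lam)
rho_1 = rho[0]
```

Adding a constant to every valence leaves these probabilities unchanged, which is why one party can serve as the baseline with its valence normalized to zero.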
Using this spatial model, Schofield proved a Valence Theorem determining whether vote maximizing parties locate at the mean. The theorem showed that the spatial model is characterized by a convergence coefficient given by
$$c \equiv c(\lambda, \beta, \sigma^2) = 2\beta(1 - 2\rho_1)\sigma^2. \tag{15}$$
The convergence coefficient depends on $\beta$, the weight given to policy differences; on $\rho_1$, the probability that a generic voter votes for the lowest valence party at the vector $z_0$; and on $\sigma^2$, the electoral variance given by
$$\sigma^2 \equiv \operatorname{trace}(\nabla_0), \tag{16}$$
where $\nabla_0$ is the symmetric $w \times w$ electoral covariance matrix. ($\nabla_0$ is simply a description of the distribution of voter preferred points taken about the electoral mean.)
The convergence coefficient increases in $\beta$ and $\sigma^2$ (and in their product $\beta\sigma^2$) and decreases in $\rho_1$. As (14) indicates, $\rho_1$ decreases if the valence differences between party 1 and the other parties increase, that is, when the differences between $\lambda_1$ and $\{\lambda_2, \ldots, \lambda_p\}$ increase.
The Valence Theorem allows us to characterize polities according to the value of their convergence coefficient. The theorem states that when the sufficient condition for convergence to the electoral mean is met, that is, when $c < 1$, the LNE is one where all parties adopt the same position at the mean of the electoral distribution. A necessary condition for convergence to the electoral mean is that $c < w$, where $w$ is the dimension of the policy space. If $c \ge w$, then there may exist a nonconvergent LNE. Note that in this case, there may indeed be no LNE. However, there will exist a mixed strategy Nash equilibrium (MNE). In either of these two cases we expect at least one party will diverge from the electoral mean.
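The conditions of the Valence Theorem can be sketched as a small computation: the convergence coefficient is twice the spatial coefficient, times one minus twice the lowest valence party's probability at the mean, times the trace of the electoral covariance matrix. All inputs below are hypothetical rather than estimates from this paper, and the classification labels are our own shorthand.

```python
import math


def convergence_coefficient(beta, lam, cov):
    """c = 2*beta*(1 - 2*rho_1)*trace(cov), with rho_1 for the lowest valence party."""
    lam_1 = min(lam)
    rho_1 = 1.0 / sum(math.exp(lk - lam_1) for lk in lam)
    sigma2 = sum(cov[d][d] for d in range(len(cov)))  # electoral variance
    return 2.0 * beta * (1.0 - 2.0 * rho_1) * sigma2


def classify(c, w):
    """Shorthand labels for the sufficient (c < 1) and necessary (c < w) conditions."""
    if c < 1.0:
        return "sufficient condition met"
    if c < w:
        return "necessary condition met only"
    return "necessary condition fails"


# Hypothetical two-party, two-dimensional example.
cov = [[0.6, -0.2], [-0.2, 0.6]]
c_low = convergence_coefficient(beta=0.8, lam=[-0.4, 0.0], cov=cov)
c_high = convergence_coefficient(beta=1.5, lam=[-2.0, 0.0, 0.5], cov=cov)
label_low = classify(c_low, w=2)
label_high = classify(c_high, w=2)
```

The point estimate alone does not settle the classification in practice, since the comparison with 1 or $w$ has to take the confidence bounds of the coefficient into account.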
Note that $c$ is dimensionless, because $\beta\sigma^2$ has no dimension. In a sense $\beta\sigma^2$ is a measure of the polarization of the preferences of the electorate. Moreover, $\rho_1$ in (14) is a function of the distribution of beliefs about the competence of party leaders, which is a function of the differences $(\lambda_k - \lambda_1)$.
When some parties have a low valence, the probability that a generic voter votes for party 1 (the party with the lowest valence) when all parties locate at the origin, $\rho_1$ in (14), will tend to be small because the valence differences between party 1 and the other parties are sufficiently large. Thus, vote maximizing parties will not all converge to the electoral mean. In this case $(1 - 2\rho_1)$ will be close to 1. If $\beta\sigma^2$ is large because, for example, the electoral variance is large, then $c$ will be large, suggesting $c \ge w$. In this case, the low valence party has an incentive to move away from the origin to increase its vote share. This implies the existence of a centrifugal force pulling some parties away from the origin.
Thus, for $\beta\sigma^2$ sufficiently large so that $c \ge w$, we expect parties to diverge from the electoral center. Indeed, we expect those parties that exhibit the lowest valence to move further away from the electoral center, implying that the centrifugal force on parties will be significant. Thus, in fragmented polities with a polarized electorate, the nature of the equilibrium tends to maintain this centrifugal characteristic.
On the contrary, in a polity where there are no very small or low valence parties, $\rho_1$ will tend to $1/p$ and so $c$ will be small. In a polity with small $\beta\sigma^2$ and with low valence differences, so that $c < 1$, we expect all parties to converge to the center. In this case, we expect this centripetal tendency to be maintained.
The convergence coefficient is a way of characterizing the Hessian (the $w$ by $w$ matrix of second derivatives of the vote share function) of party 1, the party with the lowest valence. The Hessian of the vote share function of party 1 is given by the characteristic matrix
$$C_1 = 2\beta(1 - 2\rho_1)\nabla_0 - I. \tag{17}$$
Here $I$ is a $w$ by $w$ identity matrix and the other terms are as before. The eigenvalues of $C_1$ determine whether the vote share function of party 1 will be at a maximum, minimum, or at a saddlepoint at the electoral mean. If $C_1$ shows that party 1 is at a minimum or at a saddlepoint at the mean, then party 1 has an incentive to locate away from the mean to increase its vote share. When all parties are at the mean and $c < 1$, then all eigenvalues of the Hessian of the vote share function of the lowest valence party are negative, indicating that the vote share function is at a maximum. The LNE must then be at the electoral mean.
For an arbitrary dimension, $w$, if $c < w$ in (15), then $\operatorname{trace}(C_1) = c - w < 0$. In the two-dimensional case, if $c < 1$, then $\det(C_1)$ must be positive, implying that both eigenvalues of $C_1$ are negative. It then follows that all $C_j$ have negative eigenvalues, giving an SLNE and thus an LNE at the electoral mean. (This result follows from the application of the triangle inequality to the determinant. A parallel result can be obtained in more than two dimensions.)
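The eigenvalue argument can be illustrated for the two-dimensional case with a short sketch that builds the characteristic matrix of the lowest valence party and computes its eigenvalues in closed form from the trace and determinant. The covariance matrix and parameter values are hypothetical, and the function names are assumptions.

```python
import math


def characteristic_matrix(beta, rho_1, cov):
    """C_1 = 2*beta*(1 - 2*rho_1)*cov - I for a 2x2 covariance matrix."""
    k = 2.0 * beta * (1.0 - 2.0 * rho_1)
    return [
        [k * cov[0][0] - 1.0, k * cov[0][1]],
        [k * cov[1][0], k * cov[1][1] - 1.0],
    ]


def eigenvalues_sym2(m):
    """Eigenvalues (low, high) of a symmetric 2x2 matrix via trace and determinant."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = math.sqrt(max(tr * tr / 4.0 - det, 0.0))
    return tr / 2.0 - disc, tr / 2.0 + disc


cov = [[0.6, -0.2], [-0.2, 0.6]]  # hypothetical electoral covariance matrix

# Case with c < 1: both eigenvalues negative, so the mean is a local maximum.
C_conv = characteristic_matrix(beta=0.8, rho_1=0.4, cov=cov)
eig_conv = eigenvalues_sym2(C_conv)

# Case with c > 2: an eigenvalue is positive, so the mean is not a local maximum.
C_div = characteristic_matrix(beta=3.0, rho_1=0.1, cov=cov)
eig_div = eigenvalues_sym2(C_div)
```

In both cases the trace of the matrix equals the convergence coefficient minus the dimension, which is the link between the necessary condition and the eigenvalues.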
The Valence Theorem asserts that if $c \ge w$ then the party with the lowest valence has an incentive to move away from the electoral mean to increase its vote share. When this is the case, other low valence parties may also find it advantageous to vacate the center. The value of the convergence coefficient, together with the analysis of the Hessians of the low valence parties, allows us to identify which parties have an incentive to move away from the electoral mean. The convergence coefficient then gives an easy and intuitive way to identify whether a low valence party should vacate the electoral mean.
In the next section, we estimate the convergence coefficient of various elections in different countries.
3. MNL Models of the Elections of Various Countries
We use the framework of the spatial model presented in Section 2 as a unifying methodology that allows us to study convergence across elections, countries, and political regimes. The Valence Theorem leads to the convergence coefficient of the election, a summary statistic that determines whether parties converge to or diverge from the electoral mean. Using this formal multinomial logit (MNL) spatial model, we now estimate the convergence coefficient for the elections in various countries. For each MNL estimation we choose a baseline party and normalize its coefficients to zero, then estimate the coefficients of all other parties relative to those of the baseline party. Using these coefficients we estimate the convergence coefficient and the characteristic matrix of the low valence parties to determine whether these parties converge to or diverge from the electoral mean in each election for each country. (These elections were studied in depth elsewhere. In this paper, we present only the calculations leading to the convergence coefficient and estimate the confidence intervals for the convergence coefficients that were not provided in earlier work.)
We study convergence under three political regimes (plurality, proportional representation, and anocracy) and group countries according to the similarities of their political regimes. Under plurality rule, we examine elections in two Anglo-Saxon countries: the US and the UK; under proportional representation we study Israel, Turkey, and Poland; and under anocracy, Georgia, Russia, and Azerbaijan. Since we use the same unifying methodology for all countries we present the methodology for the first elections in detail then condense the analysis to its basic components for the remaining countries. For each country we give a general description of the analysis and direct the reader to the full analysis of each election in the detailed country paper. We summarize the results across countries in various tables.
3.1. Convergence in Plurality Systems
We begin our analysis by examining the United States and the United Kingdom. Elections in these countries are carried out under plurality rule. We show that the electoral system in these countries produces relatively low convergence coefficients. (Relative to the convergence coefficient of other countries included in this study. In Section 4 we discuss how the values of the convergence coefficient are related to the political systems under which the countries operate.)
3.1.1. The 2000, 2004, and 2008 Elections in the United States
We construct stochastic models of the 2000, 2004, and 2008 US presidential elections using survey data taken from the American National Election Surveys (ANES). The factor analysis done on ten survey questions taken from the ANES (see Schofield et al. [30, 31] for the list of survey questions, the factor loadings, and the full analysis of the US elections) led us to conclude that voters’ preferences can be represented along the economic ($x$-axis) and social ($y$-axis) dimensions for all three elections. Voters located on the left of the economic axis are pro-redistribution. The social axis is determined by attitudes to abortion and gays. We interpreted greater values along this axis to mean more support for certain civil rights. Using the factor loadings we estimated each voter’s position in these two dimensions. Figures 1, 2, and 3 give a smoothing of the estimated voter distribution of the 2000, 2004, and 2008 elections, respectively.
Voters’ ideal points in the 2000 US election are characterized by the following electoral covariance matrix: The trace of electoral covariance matrix is . Given the negative covariance between these two dimensions, , the correlation between these two factors is .
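As an illustration of how the trace and the correlation follow from the entries of a two-dimensional covariance matrix, consider the sketch below. The matrix is hypothetical and is not the estimated matrix for the 2000 US election; the function name is likewise an assumption.

```python
import math


def trace_and_correlation(cov):
    """Return (trace, correlation) for a 2x2 covariance matrix."""
    var_x, var_y = cov[0][0], cov[1][1]
    cov_xy = cov[0][1]
    return var_x + var_y, cov_xy / math.sqrt(var_x * var_y)


# Hypothetical covariance matrix with a negative off-diagonal term.
cov = [[0.6, -0.2], [-0.2, 0.6]]
sigma2, corr = trace_and_correlation(cov)
```

The trace is the electoral variance entering the convergence coefficient, while the sign of the correlation simply follows the sign of the covariance.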
Using the spatial model presented in Section 2, we estimated the MNL model of the 2000 election. The coefficients for the 2000 US election are shown in Table 1. Bush’s competence valence, $\lambda_{rep}$, measures the common perception that voters in the sample have of Bush’s ability to govern and represents the nonpolicy component in the voter’s utility function in (2). As seen in Table 1, for the 2000 election Bush has a statistically significantly lower valence than Gore, the Democratic (baseline) candidate. Bush’s negative valence indicates that voters regarded him as less able to govern than Gore, once policy differences are taken into account.
To find the convergence coefficient for this election, we assume that all parties locate at the electoral mean so that parties differ only in their valence terms (see Section 2). We can use (14) and the coefficients in (19) to estimate the probability that a typical US voter chooses to vote for the low valence Republican candidate when both Bush and Gore locate at the origin, $z_0$; that is, with Gore as the baseline candidate, $\rho_{rep} = [1 + \exp(\lambda_{dem} - \lambda_{rep})]^{-1}$. We found the estimate for $\rho_{rep}$ using the MNL valence estimates. Note that since the central estimates of $\lambda_{rep}$ and $\beta$ given by the MNL regressions depend on the sample of voters surveyed, so does $\rho_{rep}$. Thus, to make inferences from empirical models we need the 95% confidence bounds of $\rho_{rep}$. In the introduction of the appendix we derive the methodology used to find the confidence bounds of $\rho_{rep}$. The bounds of $\rho_{rep}$ are calculated in Appendix A.1.
The results indicate that in the 2000 election, both candidates found it in their best interest to locate at the electoral mean. To see this, we compute the convergence coefficient using (15) and the electoral covariance matrix in (18) to determine whether the two parties converge to, or diverge from, the electoral mean.
Using (19) and (20) we have that and from (18) the trace is so that using (15) the convergence coefficient for 2000 US election is Appendix A.1 shows that is significantly less than 1 implying that meets the sufficient and thus necessary condition for convergence to the electoral mean given in Section 2.
To check whether Bush, the low valence candidate, has an incentive to stay at the electoral origin, , that is, whether Bush’s vote share function is at a maximum at , we use the Hessian or characteristic matrix (of second order conditions) of Bush’s vote share function using (17) at as follows: Because the characteristic matrix for Bush is estimated using the MNL coefficients of the 2000 US sample, depends on the sample of voters surveyed. The confidence bounds on in Appendix A.1 suggest that if Bush positions himself at the electoral origin, then with probability exceeding 95%, his vote share function would be at a maximum. We infer that, with probability exceeding 95%, the origin is an LNE for the spatial model for the 2000 US election. The valence differences between Bush and Gore are not large enough to cause either of them to move from the origin. The unique local Nash equilibrium was one where both candidates converge to the electoral origin in order to maximize their vote shares.
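The second order check described above can be sketched in a few lines. The sketch assumes that the characteristic matrix in (17) has the form C = 2β(1 − 2ρ)∇ − I, where ∇ is the electoral covariance matrix and I the identity; this form and all numerical inputs are assumptions for illustration, not the estimates reported in the tables. The closed-form eigenvalue routine applies only to symmetric 2×2 matrices.

```python
import math

def characteristic_matrix(beta, rho, cov):
    """Assumed form of (17): C = 2*beta*(1 - 2*rho)*cov - I."""
    k = 2.0 * beta * (1.0 - 2.0 * rho)
    n = len(cov)
    return [[k * cov[i][j] - (1.0 if i == j else 0.0) for j in range(n)]
            for i in range(n)]

def eigenvalues_sym2(m):
    """Eigenvalues (smaller, larger) of a symmetric 2x2 matrix in closed form."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    mid = (a + d) / 2.0
    rad = math.hypot((a - d) / 2.0, b)
    return mid - rad, mid + rad

def classify(eigs):
    """Classify the critical point of the vote share function at the origin."""
    if any(e == 0 for e in eigs):
        return "degenerate"
    if all(e < 0 for e in eigs):
        return "local maximum"
    if all(e > 0 for e in eigs):
        return "local minimum"
    return "saddle point"

# Hypothetical inputs (NOT the paper's estimates).
beta, rho_low = 0.8, 0.4
cov = [[0.6, -0.1], [-0.1, 0.5]]

C = characteristic_matrix(beta, rho_low, cov)
eigs = eigenvalues_sym2(C)
print(eigs, classify(eigs))
```

Two negative eigenvalues mean the low valence candidate's vote share is locally maximized at the origin; a positive eigenvalue identifies a direction along which moving away raises the vote share.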
All the components needed to derive the convergence coefficient for 2000 US election and its confidence bounds are summarized in Table 2.
Bush faced Kerry as the Democratic candidate in the 2004 US election. The distribution of voters in 2004 gives the following electoral covariance matrix along the economic and social dimensions: While the covariance between economic and social axes differs, the trace is similar to that in the 2000 US election.
From Table 1, the MNL estimates of the spatial model for the 2004 US election are Bush has a significantly lower valence () than Kerry (), the baseline candidate.
From (14) the probability that a US voter chooses Bush, the low valence candidate, when both Bush and Kerry are at the electoral origin, , is The confidence bounds for are given in Appendix A.1. Since Bush’s valence, relative to that of his opponent, was similar in the two elections, it is not surprising that the probability of voting Republican is similar in the two elections (compare (20) and (25)). From (15), and , so that the convergence coefficient of the 2004 election is Since is significantly less than 1 (see Appendix A.1), the sufficient condition for convergence given in Section 2 is met. Moreover, from (17) Bush’s characteristic matrix is If Bush positions himself at the electoral origin, then with probability exceeding 95% (see Appendix A.1), his vote share function would be at a maximum. Bush, the low valence candidate, then has no incentive to move from the origin, . With probability exceeding 95%, the mean is an LNE for the model of the 2004 US election.
Our analysis suggests that Obama’s victory over McCain in the 2008 US election was the result of an overall shift in the relative valences of the Democratic and Republican candidates as compared to those of the candidates in the 2000 and 2004 elections. The electoral covariance matrix for the sample in 2008 along the economic and social dimensions is Relative to the two previous elections the “variance” of the electoral distribution increased, while the covariance between these dimensions decreased.
The MNL estimates of the spatial model given in Table 1 for the 2008 US election are Obama, the baseline candidate, has a significantly higher valence than McCain.
From (14) the probability that a voter chooses McCain, when both candidates are at the origin, , is From (15), , and , so the convergence coefficient is Appendix A.1 shows that is significantly greater than 1 and significantly less than 2. The Valence Theorem then states that the necessary but not the sufficient condition for convergence has been met. To check whether the low valence candidate, McCain, has an incentive to move from the electoral mean, we examine McCain’s characteristic matrix using (17) to get With probability exceeding 95% (see Appendix A.1) McCain’s vote share function is at a maximum when he locates at the origin, and thus has no incentive to move. Thus, with probability exceeding 95%, the electoral origin is an LNE for the spatial model for the 2008 US election.
In conclusion, Table 2 illustrates that the convergence coefficient varies across elections in the same country even when there are only two parties. This is to be expected since, from (15), the convergence coefficient depends on the “variance” of the electoral distribution, ; on the weight voters give to differences with parties’ policies, ; and on the probability that a voter chooses the party with the lowest valence, . The electoral distributions of the 2000 and 2004 elections are quite similar, as can be seen by comparing (18) and (23). Voters’ preferences had, however, changed substantially by 2008; see (28). The electoral variance along both axes increased relative to 2000 and 2004. While the 2000 and 2004 convergence coefficients are indistinguishable from each other, the 2008 coefficient is significantly different from that in 2000 and 2004. In spite of these differences, candidates in all three elections had no incentive to move from the origin.
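The dependence of the coefficient on these three components can be made concrete with a small sketch. It assumes that (15) has the form c = 2β(1 − 2ρ)σ², with σ² the trace of the electoral covariance matrix; the formula is stated here as an assumption, and both scenarios use hypothetical numbers rather than the estimates in Tables 1 and 2.

```python
def convergence_coefficient(beta, rho_low, sigma2):
    """Assumed form of (15): c = 2*beta*(1 - 2*rho_low)*sigma2, where
    beta    is the weight voters place on policy distance,
    rho_low is the probability of choosing the lowest valence party at the origin,
    sigma2  is the trace of the electoral covariance matrix."""
    return 2.0 * beta * (1.0 - 2.0 * rho_low) * sigma2

# Two hypothetical elections (NOT estimates from the paper).
c_first = convergence_coefficient(beta=1.0, rho_low=0.40, sigma2=1.5)
c_second = convergence_coefficient(beta=1.0, rho_low=0.25, sigma2=1.0)

print(f"first election:  c = {c_first:.2f}")
print(f"second election: c = {c_second:.2f}")
```

In this illustration the electoral variance falls between the two elections, yet the coefficient rises because the lowest valence party's probability drops by enough to more than offset it. A change in relative valences alone can therefore move the coefficient.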
3.1.2. The 2005 and 2010 Elections in Great Britain
We study the 2005 and 2010 elections in the UK using the British Election Study (BES). (The full analysis of the 2005 and 2010 elections in Great Britain can be found in Schofield et al. .) The factor analysis conducted on the questions of the two surveys led us to conclude that the same two dimensions mattered in voter choices in the two elections. The first factor deals with issues on “EU membership,” “Immigrants,” “Asylum seekers,” and “Terrorism.” A voter who feels strongly about nationalism has a high value in the nationalism dimension (-axis). Items such as “tax/spend,” “free market,” “international monetary transfer,” “international companies,” and “worry about job loss overseas” have strong influence in the economic (-axis) dimension with higher values indicating a promarket attitude. Figures 4 and 5 present the smoothed electoral distribution obtained from these analyses for the 2005 and 2010 elections.
The electoral covariance matrix for the 2005 UK election is where .
From Table 1, the MNL estimates of the spatial model for the 2005 UK are Both the Labour () and the Conservative () parties had a significantly higher valence than the Liberal Democrats (), the baseline party.
From (14), the probability that a voter chooses the Liberal Democratic Party, the lowest valence party, when all parties locate at the origin, , is
Given that and since in (33), from (15) the convergence coefficient, in Table 2, is Appendix A.1 shows that is significantly less than 1 and thus meets the sufficient and thus necessary condition for convergence given in Section 2. From (17) the characteristic matrix of the Liberal Democratic Party is From the 95% confidence bounds in Appendix A.1, we conclude that if the Liberal Democratic Party locates at the origin, it is maximizing its vote share and has no incentive to vacate the center. Thus, with probability exceeding 95%, the origin is an LNE for the 2005 UK election.
The electoral covariance matrix for the 2010 UK election is where , lower than in 2005.
From Table 1, the MNL estimates of the spatial model of the 2010 election are Given the great popular discontent with Gordon Brown, the Labour leader, heading into the 2010 election, it is not surprising to find that both Conservatives and Liberal Democrats (the base party) had significantly higher valences than Labour.
From (14) the probability that a voter chooses Labour, when all parties locate at the origin, , is Since and in (38), from (15) the convergence coefficient, in Table 2, is The convergence coefficient is significantly less than 1 (see Appendix A.1), meeting the sufficient and thus necessary condition for convergence. From (17), Labour’s characteristic matrix is If Labour, the low valence party, locates at the origin, then with probability exceeding 95%, its vote share function is at a maximum (see Appendix A.1) giving it no incentive to move from the mean. Thus, with probability exceeding 95%, the electoral origin is an LNE for the 2010 UK election.
The major shift in voters’ preferences between the two elections led to very different electoral outcomes as evidenced by the electoral covariance matrices in (33) and (38). Voter dissatisfaction with the governing Labour leader led to a dramatic decrease in his competence valence and in the probability of voting Labour. Even though the electoral variance fell in 2010 relative to 2005, the convergence coefficient increased, meaning that the lower variance was more than offset by the lower probability of voting Labour in 2010. The analysis for the UK elections shows that the convergence coefficient reflects not only changes in the electoral distribution but also changes in voters’ valence preferences, as the convergence coefficient of the 2005 election is substantially lower than the one for the 2010 election.
The analysis of these two Anglo-Saxon countries illustrates that even under plurality rule the convergence coefficient varies from election to election and from country to country. The analysis for the 2010 UK election highlights that candidates’ valences matter and that parties understand how their valence affects their electoral prospects and may adjust their positions to increase their votes. This section illustrates that under plurality the convergence coefficient takes low values that lie below the dimension of the policy space and thus generally satisfy the necessary condition for convergence to the mean.
3.2. Convergence in Proportional Systems?
We now estimate the convergence coefficients for three parliamentary countries using proportional representation: Israel, Turkey, and Poland. As is well known, these countries are characterized by multiparty elections in which generally no party wins a legislative majority, leading to coalition governments. This section shows that these countries are characterized by very high convergence coefficients.
3.2.1. The 1996 Election in Israel
In 1996, as in previous elections, Israel had approximately nineteen parties attaining seats in the Knesset. (These include parties on the left, in the center, and on the right, as well as religious parties. On the left there are Labor, Merets, Democrat, Communists, and Balad; those in the center include Olim, Third Way, Center, and Shinui; those on the right are Likud, Gesher, Tsomet, and Yisrael. The religious parties are Shas, Yahadut, NRP, Moledet, and Techiya.) Parties ranged from small ones with 2 seats to moderately large ones such as Likud and Labor, whose seat strengths lie in the range 19 to 44, out of a total of 120 Knesset seats. Since Likud and Labor compete for dominance of coalition government, these large parties must maximize their seat strength. Moreover, Israel uses a highly proportional electoral system with close correspondence between seat and vote shares. Thus one can consider vote shares as the maximand for these parties.
Schofield et al.  performed a factor analysis of the surveys conducted by Arian and Shamir  for the 1996 Israeli election. The two dimensions identified by the factor analysis were Security (-axis) and Religion (-axis). “Security” refers to attitudes toward peace initiatives; “religion” to the significance of religious considerations in government policy. A voter on the left of the security axis is interpreted as supporting negotiations with the PLO, while higher values on the religious axis indicates support for the importance of the Jewish faith in Israel. The distribution of voters is shown in Figure 6.
Voter distribution along these two axes gives the following covariance matrix: giving a “variance” of .
Only the seven largest parties are included in the MNL estimation. These include Likud, Labor, NRP, Moledet, Third Way (TW), and Shas, with Merets being the base party. From Table 3, the MNL coefficients for the 1996 election in Israel () are The -coefficient and the valence estimates for all parties are significantly nonzero. The two largest parties, Likud and Labor, have significantly higher valences than the other smaller parties, with Third Way (TW) having the smallest valence.
From (14), the probability that an Israeli votes for TW, when all parties locate at the mean is
The 95% confidence intervals for in Appendix A.2 confirm that the necessary condition is not satisfied as is significantly higher than 2, the dimension of the policy space. Moreover, at the electoral mean the vote share function of Third Way is not at a maximum: its Hessian from (17) shows that if TW locates at the mean its vote share function is at a saddlepoint, since has one positive (2.453) and one negative eigenvalue. Appendix A.2 confirms that has one negative and one positive eigenvalue at both its lower and upper bounds. Thus, with a high degree of certainty TW deviates from the mean to maximize its votes and the electoral mean is not an LNE for the 1996 Israeli election.
3.2.2. The 1999 and 2002 Elections in Turkey
We used factor analysis of electoral survey data of Veri Arastima for TUSES to study the 1999 and 2002 Turkish elections. (See Schofield et al.  for details of the estimation.) The analysis indicates that voters made decisions in a two-dimensional space during the two elections. Voters who support secularism or “Kemalism” are placed on the left of the Religious () axis and those supporting Turkish nationalism () to the north. Figures 7 and 8 give the distribution of voters along these two dimensions surveyed in these two elections.
Minor differences between these two figures include the disappearance of the Virtue Party (FP), which was banned by the Constitutional Court in 2001, and the change of the name of the pro-Kurdish party from HADEP to DEHAP. (For simplicity, the pro-Kurdish party is denoted HADEP in the various figures and tables. Notice that the HADEP position in Figures 7 and 8 is interpreted as secular and nonnationalistic.) The most important change is the emergence of the new Justice and Development Party (AKP) in 2002, essentially substituting for the outlawed Virtue Party.
The parties included in the analysis of the 1999 election are the Democratic Left Party (DSP), the National Action Party (MHP), the Virtue Party (FP), the Motherland Party (ANAP), the True Path Party (DYP), the Republican People’s Party (CHP), and the People’s Democratic Party (HADEP). A DSP minority government formed, supported by ANAP and DYP. This only lasted about 4 months and was replaced by a DSP-ANAP-MHP coalition, indicating the difficulty of negotiating a coalition compromise across the disparate policy positions of the coalition members.
In the 1999 election, the electoral covariance matrix along the Religious () and Nationalism () axes is with .
Using DYP as the base party, from Table 3, the 1999 MNL coefficients are The -coefficient and the valence estimates of DSP and MHP and CHP are significantly nonzero. The probability that a Turkish voter chooses FP with lowest valence in 1999, when all parties locate at the mean, in (14), is
Given that and since in (48), then using (15), Turkey’s convergence coefficient in 1999, in Table 4, is The convergence coefficient is significantly higher than 1 and significantly lower than 2 (see Appendix A.2). From (17) FP’s Hessian at the origin is When at the electoral origin, FP’s characteristic matrix shows that its vote share function is at a saddlepoint as the eigenvalues of are with minor eigenvector and with major eigenvector . Moreover, as seen in Appendix A.2, the 95% confidence bounds show that at the lower bound of FP has no incentive to move but it does at the upper bound. Since FP wants to move at the central estimate of in (52) it is probable that in general FP wants to move away from the mean to increase its vote share. Moreover, since the convergence coefficient is significantly greater than 1, the sufficient condition for convergence fails; together with the saddlepoint at the central estimate, this indicates with a high degree of confidence that the electoral mean is not an LNE for Turkey in 1999.
The electoral covariance matrix of the 2002 Turkish election is with .
Note that the covariance matrix of 1999 in (48) and that of 2002 in (53) suggest few changes in the distribution of voters between these two elections. Figures 7 and 8 suggest that there were few changes in party positions between these two elections. The basis of support for the AKP may be regarded as similar to that of the banned FP, suggesting that the leader of this party changed the party’s position on the religion axis, adopting a much less radical position. One would think of this as generating political stability in Turkey. Yet, between 1999 and 2002, Turkey experienced two severe economic crises and in 2002, a 10% electoral cut-off rule was instituted. The crises and the cut-off rule changed the political landscape in Turkey. In the 2002 election, seven parties obtained less than 10% of the vote and won no seats. The AKP won 34% of the vote, and due to the cut-off rule, obtained a majority of the seats (363 out of 550).
Our analysis reflects this change in the political landscape. Using DYP as the base party, from Table 3, the 2002 MNL coefficients are The -coefficient and the valences of AKP and CHP are significantly nonzero with ANAP having the lowest valence. The probability of voting ANAP, when parties locate at the mean, in (14), is
Given that and since from (53), then using (15) we find that the 2002 convergence coefficient for Turkey, in Table 4, is The political changes induced by the cut-off rule led to a higher convergence coefficient in 2002 relative to 1999 (increasing from a low of in (51) to a high of in (56)), an indication that a more fractionalized polity emerged from this reform. The convergence coefficient of the 2002 election is significantly above 2, the dimension of the policy space (see Appendix A.2), giving ANAP an incentive to locate far from the mean. ANAP’s characteristic matrix using (17) is When at the origin, indicates that ANAP is minimizing its vote share since its eigenvalues are both positive (0.090 and 3.850). This, together with the 95% confidence bounds in Appendix A.2, implies that there is a high probability that ANAP will vacate the center and that the mean is not an LNE for Turkey in 2002.
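The reported eigenvalues allow a quick consistency check between the characteristic matrix and the convergence coefficient. Assuming, as a reading of (15) and (17), that C = 2β(1 − 2ρ)∇ − I and c = 2β(1 − 2ρ) trace(∇), the trace of C equals c − w, where w is the dimension of the policy space. Since the trace is the sum of the eigenvalues, c can be recovered from them. The helper below is a sketch under that assumption; only the two eigenvalues are taken from the text.

```python
def implied_coefficient(eigenvalues, w):
    """Recover c from the eigenvalues of the characteristic matrix,
    using trace(C) = c - w (assumed forms of (15) and (17))."""
    return sum(eigenvalues) + w

# Eigenvalues of ANAP's characteristic matrix as reported in the text, with w = 2.
c_implied = implied_coefficient([0.090, 3.850], w=2)
print(f"implied convergence coefficient: {c_implied:.2f}")
```

Under these assumptions the implied point estimate lies well above 2, in line with the statement that the necessary condition for convergence fails in 2002.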
3.2.3. The 1997 Polish Election
In the 1997 Polish election, five parties won seats in the Sejm (lower house). (In this election Poland used an open-list proportional representation electoral system with a threshold of 5% of the nationwide vote for parties and 8% for electoral coalitions. Votes were translated into seats using the D’Hondt method.) Two of them, the left-wing ex-communist Democratic Left Alliance (SLD) and the agrarian Polish People’s Party (PSL), have been the most frequent governing parties in the postcommunist period. The Freedom Union (UW) and the Solidarity Election Action (AWS) grew out of the Solidarity movement: AWS combined various mostly right-wing and Christian groups under one label, while UW was formed from the liberal wing of Solidarity. The remaining party is the Movement for Reconstruction of Poland (ROP).
Applying factor analysis to questions from the Polish National Election Survey, we identified an economic dimension and a social values dimension (see ). The economic dimension is influenced by issues such as privatization versus state ownership of enterprises, fighting unemployment versus keeping inflation and government expenditure under control, proportional versus flat income tax, support versus opposition to state subsidies to agriculture, and state versus individual social responsibility. The most influential issues in the social values dimension are the separation of church and state versus the influence of church over politics, complete decommunization versus equal rights for the former nomenklatura, and abortion rights regardless of situation versus no such rights regardless of situation. The distribution of voters along these dimensions is seen in Figure 9. (See Schofield et al.  for details of the estimation.)
The covariance matrix for the 1997 Polish () election is with variance .
From Table 3, the MNL coefficients for the 1997 election are The -coefficient and valence estimates for all parties except UP and PSL are significantly nonzero. The probability of voting UPR with lowest valence, in 1997, when parties locate at the mean, in (14), is
Given that and since from (58), then using (15) the convergence coefficient for Poland, in Table 4, is Appendix A.2 shows that is significantly greater than 2 and thus fails the necessary condition for convergence to the mean. UPR’s Hessian from (17) is The trace (= 3.82), the determinant (= 5.80), and the eigenvalues of are positive. The 95% confidence bound of in Appendix A.2 also shows positive eigenvalues at the lower and upper bounds of . Thus, with a high degree of certainty UPR locates far from the origin to maximize its votes and the electoral mean is not an LNE for the 1997 Polish election.
Summarizing, in this section we examined three countries that use proportional representation. Their convergence coefficients are significantly higher than 2, the dimension of the policy space, and are also much higher than those of the US and the UK. A high convergence coefficient thus signals a high degree of political fractionalization in these multiparty parliamentary democracies.
3.3. Convergence in Anocracies
We now study elections in Georgia, Russia, and Azerbaijan. In these partial democracies or anocracies, the President/autocrat holds regular presidential and legislative elections while exerting undue influence on them. (The term “partial democracy” has been applied to new democracies lacking the full array of democratic institutions present in western democracies (see ).) Anocracies lack important democratic institutions such as freedom of the press. Autocrats hold regular elections in an attempt to give their regime legitimacy. The autocrat “buys” legitimacy by rewarding supporters and opposition members with well-paid legislative positions and by giving legislators the ability to influence policies. Opposition parties participate in elections to become known political entities. This allows them to communicate regularly with voters. Their objective is to oust the autocrat either in a future election or through popular uprisings. We assume that opposition parties maximize their vote share even when they understand that there is little chance of ousting the autocrat in the election.
3.3.1. The 2008 Georgian Election
We use the postelection survey conducted by GORBI-GALLUP International from March 19 through April 3, 2008, to build a formal model of the 2008 election in Georgia (see ). The factor analysis done on the survey questions determined that there were two dimensions describing voters’ attitudes towards democracy and the west. One dimension is strongly related to the respondents’ attitude toward the US, the EU, and NATO, with larger values in the West (-axis) dimension implying a stronger anti-western attitude. Along the democracy (-axis) dimension larger values are associated with negative judgements on the current state of democratic institutions in Georgia, coupled with a demand for more democracy. The electoral distribution along these two dimensions is given in Figure 10. The points (S, G, P, N) in Figure 10 represent the estimated positions of the four candidates: Saakashvili (S), Gachechiladze (G), Patarkatsishvili (P), and Natelashvili (N). (See Schofield et al.  for details of the estimation.)
The 2008 electoral covariance matrix in the Democracy () and West () axes is with .
From Table 5, the MNL estimates of the 2008 election with Natelashvili as the base candidate are All coefficients are significantly nonzero showing Natelashvili as having the lowest valence.
The probability that a Georgian votes for Natelashvili, when all candidates locate at the mean, is
As shown in Appendix A.3, is not significantly different from 2 and thus fails the necessary condition for convergence to the mean. Natelashvili’s Hessian or characteristic matrix, from (17), is Since the eigenvalues of are both positive (), Natelashvili’s vote share function is at a minimum when he is at the mean and he has an incentive to move to increase his vote share. This, together with the analysis of the 95% confidence intervals of in Appendix A.3, shows that with a high degree of certainty Natelashvili will locate far from the mean. This is not surprising since Georgians managed to induce three major changes in government through mass protests prior to this election. Thus, the electoral mean cannot be an LNE for the 2008 Georgian election.
3.3.2. The 2007 Russian Election
The analysis of the 2007 Russian election concentrates on four parties: the pro-Kremlin United Russia party (ER), Liberal Democratic Party (LDPR), Communist Party (CPRF), and Fair Russia (SR). Voters’ ideological preferences were measured according to two questions taken from the survey conducted by VCIOM (Russian Public Opinion Research Center) in May 2007 (see ). The first dimension gives a measure of voters’ general (dis)satisfaction (-axis). High values in this dimension correspond to negative feelings toward “justice,” “labor” and, to a lesser extent, “order,” “state,” “stability,” and “equality.” Also, those with high values on the first axis tend to feel neutral toward order, elite, West, and non-Russians. The second dimension measures the voter’s degree of economic liberalism (-axis). High values correspond to positive feelings toward “freedom,” “business,” “capitalism,” “well-being,” “success,” and “progress,” and to negative feelings toward “communism,” “socialism,” “USSR,” and related concepts. The distribution of voter preferences along these two dimensions can be seen in Figure 11. (See Schofield and Zakharov  for details of the estimation.)
The 2007 electoral covariance matrix along the (dis)satisfaction () and economic liberalism () axes is with .
From Table 5, the MNL estimates of the spatial model for Russia are Distance and all valences, except for that of the LDPR party, are significantly nonzero. When parties locate at the mean, the probability that a Russian votes for Fair Russia (SR) with lowest valence, from (14) is
Given that and since from (68), then using (15) Russia’s convergence coefficient, in Table 6, is Since is not significantly different from 2 (see Appendix A.3), the necessary condition for convergence is not met. The characteristic matrix or Hessian of Fair Russia (SR) from (17) is The eigenvalues are both negative (), implying that at this central estimate Fair Russia is maximizing its vote share and thus has no incentive to vacate the origin. This conclusion holds at the lower 95% bound of in Appendix A.3. However, at the upper bound of Fair Russia is minimizing its vote share. Given the influence that the Russian President and his party exert over the election, and given Putin’s popularity, Fair Russia seems more likely to remain at the origin. (This result, however, highlights that unexpected political events could prompt Fair Russia to move from the origin.) It is then likely that the electoral mean is an LNE for the 2007 Russian election.
3.3.3. The 2010 Election in Azerbaijan
In the 2010 election in Azerbaijan, 2,500 candidates filed applications to run, but only 690 were given permission by the electoral commission. The parties that competed in the election were the Yeni Azerbaijan Party (the party of the President, YAP), Civic Solidarity Party (VHP), Motherland Party (AVP), Azerbaijan Popular Front Party (AXCP), and Musavat (MP). Various small parties formed political blocs.
President Ilham Aliyev’s ruling Yeni Azerbaijan Party took a majority of 72 out of 125 seats. Nominally independent candidates, who were aligned with the government, received 38 seats, and 10 small opposition or quasiopposition parties took 10 seats. The Democratic Reforms party, Great Creation, the Movement for National Rebirth, Umid, Civic Welfare, Adalet (Justice), and the Popular Front of United Azerbaijan, most of which were represented in the previous parliament, won one seat apiece. Civic Solidarity retained its 3 seats and Ana Vaten kept the 2 seats it had in the previous legislature. For the first time, not a single candidate from the opposition Azerbaijan Popular Front (AXCP) or Musavat was elected.
We organized a small preelection survey of the 2010 election in Azerbaijan, allowing us to construct a model of the election (see ). For VHP and AVP, the estimation of their party positions was very sensitive to the inclusion or exclusion of one respondent. Thus, we used only the small subset of 149 voters who completed the factor analysis questions and intended to vote for YAP or the AXCP+MP coalition.
The factor analysis showed that voters were only concerned with one dimension: the “demand for democracy” with higher values being associated with voters who had a negative evaluation of the current democratic situation in Azerbaijan, who did not think that free opinion is allowed, had a low degree of trust in key national political institutions, and expected that the 2010 parliamentary election would be undemocratic. Figure 12 shows the distribution of voters and the party positions at the mean of their supporters. (See  for details of the estimation.) In this one dimensional model the variance is
The binomial logit estimates for the 2010 election with AXCP-MP as the base party, in Table 5, are All coefficients are significantly nonzero with AXCP-MP having the lowest valence. If these two parties locate at the mean, the probability that an Azerbaijani votes AXCP-MP from (14) is
Given that and since from (73), then using (15) the convergence coefficient for Azerbaijan, in Table 6, is Given that is not significantly different from 1, the dimension of the policy space (see Appendix A.3), the necessary condition for convergence is not met. The one-dimensional Hessian of AXCP-MP from (17) is Clearly, has a single positive eigenvalue, indicating that AXCP+MP is minimizing its vote share at the origin. The 95% bounds of in Appendix A.3 show that this matrix has positive eigenvalues at the lower and upper bounds of the confidence interval. Thus, with a high degree of certainty AXCP+MP will deviate from the origin and the electoral mean is not an LNE for the 2010 election in Azerbaijan.
This section illustrates that for the three anocracies that we consider the convergence coefficient does not satisfy the necessary condition for convergence to the mean. That is, these convergence coefficients are not significantly different from the dimension of the policy space. As a consequence, parties are at a knife-edge equilibrium. Under some conditions, parties converge to the mean, under others they diverge. Which equilibrium materializes depends on how popular or unpopular the President/autocrat and his party are and so depends on the valence of all parties and on how dispersed voters are in the policy space. Thus any change in valence can substantially affect party positions.
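The way the confidence bounds are used throughout Section 3 can be summarized as a simple decision rule. The sketch below is one possible reading of that usage, not the procedure of the appendix: it takes a 95% interval for the convergence coefficient and compares it with 1 (the sufficient condition) and with the dimension w of the policy space (the necessary condition). The function name, the labels, and all intervals are hypothetical.

```python
def classify_interval(lower, upper, w):
    """Compare a 95% interval (lower, upper) for c with 1 and with the dimension w."""
    if lower > w:
        # c is significantly above w.
        return "necessary condition fails"
    if upper < 1:
        # c is significantly below 1.
        return "sufficient condition met"
    if lower <= w <= upper:
        # c is not significantly different from w.
        return "knife-edge"
    # The whole interval lies below w but is not entirely below 1.
    return "necessary condition met, sufficient not established"

# Hypothetical intervals (NOT the bounds reported in the appendix): (lower, upper, w).
examples = {
    "A": (0.2, 0.6, 2),
    "B": (1.1, 1.8, 2),
    "C": (1.6, 2.9, 2),
    "D": (3.0, 5.0, 2),
    "E": (0.7, 1.9, 1),
}
labels = {name: classify_interval(*bounds) for name, bounds in examples.items()}
for name, label in labels.items():
    print(name, label)
```

Cases C and E correspond to the knife-edge situation described above, where the point estimate alone does not settle whether parties converge and the characteristic matrix at the lower and upper bounds must be examined.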
4. Convergence across Political Systems
In the previous sections we used the unifying framework of Schofield’s stochastic electoral model, outlined in Section 2, to study whether parties locate near or far from the electoral mean in countries with plurality and proportional representation systems and in anocracies. Using this framework we estimated the convergence coefficient for various elections in different countries. We now use this dimensionless coefficient to compare convergence to the electoral mean across elections, countries, and political systems, and to illustrate how the coefficient can be used to classify political systems. Table 7 summarizes the convergence coefficients that we now discuss.
As Table 7 indicates, the two countries using plurality systems (the US and the UK) studied in Section 3.1 meet the conditions for convergence to the mean. This suggests that plurality rule imposes a strong centripetal tendency that keeps parties close to the mean. Our analysis suggests that in countries with plurality systems the convergence coefficient will be low, at or below the dimension of the policy space.
Of the anocratic countries that we studied in Section 3.3, Georgia seems to have the highest convergence coefficient, given in (66), which is not significantly different from 2, suggesting that parties can diverge from the mean. (Note that prior to 2008 Georgians had already brought about three major political changes through mass popular revolt. This rebellious “tradition” may give opposition candidates the ability to position themselves away from the mean.) The convergence coefficients of all three anocracies were not significantly different from the dimension of the policy space [2 for Georgia and Russia and 1 for Azerbaijan; see (66), (71), and (76)]. These results suggest that convergence in anocracies is fragile and depends on the distribution of voters’ preferences as well as on the valences of the autocrat and the opposition parties.
The countries with proportional systems studied in Section 3.2 have convergence coefficients that are significantly above 2, the dimension of their policy space, signalling the lack of convergence of small valence parties to the electoral mean (see Table 7 and (46) for Israel, (51) and (56) for Turkey in 1999 and 2002, and (61) for Poland). Having no possibility of forming a government, these small parties maximize their vote shares by locating closer to their core supporters. Elections lead to multiparty legislatures, producing a highly fragmented party system in which coalition governments are the norm. Note that changes to the electoral process in Turkey between 1999 and 2002 forced parties to move from locating close to the mean in 1999 to diverging towards their partisan constituencies so as to increase their vote shares in 2002. These results suggest that in countries with proportional systems, with highly fragmented political parties, divergence from the mean is the norm.
We can explain the lack of convergence to the mean in proportional systems with multiparty (>3) legislatures by noting that the convergence coefficient in (15) depends on fundamental characteristics of the electorate. These characteristics include the weight given by voters to the distance to the parties’ positions, $\beta$; the electoral variance, $\sigma^2$ in (16); and the probability that a voter chooses the lowest valence party, $\rho_1$ in (14). Thus, in countries with many parties, the smallest low valence parties have little chance of receiving much support, that is, a low $\rho_1$. If, in addition, voters care a lot about policy differences (a high $\beta$) and if the electorate is very dispersed (a high $\sigma^2$), then small parties will have an incentive to move towards their core supporters and away from the mean. That is, in highly fragmented polities where voters and correspondingly parties are very dispersed, we observe high convergence coefficients.
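To make this dependence concrete, the following is a minimal Python sketch. It assumes that (14) and (15) take the form standard in the stochastic valence model, $\rho_1 = [1 + \sum_{k \neq 1} \exp(\lambda_k - \lambda_1)]^{-1}$ and $c = 2\beta(1 - 2\rho_1)\sigma^2$. The function names and all numerical inputs are hypothetical and are not estimates from any election studied here.

```python
import math

def rho_lowest(valences):
    """Probability that a voter picks the lowest-valence party when every
    party is located at the electoral mean (assumed form of (14))."""
    lam1 = min(valences)
    # The sum over all parties includes exp(0) = 1 for the lowest-valence party.
    return 1.0 / sum(math.exp(lam - lam1) for lam in valences)

def convergence_coefficient(beta, valences, sigma2):
    """Assumed form of (15): c = 2 * beta * (1 - 2 * rho_1) * sigma^2."""
    return 2.0 * beta * (1.0 - 2.0 * rho_lowest(valences)) * sigma2

# Hypothetical inputs, for illustration only.
two_party = convergence_coefficient(beta=1.0, valences=[0.0, 0.0], sigma2=1.0)
many_party = convergence_coefficient(beta=1.0, valences=[0.0, 1.0, 2.0], sigma2=1.0)
print(two_party, many_party)
```

Under these assumptions, two parties of equal valence give $\rho_1 = 1/2$ and a coefficient of zero, while adding higher-valence competitors lowers $\rho_1$ and raises the coefficient, in line with the argument above.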
In essence, Schofield’s Valence Theorem gives a simple summary statistic, the convergence coefficient, that measures the degree of fragmentation, or lack thereof, in each polity. Poland is an extreme case of this fragmentation and correspondingly has a very high convergence coefficient (see Table 7).
There are other measures of political fragmentation in the literature. The effective number of party vote strength (env) used by Laakso and Taagepera measures how many dominant parties there are in a polity in a given election. To find the env, let the Herfindahl index of the election be given by $H_v = \sum_{j=1}^{p} v_j^2$, where $v_j$ is the vote share of party $j$ for $j = 1, \ldots, p$. This Herfindahl index gives a measure of party size in an election and measures how competitive the election was. Laakso and Taagepera’s effective number of party vote strength is then the inverse of $H_v$; that is, $env = 1/H_v$. In the same way we can define the effective number of party seat strength (ens) using seat shares instead of vote shares, giving us a measure of the strength of parties in a legislature.
We calculate the env and ens for each election we consider (see Table 7) using all the parties that obtained votes in each election and excluding parties that ran in the election but got no votes. We now compare the level of fragmentation given by the env and ens with that given by the convergence coefficient for each country and each election under the three political systems that we studied.
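The two indices can be sketched in a few lines of Python. The function names and the share vectors below are hypothetical illustrations rather than data from Table 7, and rescaling the shares so that they sum to one is an assumption of this sketch.

```python
def herfindahl(shares):
    """Sum of squared shares over parties with a positive share.

    Shares are rescaled to sum to one (an assumption of this sketch), so
    raw vote or seat counts can be passed in as well.
    """
    positive = [s for s in shares if s > 0]
    total = sum(positive)
    return sum((s / total) ** 2 for s in positive)

def effective_number(shares):
    """Effective number of parties: the inverse of the Herfindahl index."""
    return 1.0 / herfindahl(shares)

# Hypothetical election: three parties win votes, only two win seats.
vote_shares = [0.40, 0.35, 0.25]
seat_shares = [0.55, 0.45, 0.0]
env = effective_number(vote_shares)
ens = effective_number(seat_shares)
print(env, ens)
```

With equal shares the index returns the raw number of parties; unequal shares pull it below that count, which is why env and ens can differ sharply when a threshold keeps vote-winning parties out of the legislature.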
We first examine countries with plurality rule. In Table 7 we see that for the US, the env and the ens at the Presidential and House levels are closely aligned. There is little variation in the env and ens indices across the three elections. According to these indices there is essentially no change in political fragmentation across these three elections. The convergence coefficient, however, rises in 2008 relative to 2000 and 2004, indicating that in 2008 the dispersion among voters was higher than in the previous two elections. For the US, the convergence coefficient provides more information than do env or ens. For the UK, the convergence coefficient shows that the electorate was more dispersed in 2010 than in 2005 (see Tables 2 and 7). This dispersion led to the first minority government since 1974, which resulted in a higher effective number of parties as measured by the env and ens. All three measures, $c$, env, and ens, indicate that the United Kingdom became more fragmented in 2010. Thus, in the countries using plurality, the convergence coefficient tends to provide more information than the env and ens numbers do, as the convergence coefficient takes into account the degree of dispersion among the electorate and the valence of parties.
Polities with high convergence coefficients (Israel, Turkey in 2002, and Poland in Table 7) had a large number of parties competing in these elections. The large number of parties obtaining votes, and thus effectively competing in the election, led to large env values. These elections produced highly fragmented legislatures, leading to very high ens values. Having a large number of effective parties competing in the election and a greater effective number of parties in the legislature does not necessarily translate into a higher convergence coefficient. The convergence coefficient is lower for Israel, with a larger number of effective parties (higher env and ens), than for Poland, with fewer parties. Changes in the Turkish electoral system between 1999 and 2002, in which a minimum cut-off rule was instituted, led to a high env but a low ens. Small parties were, however, able to gain enough votes to produce a high convergence coefficient, an indication that these parties would disperse themselves in the policy space. The env and ens values of the 2002 Turkish election show high party fragmentation but no legislative fragmentation. This shows that these three measures of fragmentation provide different information about a particular election.
The convergence coefficient suggests a way of interpreting the arguments of Duverger and Riker on the effects of proportional electoral methods on electoral outcomes: there is a strong centrifugal tendency pulling all parties away from the electoral mean towards their core constituencies. This tendency will be particularly strong for small, or low valence, parties. In particular, even small parties in such a polity can assign a nonnegligible probability to becoming a member of a coalition government, and it is this phenomenon that maintains the fragmentation of the party system. For example, in Poland no party can obtain a majority, and parties and coalitions regularly form and dissolve. In general the convergence coefficients in Poland were of the order of 6.0 in the elections of the 1990s.
For countries using proportional representation, while the env and ens give a measure of electoral and legislative dispersion, the convergence coefficient provides a measure that summarizes dispersion across voters and parties in the policy space.
In the anocratic countries studied, the convergence coefficient seems in line with the effective number of parties in presidential elections but moves in the opposite direction in parliamentary elections (see Table 7). In these countries, the convergence coefficient does not meet the necessary condition for convergence to the mean. The countries that we study show that parties could either converge to or diverge from the mean under anocracy, as the equilibrium is fragile. Changes in valences, for example, of the autocrat, or in voters’ preferences can lead small valence opposition parties to diverge from the mean and to mount popular uprisings, as happened in previous elections in Georgia or in recent Arab uprisings.
The convergence coefficient reflects information that the env and ens cannot capture, as it reflects the preferences of the electorate through the policy weight, $\beta$; the perceived ability of parties or candidates to govern, as captured by their valences, $\lambda$; and the dispersion of voters’ preferences in the policy space, $\sigma^2$. None of these is taken into account in the env and ens. Moreover, env and ens have nothing to say about the dispersion in parties’ positions relative to the mean.
The analysis carried out in this section suggests that there is an inverse relationship between the degree of fractionalization in a polity and the convergence coefficient. By our interpretation of the nature of the convergence coefficient, the convergence effect in presidential elections in the United States is stronger than in parliamentary elections in Great Britain. That is, our results suggest that democratic presidential systems have fewer parties and a low convergence coefficient. Parliamentary democracies operating under plurality rule tend to have more parties than presidential democracies and a somewhat higher convergence coefficient. Parliamentary democracies operating under proportional representation tend to have multiparty legislatures and high convergence coefficients. Anocratic countries tend to have multiple parties competing in the election but low convergence coefficients as opposition parties remain close to the electoral mean when Presidents/autocrats have high valences and diverge when they do not.
In this paper, Schofield’s Valence Theorem together with multinomial logit models of elections is used as a unifying framework to compare the convergence properties of parties across elections, countries, and political systems. We found evidence to support the hypothesis that in countries with proportional representation parties locate away from the electoral mean.
We related the convergence coefficient to the effective number of parties according to both vote (env) and seat (ens) shares and showed how it reflects the characteristics of the electorate and the political regime under which parties operate. We then compared the convergence coefficient to the fractionalization measures provided by the env and ens. The advantage of the convergence coefficient is that it is a summary statistic that incorporates the preferences of voters, the valence of parties, and the dispersion of voters and parties in the policy space.
A. Confidence Intervals
Schofield’s Valence Theorem, presented in Section 2, perfectly predicts whether parties converge to or diverge from the electoral origin. Convergence or divergence depends on the value of the convergence coefficient $c$ in (15) and on the characteristic matrix of party 1 with lowest valence, $C_1$ in (17). Both $c$ and $C_1$ depend on $\beta$ and on $\rho_1$ in (14).
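The eigenvalue check applied to the characteristic matrix can be sketched as follows for a two-dimensional policy space. The sketch assumes that (17) has the form $C_1 = 2\beta(1 - 2\rho_1)\nabla_0 - I$, where $\nabla_0$ is the electoral covariance matrix; the function names and the numbers are hypothetical and are not taken from the estimates in this paper.

```python
import math

def characteristic_matrix(beta, rho1, cov):
    """Assumed form of (17) for a 2x2 covariance matrix `cov`:
    C_1 = 2 * beta * (1 - 2 * rho_1) * cov - I."""
    k = 2.0 * beta * (1.0 - 2.0 * rho1)
    return [[k * cov[0][0] - 1.0, k * cov[0][1]],
            [k * cov[1][0], k * cov[1][1] - 1.0]]

def eigenvalues_sym2(m):
    """Eigenvalues (smaller, larger) of a symmetric 2x2 matrix in closed form."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    mid = (a + d) / 2.0
    rad = math.hypot((a - d) / 2.0, b)
    return mid - rad, mid + rad

def origin_is_local_max(beta, rho1, cov):
    """True when both eigenvalues are negative, so that the lowest-valence
    party's vote share has a strict local maximum at the electoral mean."""
    _, largest = eigenvalues_sym2(characteristic_matrix(beta, rho1, cov))
    return largest < 0.0

# Hypothetical covariance matrix and parameter values.
cov = [[1.0, 0.2], [0.2, 0.5]]
print(origin_is_local_max(0.3, 0.2, cov))  # low policy weight
print(origin_is_local_max(2.0, 0.1, cov))  # high policy weight
```

Under the assumed form, the trace of $C_1$ equals $c$ minus the dimension of the policy space, which links the eigenvalue check to the necessary condition on the convergence coefficient.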
The central estimates of $\beta$ and of $\lambda$ given by the MNL regressions depend on the sample of voters surveyed, as do $\rho_1$, $c$, and $C_1$. Thus, to make inferences from empirical models we need the 95% confidence bounds of these estimates. Using these bounds we can assert with some degree of certainty whether parties converge to or diverge from the electoral mean or whether there is a knife-edge unstable equilibrium.
To build these bounds, we could perform simulations of the election. For each simulation we could generate the values of $\beta$, $\rho_1$, $c$, and $C_1$. Repeating the simulation many times would generate their distributions, from which we could derive their 95% confidence bounds. Note that $c$ and $C_1$ increase in $\beta$ and decrease in $\rho_1$. Hence, given the electoral covariance matrix $\nabla_0$ and variance/trace $\sigma^2$ in (16) of an election, when in a simulation $\beta$ has a low value and $\rho_1$ a high one, the values of $c$ and $C_1$ are low, with the opposite being true when $\beta$ is high and $\rho_1$ is low. Since we have not performed simulations for the elections in this study, we use these features of $c$ and $C_1$ to generate our confidence bounds.
Let the superscripts $L$ and $U$ identify the lower and the upper bounds of the 95% confidence intervals of any estimate. The MNL estimation for an election gives the confidence bounds of $\beta$ and $\lambda$, $(\beta^L, \beta^U)$ and $(\lambda^L, \lambda^U)$. To estimate the bounds on $\rho_1$ in (14), $(\rho_1^L, \rho_1^U)$, we use the bounds on $\lambda$ and Taylor’s Theorem, which asserts that
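Using (15) and the bounds on $\beta$ and $\rho_1$ we build the confidence interval for the convergence coefficient as follows. In (15) use the lower bound of $\beta$, $\beta^L$, and the upper bound of $\rho_1$, $\rho_1^U$, to get the lower bound of $c$, $c^L$, and use the upper bound $\beta^U$ and the lower bound $\rho_1^L$ for the upper bound of $c$, $c^U$. The 95% confidence interval of the convergence coefficient is then $[c^L, c^U]$.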
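The construction of this interval can be sketched in Python. The sketch assumes that (15) has the form $c = 2\beta(1 - 2\rho_1)\sigma^2$; the function names, the classification labels, and the numbers are hypothetical. The classification follows the reading used in the text: a coefficient whose interval contains the dimension of the policy space is treated as a knife-edge case.

```python
def coefficient_interval(beta_lo, beta_hi, rho_lo, rho_hi, sigma2):
    """Bounds on c under the assumed form c = 2 * beta * (1 - 2 * rho_1) * sigma^2.

    c increases in beta and decreases in rho_1, so the lower bound pairs the
    low beta with the high rho_1 and the upper bound does the reverse.
    """
    c_lo = 2.0 * beta_lo * (1.0 - 2.0 * rho_hi) * sigma2
    c_hi = 2.0 * beta_hi * (1.0 - 2.0 * rho_lo) * sigma2
    return c_lo, c_hi

def classify(c_lo, c_hi, dim):
    """Compare the interval for c with the dimension of the policy space."""
    if c_hi < dim:
        return "necessary condition met"
    if c_lo > dim:
        return "divergence"
    return "knife-edge"

# Hypothetical bounds, for illustration only.
c_lo, c_hi = coefficient_interval(0.5, 1.0, 0.1, 0.2, sigma2=1.0)
print(c_lo, c_hi, classify(c_lo, c_hi, dim=1))
```

Pairing the bounds in this way yields a conservative interval, since it combines the extreme values of both estimates rather than their joint distribution.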
Following a similar procedure we estimate the bounds for $C_1$ using (17) and the corresponding bounds of $\beta$ and $\rho_1$ to get the bounds for the Hessian of the lowest valence party, $C_1^L$ and $C_1^U$.