Application of a Theorem in Stochastic Models of Elections
Previous empirical research has developed stochastic electoral models for Israel, Turkey, and other polities. The work suggests that convergence to an electoral center (often predicted by electoral models) is a nongeneric phenomenon. In an attempt to explain nonconvergence, a formal model based on intrinsic valence is presented. This theory shows that there are necessary and sufficient conditions for convergence. The necessary condition is that a convergence coefficient c is bounded above by the dimension w of the policy space, while a sufficient condition is that the coefficient is bounded above by 1. This coefficient is defined in terms of the difference in exogenous valences, the “spatial coefficient”, and the electoral variance. The theoretical model is then applied to empirical analyses of elections in the United States and Britain. These empirical models include sociodemographic valence and electoral perceptions of character traits. It is shown that the model implies convergence to positions close to the electoral origin. To explain party divergence, the model is then extended to incorporate activist valences. This extension gives a first-order balance condition that allows the party to calculate the optimal marginal condition to maximize vote share. We argue that the equilibrium positions of presidential candidates in US elections and of party leaders in British elections are principally due to the influence of activists, rather than the centripetal effect of the electorate.
1. Introduction
Electoral models based on the work of Hotelling and Downs suggest that parties will converge to an electoral center (at the electoral median) when the policy space has a single dimension. Although a pure strategy Nash equilibrium generically fails to exist in competition between two agents under majority rule in high enough dimension, there will exist mixed-strategy equilibria whose support is located near the electoral center. However, previous empirical research has developed stochastic electoral models for Argentina, Israel, Russia, Turkey, and other polities [4–10], and has suggested that divergence from the electoral center is a generic property of electoral systems.
This paper presents a formal stochastic model based on electoral valence to explain nonconvergence of candidates in the 2008 elections in the United States, and in an earlier election in Britain. The key idea is that the convergence result need not hold if there is an asymmetry in the electoral perception of the “quality” of party leaders [11, 12]. The average weight given to the perceived quality of the leader of the party is called the party's intrinsic (or exogenous) valence. In empirical models, a party's valence is assumed to be independent of the party's position, and adds to the statistical significance of the model. It is obtained from the intercept of the empirical model, and reflects a common perception of the quality of the candidate or party leader. In general, intrinsic valence reflects the overall degree to which the party or candidate is perceived to be able to govern effectively [13, 14].
We assume here that, in addition to intrinsic valence, there are three further kinds of valence. The first kind is a sociodemographic valence. Empirical models show that different subgroups in the electorate respond to leaders or candidates in different ways. These sociodemographic valences reflect the fact that particular party leaders have established specific political relationships with various political groups that are, at least in the short run, independent of the party's position.
The second type of valence is individual specific, and is defined by individual perception of the character traits of the candidates or party leaders.
The third kind of valence is called activist (or endogenous) valence. When party $j$ adopts a policy position $z_j$ in the policy space $X$, the activist valence of the party is denoted as $\mu_j(z_j)$. Implicitly we adopt a model originally due to Aldrich. In this model, activists provide crucial resources of time and money to their chosen party, and these resources are dependent on the party position. (For convenience, it is assumed that $\mu_j$ is only dependent on $z_j$, and not on $z_k$, $k \neq j$, but this is not a crucial assumption.) The party then uses these resources to enhance its image before the electorate, thus affecting its overall valence. Although activist valence is affected by party position, it does not operate in the usual way by influencing voter choice through the distance between a voter's preferred policy position, say $x_i$, and the party position. Rather, as party $j$'s activist support, $\mu_j(z_j)$, increases due to increased contributions to the party, in contrast to the support $\mu_k(z_k)$ received by party $k \neq j$, then (in the model) all voters become more likely to support party $j$ over party $k$.
However, activists are likely to be more extreme than the typical voter. By choosing a policy position to maximize activist support, the party will lose centrist voters. The party must therefore determine the “optimal marginal condition” to maximize vote share. The first result presented here gives this as a (first-order) balance condition. Moreover, because activist support is denominated in terms of time and money, it is reasonable to suppose that the activist function will exhibit decreasing returns. We point out that when these activist functions are sufficiently concave, then the model will exhibit a Nash equilibrium, where each party or political candidate adopts a position that maximizes its vote, in response to the positions adopted by the other agents.
This stochastic model is also applied to the case with intrinsic valence alone. For this model, it can be shown that the joint electoral origin satisfies the first-order condition for a Nash equilibrium. Because the vote share functions are differentiable, we make use of calculus techniques, and therefore use the notion of local Nash equilibrium (LNE). To determine whether the origin is an LNE, it is necessary to examine the Hessian of the vote share function of the political agent with lowest intrinsic valence. We thus obtain the necessary and sufficient conditions for the validity of the mean voter theorem that all agents should converge to the electoral origin. The second result gives these conditions in terms of a “convergence coefficient” incorporating all the parameters of the intrinsic valence model. This coefficient, $c$, involves the differences in the intrinsic valences of the agents, and the “spatial coefficient” $\beta$. When the policy space, $X$, is assumed to be of dimension $w$, then the necessary condition for existence of a Nash equilibrium when all agents are located at the electoral origin is that the coefficient $c$ is bounded above by $w$. When the necessary condition fails, then agents, in equilibrium, will adopt divergent positions.
In the next section we briefly sketch the nature of the local Nash equilibria in political games involving these different types of electoral valences. We focus on candidates in the 2008 US Presidential election, in order to illustrate our results. The formal models are presented in Section 3. There we formally introduce the notion of a local Nash equilibrium, and then show that the unique Nash equilibrium in the presidential campaign of 2008 should be where both candidates adopted positions very close to the electoral origin. Since the candidates did not adopt such convergent positions, we can estimate the effect of activists in this election. We then follow up with a brief analysis of the 1979 general election in Britain, and show again that the empirical model indicated convergence. Again, nonconvergence of the parties allows us to estimate the effect of activists. In the conclusion, we offer some general remarks about Madison's argument about the “probability of a fit choice.”
2. Activist Support for the Parties
The main result of this paper can be applied to analysis of the equilibrium candidate positions $(z_{dem}^*, z_{rep}^*)$ in a two-candidate game of vote maximization in a US election. It is shown here that the first-order condition is given by a balance equation. This means that, for each party $j = dem$ or $rep$, there is a weighted electoral mean for party $j$, given by the expression $z_j^{el} = \sum_i \varpi_{ij} x_i$, which is determined by the set of voter preferred points $\{x_i\}$. Notice that the coefficients $\{\varpi_{ij}\}$ for candidate $j$ will depend on the position of the other candidate, $k$. Define the centripetal marginal electoral pull for candidate $j$ at $z_j$ by $\frac{d\mathcal{E}_j^*}{dz_j}(z_j) = [z_j^{el} - z_j]$. The influence of activists on candidate $j$ is given by the marginal activist pull for party $j$, $\frac{d\mu_j}{dz_j}$.
The first-order balance equation for equilibrium is that the position $z_j^*$ for each $j$ must satisfy the gradient equation
$$\frac{d\mathcal{E}_j^*}{dz_j}(z_j^*) + \frac{1}{2\beta}\frac{d\mu_j}{dz_j}(z_j^*) = 0,$$
where $\beta$ is the spatial coefficient. The locus of points satisfying this equation is called the balance locus for the party.
To illustrate this model, consider Figure 1, which illustrates elections in the US. Our empirical analysis indicates that there are two dimensions, economic and social. Consider initial positions for the two parties on either side of, and approximately equidistant from, the origin, as in the figure. Both social conservative activists and social liberal activists would be indifferent between both parties. A Democratic candidate, by moving to position $D^*$, will benefit from activist support of the social liberals, but will lose some support from the economic liberal activists. The “contract curve” between the two activist groups represents the set of conflicting interests or “bargains” that can be made between these two groups over the policy to be followed by the candidate. In the figure, the indifference curves of the activist groups are shown to be eccentric, with economic activists much less concerned about social policy, and social activists less concerned about economic policy. Under this assumption, it can be shown that this contract curve is a catenary whose curvature is determined by the “eccentricities” of the utility functions of the activist groups. We therefore call this contract curve the Democratic activist catenary. The balance locus for the party is obtained by shifting the appropriate activist catenary towards the weighted electoral mean of the party, $z_j^{el}$. The marginal activist pull for party $j$ (at a position $z_j$) is a gradient vector, $\frac{d\mu_j}{dz_j}$, which represents the marginal effect of the activist groups on the party's valence. The gradient term $[z_j^{el} - z_j]$ is the marginal electoral pull of party $j$ (at $z_j$), and this pull is zero at $z_j = z_j^{el}$. Otherwise, it is a vector pointing towards $z_j^{el}$.
To illustrate, the pair of positions ($D^*$, $R^*$) in Figure 1 are equilibrium candidate positions that maximize each candidate's vote share.
The positioning of $R^*$ in the lower right electoral quadrant in Figure 1 and of $D^*$ in the upper left quadrant is meant to indicate the realignment that has occurred since the election victory of Kennedy over Nixon in 1960. By 1964 Lyndon Johnson had moved away from a typical New Deal Democratic position, to a position comparable to $D^*$. The long-term effect of this transformation was that by 2000, most of the southern states had become dominated by the Republican party. Empirical analysis of the 1964 election suggests that the intrinsic valence of Johnson was greater than that of Goldwater. (See [8, 16].) According to the activist model, this implies that Goldwater's dependence on activist support was greater than Johnson's. This is reflected in Figure 1, where the balance locus for Goldwater is shown to be further from the electoral origin than the balance locus for Johnson. From this we can infer the influence of activists on the two candidates, thus providing an explanation of why socially conservative activists responded so vigorously to the new Republican position adopted by Goldwater, and came to dominate the Republican primaries in support of his proposed policies. These characteristics of the balance solution appear to provide an explanation for Johnson's electoral landslide in 1964.
In this paper we shall apply the electoral model to account for the positions of Obama and McCain in the 2008 presidential election in the context of an electoral distribution, obtained from the American National Election Study (ANES). Figure 2 shows the estimated voter distribution together with these estimated candidate positions.
We first present the formal stochastic model and then give the empirical analysis of this election.
3. The Formal Stochastic Model
Details of the spatial stochastic electoral models are published in [17, 18]. This model is an extension of the standard multiparty stochastic model, modified by inducing asymmetries in terms of valence.
We define a stochastic electoral model, which utilizes sociodemographic variables and voter perceptions of character traits. For this model we assume that the utility of voter $i$ for agent $j$ is given by the expression
$$u_{ij}(x_i, z_j) = \lambda_j + \mu_j(z_j) + (\theta_j \cdot \eta_i) + (\alpha_j \cdot \tau_i) - \beta \|x_i - z_j\|^2 + \varepsilon_j = u_{ij}^*(x_i, z_j) + \varepsilon_j.$$
Here $u_{ij}^*(x_i, z_j)$ is the observable component of utility, while $\lambda = (\lambda_1, \ldots, \lambda_p)$ is the intrinsic valence vector, which we assume satisfies the ranking condition $\lambda_p \geq \lambda_{p-1} \geq \cdots \geq \lambda_2 \geq \lambda_1$.
The political agents (who may be presidential candidates, in US elections, or party leaders, as in British elections) are denoted as $P = \{1, \ldots, j, \ldots, p\}$. The points $\{x_i \in X : i \in N\}$ are the preferred policies, in a space $X$, of the $n$ voters in $N$, and $z = \{z_j \in X : j \in P\}$ are the positions, in $X$, of the agents. The term $\|x_i - z_j\|$ is simply the Euclidean distance between $x_i$ and $z_j$. The error vector $\varepsilon = (\varepsilon_1, \ldots, \varepsilon_p)$ is distributed by the type I extreme value distribution, as assumed in empirical conditional logit estimation. In empirical models, the valence vector $\lambda$ is given by the intercept term for each agent in the model. The symbol $\theta$ denotes a set of $k$-vectors $\{\theta_j : j \in P\}$ representing the effect of the $k$ different sociodemographic parameters (class, domicile, education, income, religious orientation, etc.) on voting for agent $j$, while $\eta_i$ is a $k$-vector denoting the $i$th individual's relevant “sociodemographic” characteristics. The compositions $\{(\theta_j \cdot \eta_i)\}$ are scalar products, called the sociodemographic valences for $j$.
The terms $(\alpha_j \cdot \tau_i)$ are scalars giving voter $i$'s perceptions and beliefs. These can include perceptions of the character traits of agent $j$, or beliefs about the state of the economy, and so forth. We let $\alpha = (\alpha_p, \ldots, \alpha_1)$. A trait score can be obtained by factor analysis from a set of survey questions asking respondents about the traits of the agent, including “moral”, “caring”, “knowledgeable”, “strong”, “honest”, “intelligent”, and so forth. The perception of traits can be augmented with voter perception of the state of the economy, in order to examine how anticipated changes in the economy affect each agent's electoral support.
The terms $\{\mu_j\}$ are the activist valence functions. The full model including activists is denoted as $\mathbb{M}(\lambda, \mu, \theta, \alpha, \beta)$.
Partial models are:
(i) pure sociodemographic, denoted as $\mathbb{M}(\lambda, \theta)$, with only intrinsic valence and sociodemographic variables,
(ii) pure spatial, denoted as $\mathbb{M}(\lambda, \beta)$, with only intrinsic valence and $\beta$,
(iii) joint spatial, denoted as $\mathbb{M}(\lambda, \theta, \beta)$, with intrinsic valence, sociodemographic variables and $\beta$,
(iv) joint spatial model with traits, denoted as $\mathbb{M}(\lambda, \theta, \alpha, \beta)$, without the activist components.
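In all models, the probability that voter $i$ chooses agent $j$, when agent positions are given by $z$, is
$$\rho_{ij}(z) = \Pr[u_{ij}(x_i, z_j) > u_{il}(x_i, z_l), \text{ for all } l \neq j].$$
A strict local Nash equilibrium (LNE) for a model $\mathbb{M}$ is a vector, $z$, such that each agent, $j$, chooses $z_j$ to locally strictly maximize the expected vote share
$$V_j(z) = \frac{1}{n}\sum_{i \in N} \rho_{ij}(z),$$
subject to the positions $z_{-j} = (z_1, \ldots, z_{j-1}, z_{j+1}, \ldots, z_p)$ of the other agents.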
In these models, political agents cannot know precisely how each voter will choose at the vector $z$. The stochastic component as described by the vector $\varepsilon$ is one way of modeling the degree of risk or uncertainty in the agents' calculations. Implicitly we assume that they can use polling information and the like to obtain an approximation to this stochastic model in a neighborhood of the initial candidate locations. For this reason we focus on LNE. (Halpern gives some objections to the concept of Nash equilibrium, in terms of computability and the knowledge requirement of agents, and this provides some basis for our use of LNE. In the empirical work presented below, we find that LNE and PNE coincide.) Note, however, that as agents adjust position in response to information in search of equilibrium, the empirical model may become increasingly inaccurate.
A strict Nash equilibrium (PNE) for a model $\mathbb{M}$ is a vector $z$ which globally strictly maximizes $V_j(z)$ for each agent $j$. Obviously if $z$ is not an LNE then it cannot be a PNE.
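It follows from  that, for the model $\mathbb{M}(\lambda, \mu, \theta, \alpha, \beta)$, the probability, $\rho_{ij}(z)$, that voter $i$, with ideal point $x_i$, picks $j$ at the vector, $z$, of agent positions is given by
$$\rho_{ij}(z) = \Big[1 + \sum_{k \neq j} \exp(f_{jk})\Big]^{-1}, \quad \text{where } f_{jk} = u_{ik}^*(x_i, z_k) - u_{ij}^*(x_i, z_j).$$
Thus
$$\frac{d\rho_{ij}}{dz_j} = \Big\{2\beta(x_i - z_j) + \frac{d\mu_j}{dz_j}(z_j)\Big\}[\rho_{ij} - \rho_{ij}^2].$$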
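We use this gradient equation in the form of MATLAB algorithms, given in Appendices A and B, to obtain the LNE. This equation shows that the first-order condition for $z^*$ to be an LNE is given by
$$0 = \frac{dV_j(z)}{dz_j} = \frac{1}{n}\sum_{i \in N}[\rho_{ij} - \rho_{ij}^2]\Big\{2\beta(x_i - z_j) + \frac{d\mu_j}{dz_j}(z_j)\Big\}.$$
Hence
$$z_j^* = \frac{\sum_{i}[\rho_{ij} - \rho_{ij}^2]\,x_i}{\sum_{k \in N}[\rho_{kj} - \rho_{kj}^2]} + \frac{1}{2\beta}\frac{d\mu_j}{dz_j}(z_j^*).$$
This can be written as
$$[z_j^* - z_j^{el}] = \frac{1}{2\beta}\frac{d\mu_j}{dz_j}(z_j^*), \quad \text{where } z_j^{el} = \sum_{i \in N}\varpi_{ij}x_i \text{ and } \varpi_{ij} = \frac{\rho_{ij} - \rho_{ij}^2}{\sum_{k \in N}[\rho_{kj} - \rho_{kj}^2]}.$$
Here $z_j^{el}$ is the weighted electoral mean of agent $j$. Because this model is linear, it is possible to modify these weights to take account of the differential importance of voters in different constituencies. (For example, presidential candidates may attempt to maximize total electoral votes, so voters can be weighted by the relative electoral college seats of the state they reside in.) We can therefore write the first-order balance condition at an equilibrium, $z^*$, as a set of gradient balance conditions
$$\frac{d\mathcal{E}_j^*}{dz_j}(z_j^*) + \frac{1}{2\beta}\frac{d\mu_j}{dz_j}(z_j^*) = 0 \quad \text{for each } j \in P.$$
The first term in this equation is the centripetal marginal electoral pull for agent $j$, defined at $z_j$ by $\frac{d\mathcal{E}_j^*}{dz_j}(z_j) = [z_j^{el} - z_j]$.
The second gradient term, $\frac{1}{2\beta}\frac{d\mu_j}{dz_j}(z_j)$, is the centrifugal marginal activist pull for $j$, at $z_j$.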
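The logit probability, the vote share gradient, and the balance condition above can be sketched in a few lines of Python. This is only an illustration restricted to the pure spatial case (no sociodemographic, trait, or activist terms inside the probability); it is not the MATLAB code of the appendices, and the function names, argument names, and sample numbers are assumptions introduced here.

```python
import math

def choice_probabilities(x, positions, valences, beta):
    """Logit probabilities that a voter with ideal point x picks each agent.

    Observable utility is assumed to be lambda_j - beta * ||x - z_j||^2.
    """
    utilities = [lam - beta * sum((a - b) ** 2 for a, b in zip(x, z))
                 for lam, z in zip(valences, positions)]
    top = max(utilities)  # shift by the maximum to avoid overflow in exp
    weights = [math.exp(u - top) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

def vote_share_gradient(j, voters, positions, valences, beta):
    """dV_j/dz_j = (1/n) * sum_i (rho_ij - rho_ij^2) * 2 * beta * (x_i - z_j)."""
    dim = len(positions[j])
    grad = [0.0] * dim
    for x in voters:
        rho = choice_probabilities(x, positions, valences, beta)[j]
        scale = 2.0 * beta * (rho - rho * rho) / len(voters)
        for t in range(dim):
            grad[t] += scale * (x[t] - positions[j][t])
    return grad

def implied_activist_gradient(z_star, z_el, beta):
    """Solve the balance condition for d(mu_j)/dz_j = 2 * beta * (z_j^* - z_j^el)."""
    return [2.0 * beta * (a - b) for a, b in zip(z_star, z_el)]

# Hypothetical electorate with mean zero, and two agents at the joint origin.
voters = [(-1.0, 0.0), (1.0, 0.0), (0.0, 2.0), (0.0, -2.0)]
origin = [(0.0, 0.0), (0.0, 0.0)]
gradient_at_origin = vote_share_gradient(0, voters, origin, [0.0, 1.0], 0.5)
```

With every agent at the same position the weights are equal across voters, so the gradient at the joint origin vanishes when the voter mean is zero, which is the first-order condition discussed in the text.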
To determine the LNE for the model $\mathbb{M}(\lambda, \mu, \theta, \alpha, \beta)$ it is of course necessary to consider the Hessians $\frac{d^2V_j(z)}{dz_j^2}$. These will involve the second-order terms $\frac{d^2\mu_j}{dz_j^2}$. In the next section, we suggest that there will be natural conditions under which these will be negative definite. Indeed if the eigenvalues are negative and of sufficiently large modulus, then we may expect the existence of PNE.
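For the pure spatial model, $\mathbb{M}(\lambda, \beta)$, it is clear that when the agents adopt the same positions then $\rho_{ij}$ is independent of the voter suffix, $i$. Thus all $\varpi_{ij} = \frac{1}{n}$, so that $z_j^{el} = \frac{1}{n}\sum_i x_i$ gives the first-order condition for an LNE. By a change of coordinates, it follows that $z_0 = (0, \ldots, 0)$ is a candidate for an LNE. Note however that this argument does not follow for the model $\mathbb{M}(\lambda, \theta, \beta)$, and generically $z_0$ will not satisfy the first-order condition.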
Since the valence functions are constant in the model $\mathbb{M}(\lambda, \theta, \alpha, \beta)$, the marginal effects, $\frac{d\mu_j}{dz_j}$, will be zero. However, since the weights in the weighted electoral mean for each agent will vary from one individual to another, it is necessary to simulate the model to determine the LNE $z^{el}$. Notice also that the marginal vote effect, $\frac{d\rho_{ij}}{dz_j}$, for a voter with $\rho_{ij}$ close to 1 will be close to zero. Thus in searching for LNE, each agent will seek voters with $\rho_{ij} < 1$.
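The necessary and sufficient second-order condition for LNE at $z_0$ in the pure spatial model, $\mathbb{M}(\lambda, \beta)$, is determined as follows. When all agents are at the electoral origin, and agent 1 is, by definition, the lowest valence agent, then the probability that a generic voter picks agent 1 is given by
$$\rho_1 = \Big[1 + \sum_{k \neq 1}\exp(\lambda_k - \lambda_1)\Big]^{-1}.$$
To compute the Hessian of agent 1, we proceed as follows:
$$\frac{d^2\rho_{i1}}{dz_1^2} = [\rho_{i1} - \rho_{i1}^2]\big\{4\beta^2[1 - 2\rho_{i1}](x_i - z_1)(x_i - z_1)^T - 2\beta I\big\}.$$
Here $I$ is the $w$ by $w$ identity matrix, and we use $(x_i - z_1)$ to denote a column vector, with transpose $(x_i - z_1)^T$. When all agents are at the same position, then $\rho_{i1} = \rho_1$ is independent of $i$. Moreover, $\nabla_0 = \frac{1}{n}\sum_i x_i x_i^T$ is the $w$ by $w$ covariance matrix of the distribution of voter ideal points, taken about the electoral origin. Thus the Hessian of the vote share function of agent 1 at $z_0$ is given by
$$\frac{1}{n}\sum_{i}\frac{d^2\rho_{i1}}{dz_1^2} = [\rho_1 - \rho_1^2]\big\{4\beta^2[1 - 2\rho_1]\nabla_0 - 2\beta I\big\}.$$
Since $2\beta[\rho_1 - \rho_1^2] > 0$, this Hessian can be identified with the $w$ by $w$ characteristic matrix for agent 1, given by
$$C_1 = 2\beta[1 - 2\rho_1]\nabla_0 - I.$$
Then the necessary and sufficient second-order condition for LNE at $z_0$ is that $C_1$ has negative eigenvalues. (For convenience we focus on a strict local equilibrium associated with negative eigenvalues of the Hessian.)
It follows from this that a necessary condition for $z_0$ to be an LNE is that the trace of the matrix $C_1$ is strictly negative. (For a weak LNE we require the trace to be nonpositive.) In turn this means that a convergence coefficient, $c$, defined by $c = 2\beta[1 - 2\rho_1]\sigma^2$, satisfies the critical convergence condition $c < w$. Here $\sigma^2 = \mathrm{trace}(\nabla_0)$ is the sum of the variance terms on all axes.
A sufficient condition for convergence to $z_0$ in the two-dimensional case is that $c < 1$.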
When the necessary condition fails, then the lowest valence agent has a best response that diverges from the origin. In this case there is no guarantee of existence of a PNE.
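As a minimal sketch of how the convergence coefficient might be evaluated, the following Python functions compute the probability for the lowest valence agent at the joint origin and then the coefficient $c = 2\beta[1 - 2\rho_1]\sigma^2$. The function names and the sample valences, spatial coefficient, and variances are hypothetical and are not estimates from the paper.

```python
import math

def lowest_valence_probability(valences):
    """rho_1 at the joint origin: 1 / (1 + sum_{k != 1} exp(lambda_k - lambda_1))."""
    lam1 = min(valences)
    # The k = 1 term contributes exp(0) = 1, which supplies the leading 1.
    return 1.0 / sum(math.exp(lam - lam1) for lam in valences)

def convergence_coefficient(valences, beta, axis_variances):
    """c = 2 * beta * (1 - 2 * rho_1) * sigma^2, with sigma^2 the summed variances."""
    rho1 = lowest_valence_probability(valences)
    return 2.0 * beta * (1.0 - 2.0 * rho1) * sum(axis_variances)

def convergence_status(c, dimension):
    """Classify c against the sufficient (c < 1) and necessary (c < w) conditions."""
    if c < 1.0:
        return "sufficient condition holds"
    if c < dimension:
        return "necessary condition holds; inspect eigenvalues"
    return "necessary condition fails"

# Hypothetical two-agent example in two dimensions.
c_example = convergence_coefficient([0.0, math.log(3.0)], 1.0, [0.5, 0.5])
status = convergence_status(c_example, 2)
```

When the coefficient falls between 1 and the dimension, neither condition settles the question, and the eigenvalues of the characteristic matrix have to be examined directly.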
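We can also consider a model where we use different coefficients on the axes, so the spatial component has the form $\sum_{t=1}^{w}\beta_t(x_{it} - z_{jt})^2$. Then the characteristic matrix can be taken to be $C_1 = 2[1 - 2\rho_1]\nabla_0^{\beta} - B$, where $B$ is the diagonal matrix of the coefficients $\{\beta_t\}$, while $\nabla_0^{\beta} = B\nabla_0 B$ is the covariance matrix where each axis is weighted by the coefficients $\{\beta_t\}$. The necessary condition is thus that $\mathrm{trace}(C_1) < 0$.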
Because the model is linear, we can obtain a similar result where there are multiple electoral groups, each weighting the axes differently.
In the empirical analyses, we used Newton's method with gradient information to compute best responses, in order to determine LNE in the various models.
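The search for a candidate LNE can be sketched as follows. This sketch deliberately swaps Newton's method for a simpler fixed-point iteration: in the pure spatial model the first-order condition says that each agent sits at its weighted electoral mean, so each agent is repeatedly moved to that mean. It is not the authors' algorithm, all names and numbers are hypothetical, and a point found this way satisfies only the first-order condition, so the Hessian still has to be checked.

```python
import math

def prob_for_agent(x, j, positions, valences, beta):
    """Pure spatial logit probability that a voter at x picks agent j."""
    utils = [lam - beta * sum((a - b) ** 2 for a, b in zip(x, z))
             for lam, z in zip(valences, positions)]
    top = max(utils)
    weights = [math.exp(u - top) for u in utils]
    return weights[j] / sum(weights)

def weighted_mean_update(j, voters, positions, valences, beta):
    """Move agent j to sum_i w_ij * x_i, with w_ij proportional to rho - rho^2."""
    raw = []
    for x in voters:
        rho = prob_for_agent(x, j, positions, valences, beta)
        raw.append(rho - rho * rho)
    total = sum(raw)
    return [sum(r * x[t] for r, x in zip(raw, voters)) / total
            for t in range(len(voters[0]))]

def find_candidate_lne(voters, start, valences, beta, sweeps=200, tol=1e-12):
    """Sweep over the agents until no position moves by more than tol."""
    positions = [list(z) for z in start]
    for _ in range(sweeps):
        largest_move = 0.0
        for j in range(len(positions)):
            new = weighted_mean_update(j, voters, positions, valences, beta)
            move = max(abs(a - b) for a, b in zip(new, positions[j]))
            largest_move = max(largest_move, move)
            positions[j] = new
        if largest_move < tol:
            break
    return positions

# Hypothetical symmetric electorate; both agents start away from the origin.
voters = [(-1.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
solution = find_candidate_lne(voters, [(0.5, 0.3), (-0.4, 0.2)], [0.0, 0.5], 0.2)
```

In this small example the convergence coefficient is well below 1, and the iteration pulls both agents to the electoral origin.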
3.1. Application to the Case with Multiple Activist Groups
We adapt the model presented by Schofield and Cataife, where there are multiple activist groups for each party.
(i) For each agent, $j$, let $\{A_j\}$ be a family of potential activists, where each $k \in A_j$ is endowed with a utility function, $U_k$, which is a function of the position $z_j$. The resources allocated to $j$ by $k$ are denoted as $R_{jk}(U_k(z_j))$. The total activist valence function for agent $j$ is the linear combination $\mu_j(z_j) = \sum_{k \in A_j}\mu_{jk}(R_{jk}(U_k(z_j)))$, where $\{\mu_{jk}\}$ are functions of the contributions $\{R_{jk}(U_k(z_j))\}$, and each $\mu_{jk}$ is a concave function of $R_{jk}$.
(ii) Assume that the gradients of the valence functions for $j$ are given by $\frac{d\mu_{jk}}{dz_j} = a_k\frac{dU_k}{dz_j}$, where the coefficients $a_k > 0$ are differentiable functions of $z_j$.
(iii) Under these assumptions, the first-order equation $\frac{d\mu_j}{dz_j} = 0$ becomes $\sum_{k \in A_j} a_k\frac{dU_k}{dz_j} = 0$.
The Contract Curve generated by the family $\{A_j\}$ is the locus of points satisfying the gradient equation $\sum_{k \in A_j} a_k\frac{dU_k}{dz_j} = 0$ for some coefficients $a_k > 0$.
The Balance Locus for agent $j$ defined by the family $\{A_j\}$ is the solution to the first-order gradient equation $[z_j^{el} - z_j^*] + \frac{1}{2\beta}\big[\sum_{k \in A_j} a_k\frac{dU_k}{dz_j}(z_j^*)\big] = 0$. The simplest case, discussed in , is in two dimensions, where each agent has two supporting activist groups. In this case, the contract curve for each agent's supporters will, generically, be a one-dimensional arc. Miller and Schofield also supposed that the activist utility functions were ellipsoidal, mirroring differing saliences on the two axes. As discussed earlier, in this case the contract curves would be catenaries, and the balance locus would be a one-dimensional arc. The balance solution for each agent naturally depends on the position(s) of opposed agent(s), and on the coefficients, as indicated above, of the various activists. The balance solution can be determined by computing the vote share Hessian along the balance locus.
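For two activist groups, a point on the contract curve can be computed directly once a functional form is assumed. The sketch below assumes ellipsoidal utilities of the form $U_k(z) = -\sum_t s_{kt}(z_t - p_{kt})^2$, where the ideal points $p_k$ and saliences $s_k$ are hypothetical, and solves the gradient equation axis by axis for a given weight on the first group. The names and numbers are illustrative assumptions rather than quantities from the paper.

```python
def contract_curve_point(weight, ideal_a, saliences_a, ideal_b, saliences_b):
    """Point where weight * grad(U_a) + (1 - weight) * grad(U_b) = 0.

    Assumes U_k(z) = -sum_t s_kt * (z_t - ideal_kt)^2 with positive saliences,
    so the gradient equation is linear and separates across the axes.
    """
    point = []
    for pa, sa, pb, sb in zip(ideal_a, saliences_a, ideal_b, saliences_b):
        wa = weight * sa
        wb = (1.0 - weight) * sb
        point.append((wa * pa + wb * pb) / (wa + wb))
    return point

# Hypothetical groups: group A cares mostly about the first axis,
# group B mostly about the second.
ideal_a, ideal_b = (1.0, 0.0), (0.0, 1.0)
curve = [contract_curve_point(t / 10.0, ideal_a, (4.0, 1.0), ideal_b, (1.0, 4.0))
         for t in range(11)]
```

Sweeping the weight from 0 to 1 traces the arc between the two ideal points. With equal saliences the arc is the straight segment joining them, while eccentric saliences bend it, which matches the role the text gives to the “eccentricities” of the activist utilities.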
Since the activist valence function for agent $j$ depends on the resources contributed by the various activist groups to this agent, we may expect the marginal effect of these resources to exhibit diminishing returns. Thus the activist valence functions can be expected to be concave in the activist resources, so that the Hessian of the overall activist valence, $\frac{d^2\mu_j}{dz_j^2}$, can be expected to have negative eigenvalues. When the activist functions are sufficiently concave (in the sense that the Hessians have negative eigenvalues of sufficiently large modulus), then we may infer not only that the LNE will exist, but that they will be PNE.
If we associate the utilities $\{U_k\}$ with leaders of the activist groups for the agents, then the combination $\sum_{k \in A_j} a_k\frac{dU_k}{dz_j}$ may be interpreted as the marginal utility of the candidate of party $j$, induced by the activist support.
To see this, suppose that each agent $j$ were to maximize the function $\delta_j\mu_j(z_j) + \frac{1}{n}\sum_{i}\rho_{ij}(z)$, where $\mu_j$ is no longer an activist function, but a policy-determined component of the agent's utility function, while $\delta_j$ is the weight given to the policy preference. See  for such a model of policy-motivated agents. Then the first-order condition is almost precisely as we obtained above, namely $[z_j^{el} - z_j^*] + \frac{\delta_j}{2\beta s_j}\frac{d\mu_j}{dz_j}(z_j^*) = 0$, where $s_j = \frac{1}{n}\sum_i[\rho_{ij} - \rho_{ij}^2]$.
Here $\frac{d\mu_j}{dz_j}$ is a gradient pointing towards the policy preferred position of the agent. Thus we can make the identification $\frac{\delta_j}{s_j}\frac{d\mu_j}{dz_j} = \sum_{k \in A_j} a_k\frac{dU_k}{dz_j}$ and infer that the agent's marginal policy preference can be identified with a combination of the marginal preferences of the party activists. In principle such a model could be used to determine optimal resource-raising strategies in an environment as complex as a presidential election.
4. Methodology: A Spatial Model of the 2008 Election
The 2008 American National Election Study (ANES) introduced many new questions on political issues in addition to the existing set. Assignment of respondents into the “new” or “old” set was random, with 1,059 respondents assigned to the “new” condition and having completed the follow-up post-election interview. Due to both Hispanic and African-American voter oversampling and follow-up attrition, the post-election weights are used for all analyses. As with all survey data, there were missing data for most of the survey items used in this study (varying from 0 to 8.6% by item). We used multiple imputation to correct for missing data.
The post-election interviews asked respondents whom they voted for, if at all. Since we use a conditional logit model, which requires data for both respondents (which we have) and candidates (which we only have for the major party candidates), we deleted 7 observations where respondents claimed to have voted for a presidential candidate other than McCain or Obama. The final sample size was thus 788 respondents.
To create the two-dimensional policy space, 29 survey items were selected to broadly represent the economic and social policy dimensions of American political ideology (see Appendix C for question wording). Some issues were overrepresented amongst these items, with seven questions about abortion, four each for gay rights and for policies concerning aid for African-Americans, and two about immigration issues. To avoid the policy space measure becoming dominated by these issues, with abortion a particular concern, separate scales were estimated for each of these policy areas, using either confirmatory factor analysis or, in the case of the two immigration items, a simple average. (See Tables 1, 2, and 3.) Finally, a confirmatory factor analysis was run using these four scales in conjunction with the remaining 12 survey items. Only two factors achieved eigenvalues greater than one. Each factor corresponded closely to a priori conceptualizations of economic and social policy, with the possible exceptions of the equality and gun access items, which loaded more strongly on the economic than on the social dimension. (See Table 4 for factor loadings.) These factor scores were used as measures of individual locations on the policy space.
The ANES also includes questions on seven qualities or traits associated with Obama and McCain. Confirmatory factor analysis run on the 14 items produced a two-factor solution which corresponded perfectly with the named candidate. The resulting factor scores were used as estimates of voter perceptions of each candidate's personal traits. (See Table 5.)
Respondents were coded as activists if they claimed to have donated money to a candidate or party and nonactivists if they donated money to no candidate. Table 6 gives the descriptive data for activists and nonactivists.
The survey also gave data on whether the respondent was African-American, female, working class, from the South, as well as the number of years of education and level of income. These data were used to construct the sociodemographic models of voting.
To calculate the presidential candidate positions, we took advantage of new survey questions which asked respondents to locate the positions of Obama and McCain on seven distinct issues.
These seven questions (government spending, universal health care, citizenship for immigrants, abortion when nonfatal, abortion when gender incorrect, aid to blacks, and liberal-conservative) were otherwise worded the same as the corresponding items from the policy issue questions.
We ran two linear regression models on the voter economic policy and social policy factor scores using only the seven policy items corresponding in wording to the seven candidate location items as predictors. The estimated coefficients from these two linear models enabled us to construct equations to map the data from the candidate location questions onto the complete voter policy space. These equations were able to predict the scores of the voter policy space fairly accurately. The coefficients of determination ($R^2$) for the economic and social policy equations were 0.63 and 0.75, respectively. To find McCain's ideal point, we simply took the average response for each of his seven candidate location questions, entered these into the economic and social policy prediction equations, and used the corresponding predicted values. We then repeated the process using Obama's candidate location questions. See Table 7 for the estimated positions of the two candidates.
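The mapping step described above amounts to averaging each candidate location item across respondents and feeding those averages into a fitted linear equation. A minimal sketch follows, in which the intercept, coefficients, and responses are made-up placeholders rather than the paper's estimates, and the function names are assumptions.

```python
def column_means(rows):
    """Average each survey item (column) across respondents (rows)."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def predict_position(mean_responses, intercept, coefficients):
    """Evaluate one fitted linear equation at the averaged item responses."""
    return intercept + sum(c * r for c, r in zip(coefficients, mean_responses))

# Hypothetical data: three respondents placing one candidate on two items.
placements = [[1.0, 2.0], [3.0, 4.0], [2.0, 3.0]]
item_means = column_means(placements)
# Hypothetical coefficients for one policy dimension.
economic_score = predict_position(item_means, 0.5, [0.1, -0.2])
```

The same call would be repeated with the coefficients of the second regression to obtain the candidate's coordinate on the other policy dimension.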
Figure 2 previously gave a plot of the voter distribution, while Figure 3 shows the perspective plot of the voter distribution. The plots of the activist positions are shown in Figure 4. Finally, Figure 5 gives a smoothed contour plot of the probability density function of the voter distribution. (The outer contour line is at the 0.05 level, while Democrat activists are denoted in red, and Republicans in blue.)
Figure 5 also shows the estimated threshold dividing likely Democrat candidate voters from Republican candidate voters. This partisan cleavage line was derived from a binomial logit model, designed to test the effects of each policy dimension on vote choice. We call this the pure positional binomial logit model.
According to the positional model, a voter $i$ with preferred position $(x_i, y_i)$ is estimated to vote Republican with probability
$$\rho_{rep} = \frac{\exp(a + b x_i + d y_i)}{1 + \exp(a + b x_i + d y_i)},$$
where $a$ is the estimated intercept and $b$ and $d$ are the estimated coefficients on the two policy dimensions.
That is, any voter with preferred point lying on the cleavage line has equal probability of picking one or other of the candidates. This cleavage line is given by the equation $a + b x + d y = 0$. This cleavage line misses the origin, and goes through the point , indicating the valence advantage of Obama. The coefficient $a$ is a measure of the negative relative valence of McCain with respect to Obama. This cleavage line is similar to those obtained by Schofield et al.
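A short sketch of the cleavage line calculation, assuming the binomial logit takes the standard form with an intercept `a` and coefficients `b` and `d` on the two policy axes. The coefficient values below are invented for illustration, are not the paper's estimates, and the assignment of the axes to `x` and `y` is likewise an assumption.

```python
import math

def republican_vote_probability(x, y, a, b, d):
    """Binomial logit probability exp(t) / (1 + exp(t)) with t = a + b*x + d*y."""
    return 1.0 / (1.0 + math.exp(-(a + b * x + d * y)))

def cleavage_line_y(x, a, b, d):
    """Second-axis value on the cleavage line a + b*x + d*y = 0 (requires d != 0)."""
    return -(a + b * x) / d

# Hypothetical coefficients with a negative intercept.
a, b, d = -1.0, 1.5, 1.2
y_on_line = cleavage_line_y(0.4, a, b, d)
p_on_line = republican_vote_probability(0.4, y_on_line, a, b, d)
p_at_origin = republican_vote_probability(0.0, 0.0, a, b, d)
```

A negative intercept puts the origin on the side of the line where the probability is below one half, which is how the text reads the intercept as a relative valence disadvantage.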
The positional model does not allow us to estimate equilibria, so we now turn to the pure spatial model.
4.1. Computation of Equilibria for the US 2008 Election
As above, we first assume that the utility of voter $i$ for candidate $j$ is given by the pure spatial model $u_{ij}(x_i, z_j) = \lambda_j - \beta\|x_i - z_j\|^2 + \varepsilon_j$.
We assume that each candidate, $j$, chooses $z_j$ to locally strictly maximize the expected vote share $V_j(z) = \frac{1}{n}\sum_{i}\rho_{ij}(z)$, subject to the position(s) of the other candidates. We essentially assume therefore that candidates cannot know precisely how voters choose but they can estimate the relationship between their own position, that of the competing candidate, and the aggregate vote total. As we shall see, the induced candidate preference correspondences are convex valued, indicating existence of Nash equilibria. The local pure strategy Nash equilibria (LNE) can be computed as follows.
The electoral covariance matrix for the sample is given by
The principal component of the electoral distribution is given by the vector with variance , while the minor component is given by the orthogonal eigenvector with variance .
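For a two-dimensional electorate the principal and minor components can be obtained in closed form from the covariance matrix. The sketch below uses a made-up symmetric matrix, not the sample covariance from the ANES data, and the function name is an assumption.

```python
import math

def principal_components_2x2(cov):
    """Eigenvalues and unit eigenvectors of a symmetric 2 x 2 matrix.

    Returns ((major_variance, major_vector), (minor_variance, minor_vector)).
    """
    (a, b), (_, d) = cov
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    major, minor = mean + radius, mean - radius
    if b != 0.0:
        vec = (b, major - a)  # satisfies (A - major * I) vec = 0
    elif a >= d:
        vec = (1.0, 0.0)
    else:
        vec = (0.0, 1.0)
    norm = math.hypot(vec[0], vec[1])
    v1 = (vec[0] / norm, vec[1] / norm)
    v2 = (-v1[1], v1[0])  # orthogonal direction carries the minor variance
    return (major, v1), (minor, v2)

# Hypothetical covariance matrix.
(major_var, major_vec), (minor_var, minor_vec) = principal_components_2x2(
    [[2.0, 1.0], [1.0, 2.0]])
```

The two variances sum to the trace of the covariance matrix, which is the total electoral variance that enters the convergence coefficient.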
All models in Table 8 are given with Obama as the base, so the results give the estimations of the probability of voting for McCain. The table also shows the loglikelihood, Akaike information criterion (AIC), and Bayesian information criterion (BIC) for the various models. Model () in Table 8 shows the coefficients for the -spatial conditional logit model in 2008 to be
These parameters are estimated when the candidates are located at the estimated positions. We assume that the parameters of the model remain close to these values as we modify the candidates' positions in order to determine the equilibria of the model.
According to the model $\mathbb{M}(\lambda, \beta)$, the probability that a voter chooses McCain, when the McCain and Obama positions are at the electoral origin, $z_0$, is $\rho_{McCain} = [1 + \exp(\lambda_{Obama} - \lambda_{McCain})]^{-1}$.
The characteristic matrix (essentially the Hessian of McCain's vote function at $z_0$) is $C_{McCain} = 2\beta[1 - 2\rho_{McCain}]\nabla_0 - I$, where $\nabla_0$ is the electoral covariance matrix and $I$ is the 2 by 2 identity matrix.
The “convergence coefficient” is $c = 2\beta[1 - 2\rho_{McCain}]\,\mathrm{trace}(\nabla_0)$. The sufficient condition for convergence to $z_0$ is that $c < 1$. Our estimate for $c$ exceeds this critical value for convergence. However, the necessary condition $c < 2$ is satisfied, and the determinant of $C_{McCain}$ is positive, while the trace is negative. Thus both of the eigenvalues of $C_{McCain}$ are negative, and the origin is a maximum of McCain's vote share function. The best response functions of the candidates are well behaved, so the LNE is a PNE.
We also considered a spatial model where the two axes had different coefficients, estimated to be , . Again, the determinant was found to be positive and the trace negative, so the origin is also a maximum of McCain's vote share function for this model. Simulation of these models confirmed that the joint origin was an LNE.
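The trace and determinant test used above can be sketched for a generic two-dimensional case. A symmetric 2 by 2 matrix has two negative eigenvalues exactly when its trace is negative and its determinant is positive. The inputs below are hypothetical and are not the estimates reported in Table 8; the function names are assumptions.

```python
def characteristic_matrix(beta, rho1, cov):
    """C_1 = 2 * beta * (1 - 2 * rho_1) * cov - I for a 2 x 2 covariance matrix."""
    scale = 2.0 * beta * (1.0 - 2.0 * rho1)
    return [[scale * cov[0][0] - 1.0, scale * cov[0][1]],
            [scale * cov[1][0], scale * cov[1][1] - 1.0]]

def origin_is_strict_local_maximum(c_matrix):
    """Both eigenvalues are negative iff the trace is < 0 and the determinant > 0."""
    trace = c_matrix[0][0] + c_matrix[1][1]
    det = c_matrix[0][0] * c_matrix[1][1] - c_matrix[0][1] * c_matrix[1][0]
    return trace < 0.0 and det > 0.0

# Hypothetical covariance matrix and lowest-valence probability.
cov = [[1.0, 0.2], [0.2, 0.8]]
weak_spatial = origin_is_strict_local_maximum(characteristic_matrix(0.5, 0.25, cov))
strong_spatial = origin_is_strict_local_maximum(characteristic_matrix(2.0, 0.25, cov))
```

Raising the spatial coefficient while holding everything else fixed eventually makes the trace positive, at which point the origin stops being a local maximum for the lowest valence agent.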
We now turn to the models with traits and sociodemographics. Table 8 also gives the various spatial models with these additional valences.
Comparison of the loglikelihoods for the pure spatial model and the model with traits shows that the perception of character traits is important for the statistical significance of the model. (We use the Bayes' factor, here the difference in loglikelihoods, as a measure of the statistical difference between two models.) For example, the spatial model with traits has a very large Bayes' factor of 114 over the pure traits model, while the spatial model with traits and sociodemographics has a Bayes' factor of 150 over the traits model.
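The model comparison statistics mentioned in this section can be computed from a fitted model's loglikelihood. The sketch below uses the standard definitions of AIC and BIC and, following the convention stated above, reports the difference in loglikelihoods as the comparison measure. The function names and the numbers in the example are hypothetical and are not taken from Table 8.

```python
import math

def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2.0 * n_params - 2.0 * loglik

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_obs) - 2.0 * loglik

def loglik_difference(loglik_a, loglik_b):
    """Gain in loglikelihood of model A over model B."""
    return loglik_a - loglik_b

# Hypothetical fits: model A has 3 parameters, model B has 2, with 500 respondents.
gain = loglik_difference(-200.0, -250.0)
aic_a, aic_b = aic(-200.0, 3), aic(-250.0, 2)
bic_a, bic_b = bic(-200.0, 3, 500), bic(-250.0, 2, 500)
```

Lower AIC and BIC values indicate the preferred model; BIC penalises additional parameters more heavily once the sample has more than about eight observations.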
Like the pure spatial model, the induced preference correspondences in the joint model with sociodemographic valences are all convex valued, indicating existence of a PNE. Simulation of the spatial model with sociodemographic valences showed that the PNE was one where both candidates adopt the origin. Although the sociodemographic valences add significance to the model, they do not affect the equilibrium positions. On the other hand, simulation of the full model with traits showed that the PNE was one where the candidates adopted the positions and .
This equilibrium is only a slight perturbation from the joint origin. We can infer that though the traits add to the statistical significance of the stochastic model they do not significantly affect the equilibrium. Figures 6, 7, and 8 show the relationship of the perception of Obama and McCain traits. Figure 6 shows there is a slight negative correlation between these perceptions, while Figures 7 and 8 suggest that there are correlations between perceptions of candidate traits and vote choice. These weak correlations have only a slight effect on the strong convergence induced by the electoral pull.
We can therefore write , since the joint model with traits has no activist valence terms. The argument of Section 3 implies that can be interpreted as the vector of “weighted electoral means” in a full model with activists. Assuming that the estimated candidate positions are in equilibrium with respect to the activist model, then by the balance condition, as given above, we obtain: Here is the pair of direction gradients, induced by activist preferences, acting on the two candidates. The difference between and thus provides an estimate of the activist pull on the two candidates. In this election, we estimate that activists pull the two candidates into opposed quadrants of the policy space. The estimated distributions of activist positions for the two parties, in these two opposed quadrants (as given in Figure 4), are compatible with this inference. The means of these activist positions are:
Miller and Schofield [16, 22] propose a model where activists have eccentric utility functions. If we assume that the Democrat activists tend to be more concerned with social policy and Republican activists with economic policy, then we have an explanation for the candidate shifts from the estimated equilibrium. Note in particular that the distribution of activist positions for the two parties looks very different from the voter positions. The latter is much more heavily concentrated near the electoral origin, while the former tends to be dispersed.
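The estimate of activist pull used above is simply the difference between a candidate's estimated position and that candidate's position in the simulated electoral equilibrium. A minimal sketch, with hypothetical function names and placeholder coordinates rather than the estimated ones:

```python
def activist_pull(estimated_position, electoral_equilibrium):
    """Componentwise difference: estimated position minus simulated equilibrium."""
    return tuple(e - q for e, q in zip(estimated_position, electoral_equilibrium))

def in_opposed_quadrants(pull_a, pull_b):
    """True if the two pulls have opposite signs on every axis."""
    return all(a * b < 0 for a, b in zip(pull_a, pull_b))

# Hypothetical coordinates (economic axis, social axis), for illustration only.
pull_dem = activist_pull((-0.2, 1.1), (0.05, -0.03))
pull_rep = activist_pull((0.6, -0.9), (0.10, -0.07))
opposed = in_opposed_quadrants(pull_dem, pull_rep)
```

With these placeholder values the two pulls point into opposite quadrants, which is the qualitative pattern reported for the two candidates.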
When the candidates are at their estimated positions, the estimated vote shares, according to the traits model, are Since the actual vote shares are it appears that the trait model may give a statistically plausible account for voter choice, but it does not provide, by itself, a good model of how candidates obtain votes. We suggest that the missing characteristic of this model of the election is due to the contributions of party activists.
Indeed, we suggest that the addition of activists to the model can account for the difference between the convergent equilibrium positions and the divergent estimated candidate positions, as obtained by Enelow and Hinich and by Poole and Rosenthal, respectively, in their various analyses of US elections.
The section on the formal model presented an extension where there are many activists for each candidate. This model suggests that the activist pulls on the two candidates will be particularly influenced by those activists who have more extreme policy preferences. This inference is corroborated by the above analysis, since it appears that the Democratic activists are more concerned with social policy, while the Republican activists are more concerned with economic policy.
Since the above equation is obtained from a first-order gradient condition, as shown in Section 3, we could also interpret as the gradient obtained from a model where candidates have policy preferences derived from utility functions. Duggan and Fey have explored such a model for the case of a deterministic vote model, and obtained symmetry conditions for equilibrium. However, in such a model of policy-seeking candidates, a candidate must be willing to adopt a losing position because of strong preferences for particular policies. In the activist model presented here, candidates act as though they have policy preferences, but these are induced from activist preferences, and are compatible with vote-maximizing strategies by the candidates.
5. The Election in Britain in 1979
Figure 9 shows the estimated positions of the three major parties in Britain in 1979, as obtained by Quinn et al., with the electoral distribution obtained from the Eurobarometer survey data and the party positions obtained from the Middle Level Elites Study. Tables 9(a) and 9(b) give the election results for five parties in Great Britain and five in Northern Ireland.
Using the pure spatial model as presented in Table 10 for just three parties in Great Britain, the coefficients are
When all parties are located at the origin, the model suggests that the Liberals would gain just under 10% of the vote. In fact, in 1979 they gained 13.8%. The model suggests that the divergence of the two major parties from the origin allowed the Liberals to gain a further 4% of the vote. Since the electoral variance is on the first (economic) axis and on the second axis, with negligible covariance, we obtain
Both eigenvalues are clearly negative.
The “convergence coefficient” is The pure spatial model of the 1979 election in Britain implies that the electoral joint origin is a vote share maximizing equilibrium. We next consider a joint multinomial conditional logit model, with the sociodemographic variables used by Quinn et al. (These variables are denoted income, religion (relig), manual labor (manlab), size of town (stown), and education (educ), respectively, in Table 11.)
As Table 11 makes clear, only the group specific valence was statistically significant at the 1% level. The -spatial coefficient was also significant at the 1% level. The loglikelihoods of the joint and pure spatial models were very similar, and the Bayes' factor of the joint models over the pure spatial model was
According to the joint model, the weighted electoral mean for the Labour Party should give greater weight to those voters who are manual laborers. Since these voters will tend to have preferred positions on the left of the economic dimension, we may infer that the Labour Party activists will be positioned on the left of the economic dimension. However, the simulated LNE in the joint model was found to be the joint origin.
Thus the impact of the sociodemographic valences on the simulated equilibrium is insignificant. Although these valences are useful in modeling the voting behavior of the electorate, they appear to have little effect on the policy positioning of the parties. If we assume that the party positions in Figure 9 are the LNE in the full activist model, then we obtain
As in the analysis of the United States, we find that the overall effect of the activist groups on the two major parties is to pull these parties apart. This leaves the Liberals in the center. With low valence, they gain only about 14% of the vote.
6. Concluding Remarks
Valence, whether intrinsic or based on electoral perceptions of character traits, is intended to model that component of voting which is determined by the judgments of the citizens. In this respect, the formal stochastic valence model provides a framework for interpreting Madison's argument in Federalist X over the nature of the choice of Chief Magistrate in the Republic. Schofield has suggested that Madison's argument may well have been influenced by Condorcet's work on the so-called “Jury Theorem”. However, Madison's conclusion about the “probability of a fit choice” depended on the assumption that electoral judgment would determine the political choice. The analysis presented here does indeed suggest that voters' judgments, as well as their policy preferences, strongly influence their political choice.
Condorcet's work has recently received renewed attention (McLennan). This paper can be seen as a contribution to the development of a Madisonian conception of elections in representative democracies as methods of aggregation of both preferences and judgments. One inference from the work presented here does seem to belie Riker's arguments [33, 34] that there is no formal basis for populist democracy. Since voters' perceptions about candidate traits strongly influence their political decisions, the fundamental theoretical question is the manner by which these perceptions are formed. We argue that the low convergence coefficients in the majoritarian polities of the United States and Great Britain imply that the electorate is not polarized. Since candidates or party leaders do not adopt convergent positions, we can infer that democratic equilibria in these polities reflect the preferences of interest groups rather than the electorate at large.
On the other hand, empirical work on Israel and Turkey shows that the convergence coefficients in recent elections in these two polities are very large. (The estimates are 3.98 for Israel in 1996 and 5.4 for Turkey in 2002.) These estimates indicate that the polities in these countries are polarized. We can infer that parties in these polities diverge away from the electoral center, even in the absence of activism.
A. Matlab Optimization Algorithm for Best Response
See Algorithm 1.
B. Matlab Optimization Algorithm for Local Nash Equilibrium
See Algorithm 2.
C. Question Wording for the 2008 American National Election Study Survey Items
Do you think the government should provide more services than it does now, fewer services than it does now, or about the same number of services as it does now?
Do you favor, oppose, or neither favor nor oppose the US government paying for all necessary medical care for all Americans?
A proposal has been made that would allow people to put a portion of their Social Security payroll taxes into personal retirement accounts that would be invested in stocks and bonds. Do you favor this idea, oppose it, or neither favor nor oppose it?
I am going to ask you three questions, and ask you to choose which of two statements in these questions comes closer to your own opinion.
One, the main reason that government has become bigger over the years is because it has gotten involved in things that people should do for themselves. Two, government has become bigger because the problems we face have become bigger.
One, we need a strong government to handle today's complex economic problems. Two, the free market can handle these problems without government being involved.
One, the less government, the better. Two, there are more things that government should be doing.
This country would be better if we worried less about how equal people are. Do you agree strongly, agree somewhat, neither agree nor disagree, disagree somewhat, or disagree strongly with this statement?
Do you think that big companies should pay a larger percent of their profits in taxes than small businesses do, that big companies should pay a smaller percent of their profits in taxes than small businesses do, or that big companies and small businesses should pay the same percent of their profits in taxes?
Should federal spending on welfare programs be increased, decreased, or kept about the same?
Do you favor, oppose, or neither favor nor oppose the US government making it possible for illegal immigrants to become US citizens?
Do you think the number of immigrants from foreign countries who are permitted to come to the United States to live should be increased a lot, increased a little, left the same as it is now, decreased a little, or decreased a lot?
I would like to describe a series of circumstances in which a woman might want to have an abortion. For each one, please tell me whether you favor, oppose, or neither favor nor oppose it being legal for the woman to have an abortion in that circumstance. Staying pregnant would hurt the woman's health but is very unlikely to cause her to die.
Staying pregnant could cause the woman to die.
The pregnancy was caused by sex the woman chose to have with a blood relative.
The pregnancy was caused by the woman being raped.
The fetus will be born with a serious birth defect.
Having the child would be extremely difficult for the woman financially.
The child will not be the sex the woman wants it to be.
Do you favor or oppose laws to protect homosexuals against job discrimination?
Do you think homosexuals should be allowed to serve in the United States Armed Forces or don't you think so?
Do you think gay or lesbian couples, in other words, homosexual couples, should be legally permitted to adopt children?
Should same-sex couples be allowed to marry, or do you think they should not be allowed to marry?
This country would have many fewer problems if there were more emphasis on traditional family ties. Do you agree strongly, agree somewhat, neither agree nor disagree, disagree somewhat, or disagree strongly with this statement?
Do you think the federal government should make it more difficult for people to buy a gun than it is now, make it easier for people, or keep the rules the same?
Some people feel that the government in Washington should make every effort to improve the social and economic position of blacks. Others feel that the government should not make any special effort to help blacks because they should help themselves. Where would you place yourself on this scale, or haven't you thought much about this?
Irish, Italians, Jewish, and many other minorities overcame prejudice and worked their way up. Blacks should do the same without any special favors. Do you agree strongly, agree somewhat, neither agree nor disagree, disagree somewhat, or disagree strongly with this statement?
Generations of slavery and discrimination have created conditions that make it difficult for blacks to work their way out of the lower class. Do you agree strongly, agree somewhat, neither agree nor disagree, disagree somewhat, or disagree strongly with this statement?
It is really a matter of some people not trying hard enough; if blacks would only try harder they could be just as well off as whites. Do you agree strongly, agree somewhat, neither agree nor disagree, disagree somewhat, or disagree strongly with this statement?
We hear a lot of talk these days about liberals and conservatives. Where would you place yourself on a scale from liberal to conservative?
This paper is based on work supported by NSF Grant no. 0715929 and by a small grant from the Weidenbaum Center at Washington University. The first version was completed while Schofield was the Glenn Campbell and Rita Ricardo-Campbell National Fellow at the Hoover Institution, Stanford, 2009. Thanks are due to the very helpful comments of the anonymous referee.
H. Hotelling, “Stability in competition,” Economic Journal, vol. 39, pp. 41–57, 1929.
A. Downs, An Economic Theory of Democracy, Harper and Row, New York, NY, USA, 1957.
N. Schofield and I. Sened, Multiparty Democracy: Elections and Legislative Politics, Cambridge University Press, Cambridge, UK, 2006.
N. Schofield, C. Claassen, U. Ozdemir, and A. V. Zakharov, “Estimating the effects of activists in two-party and multi-party systems: a comparison of the United States and Israel,” Social Choice and Welfare. In press.
N. Schofield, M. Gallego, U. Ozdemir, and A. V. Zakharov, “Competition for popular support: a valence model of elections in Turkey,” Social Choice and Welfare. In press.
D. Stokes, “Spatial models and party competition,” American Political Science Review, vol. 57, pp. 368–377, 1963.
D. Stokes, “Valence politics,” in Electoral Politics, D. Kavanagh, Ed., Clarendon Press, Oxford, UK, 1992.
E. Penn, “A model of far-sighted voting,” American Journal of Political Science, vol. 53, pp. 36–54, 2009.
J. H. Aldrich, “A Downsian spatial model with party activists,” American Political Science Review, vol. 77, pp. 974–990, 1983.
J. Y. Halpern, “Beyond Nash equilibrium: a computer scientist looks at game theory,” Games and Economic Behavior, vol. 45, pp. 114–131, 2003.
K. E. Train, Discrete Choice Methods with Simulation, Cambridge University Press, Cambridge, UK, 2003.
G. Miller and N. Schofield, “Activists and partisan realignment in the U.S.,” American Political Science Review, vol. 97, pp. 245–260, 2003.
R. Kass and A. Raftery, “Bayes factors,” Journal of the American Statistical Association, vol. 90, pp. 773–795, 1995.
J. M. Enelow and M. J. Hinich, “The location of American presidential candidates,” Mathematical and Computer Modelling, vol. 12, no. 4-5, pp. 417–435, 1989.
K. Poole and H. Rosenthal, “U.S. presidential elections 1968–1980: a spatial analysis,” American Journal of Political Science, vol. 28, pp. 283–312, 1984.
K. M. Quinn, A. D. Martin, and A. B. Whitford, “Voter choice in multi-party democracies: a test of competing theories and models,” American Journal of Political Science, vol. 43, no. 4, pp. 1231–1247, 1999.
J. Rabier and R. Inglehart, Eurobarometer II April 1979. The Year of the Child in Europe, Inter-University Consortium for Political and Social Research, Ann Arbor, Mich, USA, 1981.
ISEIUM, “European Elections Study: European Political Parties' Middle Level Elites,” Europa Institut, Mannheim, Germany, 1983.
N. Schofield, “Evolution of the constitution,” British Journal of Political Science, vol. 32, no. 1, pp. 1–20, 2002.
N. Condorcet, Essai sur l'application de l'analyse à la probabilité des décisions rendues à la pluralité des voix, Imprimerie Royale, 1785.
A. McLennan, “Consequences of the Condorcet jury theorem for beneficial information aggregation by rational agents,” American Political Science Review, vol. 92, no. 2, pp. 413–418, 1998.
W. H. Riker, “Implications from the disequilibrium of majority rule for the study of institutions,” American Political Science Review, vol. 74, pp. 432–446, 1980.
W. H. Riker, Liberalism against Populism, W.H. Freeman, New York, NY, USA, 1982.