Warren D. Smith. Nov. 2015. PRELIMINARY; it would be better to have a more careful & complete examination of the various problem-versions' LP duals, and an analysis of the integer programs which result when the ∑=W constraint is adjoined.
If, in an election, each district attempted to elect its MP using score voting, then the resulting parliament could be disproportional. That is: each voter rates each candidate with a numerical score from (say) 0=bad to 9=good, which gives each candidate an average score. But if each district then simply elected whoever had the highest average score, the resulting parliament could be "disproportional."
A possible solution to that problem is to demand that the parliament have a prescribed party composition (proportional to the number of voters who had awarded each party their top rating), and then, among all possible parliaments meeting the party-composition constraints and 1-seat-per-district constraint, to find the "best" one, i.e. the one maximizing the sum of the average scores of all its MPs. We'll show that this optimization problem is efficiently soluble using "linear programs" (LPs) of a well understood type arising in optimal "weighted bipartite graph matching."
When the "dual form" of our linear program is studied, we find that it is quite similar to – but not the same as – the "Fair Majority Voting" scheme proposed by Michel Balinski in 2008. The reason it is not the same is that Balinski made a severe design error. He, in our terminology, optimized the wrong thing. (Although actually, Balinski did not really realize he was optimizing anything.) The right thing to optimize must approximate perceived society-wide "utility." Balinski optimizes something rather unrelated to that, whereas we optimize the obvious natural approximation to it. But it is possible to alter the Balinski proposal to remove his "multiplicative weights" by replacing them with "additive offsets." When that is done, Balinski's altered system becomes essentially equivalent to ours via an LP-duality, aside from his preference for plurality-style "name-one-candidate" ballots as opposed to our preference for score voting style "rate-all-candidates" ballots.
Also, still more embarrassingly for Balinski, he contended his system would "eliminate gerrymandering" by removing all incentive for it. That claim was total garbage, and displays a massive lack of understanding by Balinski of what gerrymandering is like in the USA and why it happens.
Both our proposal and Balinski's enjoy the virtues of electing exactly one MP per district, and achieving proportional representation (PR) as measured by party-fractions.
However, they also both suffer the disadvantage that parties are woven into the very fabric of the voting system definition. A government based on either system would need to forbid, or at least be very unfriendly to, unaffiliated candidates; and it probably would be deemed necessary to enact laws forbidding MPs from switching parties (countries have enacted such laws since they felt otherwise their PR systems could be "gamed"). I greatly prefer systems based entirely on candidates with no reference to parties, and which work equally well regardless of who is in what party, or whether parties even exist.
Also, we demonstrate another massive disadvantage of both schemes (Balinski's method is afflicted more severely than ours, but both seem severely damaged): they are hugely susceptible to a simple organized form of strategic voting I call "targeted killing." I would forecast that any country trying to use either scheme to elect its parliament would be devastated by this problem.
I think these disadvantages render both schemes, especially Balinski's, unacceptable for the basic design of the government of a country.
(Also, more generally, any of a wide class of PR schemes involving explicit use of party labels as part of their definitions, might be vulnerable to an analogous problem, which should be a very scary thought for anybody designing such a system. Play it safe by refusing to involve party labels in your system. Or watch out.)
But there is a possible salvage. Do not use our scheme as the election method. Instead merely use it as a "top up" method to try to restore PR to an otherwise single-winner-per-district parliament elected by old fashioned independent single-district single-winner elections. (Note: Our LP formulation can also be used to find the optimum parliament with prescribed party-composition such that district k elects N_{k} MP's, where the integers N_{k}≥0 can be anything, not necessarily all 1 or all equal.) When used for this top-up purpose, our method becomes a far less attractive "target" for "killings." And then it actually might be a decent way to design a government, at least if you can stomach being force-fed political parties...
Purpose. To elect a W-seat parliament, consisting of one MP per district, in such a way that the party-composition of the parliament mirrors that of the voters. We assume some number C>W of candidates run and at least one candidate runs in each district.
Extension: we could optimally make district k elect N_{k} MPs, for any pre-specified set of non-negative integers N_{1}, N_{2}, ... summing to W; they do not need to all equal 1. Indeed, if desired we could even just pre-specify integer intervals [M_{k}, N_{k}] and merely demand that district k elect a number x of MPs where M_{k}≤x≤N_{k}.
1. Each voter rates each candidate in her district on a 0-to-9 numerical scale. (Voters are also allowed to leave some candidates unrated, if they wish not to express any opinion about those candidates.) Refinement suggested by J. Quinn: "Any voter who did not explicitly rate any candidates at 0 is considered to have given 0 to any candidates they left blank. Any voter who did rate at least one candidate explicitly at 0 is considered to have expressed 'no opinion' about any candidates they left blank. (I think this rule is the safest way to interpret 'intent of the voter'.)"
2. At this point, each candidate in each district possesses some "average score." Also, each political party, nationwide, has a number of "points." (Each time a voter rates K candidates co-equal top, then each of their political parties gets 1/K of a point.)
3. Determine the number of seats each party "deserves" in parliament: party j deserves S_{j} seats, for some set of non-negative integers S_{1}, S_{2}, ... summing to W. These S_{j} could be produced from the party point-counts and W by, for example, the classical procedure of Sainte-Laguë. Extension: everything to follow could also, more generally, be done with integer intervals [S_{j}, T_{j}], where we shall demand that party j get a number x of seats with S_{j}≤x≤T_{j}.
4. Find, from among all possible ways to elect W of the candidates to parliament which meet all the party seat-count constraints and also simultaneously meet all the district MP-count constraints, the "best" possible choice, and elect those W. Here "best" means that the average scores possessed by the W elected MPs have the maximum possible sum. (Note: if no parliament meeting all the constraints exists, then report that failure and stop. However, if the S_{j} were produced via Sainte-Laguë as described in step 3, then a solution obviously will exist.)
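To make step 4 concrete, here is a minimal brute-force sketch (all scores and quotas invented for illustration; a real election would use the linear-programming method described below rather than exhaustive enumeration, which blows up exponentially in the number of districts):

```python
from itertools import product

# Hypothetical data: 3 districts; each party fields one candidate per district.
# avg_score[d][p] = average score of party p's candidate in district d.
avg_score = [
    {"A": 8.1, "B": 6.5},
    {"A": 7.0, "B": 6.9},
    {"A": 5.2, "B": 7.8},
]
quota = {"A": 1, "B": 2}   # deserved seat counts S_j, summing to W=3

best_total, best_parliament = None, None
for choice in product(*(d.keys() for d in avg_score)):  # one party per district
    seats = {p: choice.count(p) for p in quota}
    if seats != quota:              # enforce the party-composition constraint
        continue
    total = sum(avg_score[d][p] for d, p in enumerate(choice))
    if best_total is None or total > best_total:
        best_total, best_parliament = total, choice

print(best_parliament, best_total)
```

On these invented numbers the quota-respecting optimum elects party A's candidate in district 1 and party B's candidates in districts 2 and 3, even though B's candidate has the top raw average only in district 3.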
We can view this as a problem about bipartite graphs (or bipartite multigraphs) with edge weights, then solve it efficiently via linear programming. Specifically, consider a bipartite graph with one red vertex for each political party, and one blue vertex for each district. The edge from party j to district k has numerical weight equal to the average score possessed by the candidate from party j who ran in district k. (If some party runs more than one candidate in some district, we'd need to make multiple edges between those two vertices, and it'd be a bipartite multigraph. That is ok.)
Our optimization problem then is a "max-weight constrained-valence subgraph" problem. That is, we seek a subgraph (W-element subset of the edges) of the original graph. This subgraph must obey the "valence constraints" that, e.g. the number ν of edges emerging from the party-j red vertex, must obey S_{j}≤ν≤T_{j}. And so on; each vertex of the original graph has its own allowed-valency integer interval. Among all possible subgraphs meeting the valency constraints, our task is to find the one maximizing the sum of the edge-weights.
This graph problem has been heavily studied since the 1950s and consequently nowadays is well understood. One way to solve it is by reformulating it as a linear program. That works to provide a polynomial-time algorithm. (However, it also is possible to solve it more quickly because this linear program happens to have special structure. We won't discuss that.)
Here is the linear program. If there are P parties and D districts, then there are PD variables x_{jk}. (We are formulating this, for simplicity of notation, in the ordinary-graph case. Actually, in the multigraph case we really would need to index the x's by which edge of the original multigraph that x pertained to. In the ordinary graph case, an edge is specified by its two endpoints, namely party j and district k, which simplifies notation.) The meaning of x_{jk} will be the number of candidates elected by party j from district k, and this number after solution will always be either 0 or 1. These obey the linear inequality constraints

0 ≤ x_{jk} ≤ 1   for each party j and district k,
S_{j} ≤ ∑_{1≤k≤D} x_{jk} ≤ T_{j}   for each party j,
M_{k} ≤ ∑_{1≤j≤P} x_{jk} ≤ N_{k}   for each district k,
( ∑_{1≤j≤P} ∑_{1≤k≤D} x_{jk} = W ).
Subject to those constraints, we find the x_{jk} that maximize

∑_{1≤j≤P} ∑_{1≤k≤D} A_{jk} x_{jk}

where A_{jk} is the average score possessed by candidate jk.
Obviously, this is a linear program, and if its optimum solution happens to have all entries x_{jk} integer, then that solution instantly solves our original graph problem optimally. However, if we ignore the parenthesized ∑_{1≤j≤P}∑_{1≤k≤D} x_{jk}=W constraint, then it automatically will have all entries integer, because then this is a "totally unimodular" linear program with "integer right hand sides," due to the Hoffman-Gale theorem.
Non-integer solutions would be inconvenient, because politicians do not like being cut in half. But if we drop the parenthesized ∑=W constraint in order to guarantee integrality, then in the extended problem versions we could end up with some other (i.e. not-W) total number of seats consistent with all the remaining (nonparenthesized) constraints. Fortunately, in the unextended (or less extended) forms of our voting problem, the W-seat demand will be satisfied automatically, even without explicitly imposing it, because, e.g., the sum of the demanded district-occupancies equals W if all M_{k}=N_{k}.
When we apply the linear programming duality theorem, we find our LP is equivalent to the following. (We'll assume S_{j}=T_{j} and M_{k}=N_{k} for this purpose.) Seek "additive adjustments" U_{j}≥0 for party j and V_{k}≥0 for district k such that, when every candidate's average score A_{jk} is replaced by the adjusted score A_{jk}+U_{j}−V_{k}, electing the top adjusted-score candidates district by district awards each party j exactly its S_{j} seats.
We then elect the N_{k} candidates in district k having the maximum adjusted average score A_{jk}+U_{j}−V_{k}. (Since V_{k} is common to every candidate within district k, it does not change who wins there; the party offsets U_{j} are what do the work.)
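To see the additive offsets in action, here is a tiny hand-built sketch (all numbers hypothetical; the offsets were found by hand rather than by solving the dual LP, and the district offsets V_{k} are omitted since they cannot change a within-district winner):

```python
# Hypothetical 2-party, 2-district example; each party deserves exactly 1 seat.
# Unadjusted average scores: party A's candidate tops BOTH districts.
avg = {("A", 1): 9.0, ("B", 1): 8.0,
       ("A", 2): 7.0, ("B", 2): 3.0}
U = {"A": 0.0, "B": 2.0}   # additive party offsets, found by hand

# In each district, elect the candidate maximizing the adjusted score A_jk + U_j.
winners = {d: max(("A", "B"), key=lambda p: avg[(p, d)] + U[p]) for d in (1, 2)}

print(winners)  # B's head start flips district 1; A keeps district 2
```

Note that the quota-respecting parliament chosen this way has raw-score total 8+7=15, beating the only other quota-respecting choice (9+3=12), illustrating how the dual offsets reproduce the primal optimum.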
Balinski's system begins by soliciting from each voter in each district a "name one candidate" plurality-style vote, saying their presumably-favorite candidate running within their district. Each such vote for a candidate also counts as a vote for his party. Balinski then determines each party's "deserved" seat count, by applying d'Hondt's method to the party vote counts. (Balinski prefers d'Hondt, while I prefer Sainte-Laguë, for this purpose. It perhaps is of interest that Germany's Bundestag originally used d'Hondt, but as of 2005 has switched to Sainte-Laguë.)
Then "multiplicative adjustments" ρ_{j}>0 for party j and λ_{k}>0 for district k, are found, such that if the vote counts are adjusted – i.e. any vote cast within district k is multiplied by λ_{k} and any vote cast for a party-j candidate is multiplied by ρ_{j} – then the winners in each district using adjusted votes, yield the deserved number of seats for each party.
So Balinski's system is the same as ours, except for the following changes:
1. Balinski solicits plurality-style "name one candidate" ballots, whereas we solicit score-voting-style "rate all candidates" ballots.
2. Balinski determines each party's deserved seat count via d'Hondt, whereas we prefer Sainte-Laguë.
3. Balinski uses multiplicative adjustments (ρ_{j} and λ_{k}), whereas we use additive offsets.
While I believe all three of these choices by Balinski were unwise, the last one was truly absurd, because...
We optimize the right thing, namely, the sum, over all MPs, of their average score reckoned by voters. This is the obvious naive approximation to the "quality" or "social utility" of the parliament, as estimated by voters. Balinski maximizes the sum, over all MPs, of the logarithm of their number of votes. (Equivalently, he maximizes the product of their vote-counts.) This has little relation to anything utilitarian.
With Balinski, the NaziLoon party (which gets 1% of the votes and hence deserves 1% of the seats) necessarily will enjoy a multiplicative scale-up factor of about 50, as compared with the Mainstream party (which gets 50% of the votes) with a multiplier of about 1. In other words, Balinski is going to force each NaziLoon vote to count about the same as 50 Mainstream votes.
This is a huge "leverage" or "amplification" factor subject to easy exploitation via "strategic voting." For example, suppose for the purposes of argument that Justin Trudeau (leader of Canada's Liberal party, recently elected prime minister in 2015) is the most popular MP in Canada. So the Tories might like it considerably if Trudeau's district were to un-elect him and thus remove him from power. Tory leadership can easily accomplish that goal. They simply tell the Tories in Trudeau's district "we suggest that instead of voting Tory, which is a lost cause, you vote NaziLoon." The 50× amplification factor then easily carries the day, and – voila – Trudeau is gone.
I call this "targeted killing." The Tories might now want to try that again, selecting as their target some other important Liberal MP. If, however, they kept re-using the NaziLoon party as their pawn in the targeted-killing game, it would no longer work as well. So instead, for their next "target," the Tories would switch to using the BeerLovers party as the (new) pawn. And so on.
This would cause all the important leaders in Canada to be "assassinated" – removed from political power – and instead a ragtag random collection of normally unelectable NaziLoons, BeerLovers, etc would be all over parliament. So perhaps the best two-word description of the situation that would be caused if Balinski's voting system were the design for Canada, would be "complete devastation."
With our system, which has additive rather than multiplicative adjustments to votes, the situation is somewhat less dire than with Balinski. There is no longer a 50× amplification factor for NaziLoon votes which could be used to "kill" any "target." Instead, the NaziLoons get a large additive "head start" in vote count in each district. Because Trudeau (in our example above) was postulated to be a very popular Liberal MP, it would still take a lot of NaziLoon votes to overcome him in his district; it would be far easier, and much more likely to work, to target an unpopular Liberal MP instead. So it would no longer be possible to target essentially anybody at will. Only some targets, usually comparatively unattractive ones, would be easily "hittable."
Nevertheless, even with our system there would be some easily predictable places where easily-determined strategic voting could be used to hit hard. These hits would be highly effective and also would, as a side effect, distort PR to artificially enrich parliament with fringe candidates.
Balinski (judging by his title) seems to hold the view that with his voting method, gerrymandering will vanish because it is no longer useful. From then on, districts would be drawn sensibly, for a refreshing change!
Unfortunately, we believe Balinski is wrong about that. We think it quite likely that gerrymandering would persist and continue to corrupt the USA massively.
Nationwide, we get proportionality, therefore it does not pay to gerrymander. Agreed. But locally, gerrymandering pays – plenty!
Suppose you are Joe Democrat Congressman. You'd like to have a safe seat. So you get your buddies in the statehouse to draw you a safe district full of Democrats.
Result (under Balinski voting): Joe is almost certain to be elected. The only way Joe can fail to be elected is if the entire country decides it hates Democrats hugely, and/or Joe's district stops voting Democratic – both of which are almost impossible. (Well, if we ignore the above about "targeted killings.")
So what I predict would happen (with Balinski voting) is this. The Democrats and Republicans will make interparty deals (just as they already do) to draw "safe" districts for all their high-ranking members. As usual, none of them will be defeatable.
But there will also be some districts occupied by lower-ranking Dems and Repubs, who do not have as much influence with the party bosses, which will be "at play." Those will be unsafe seats. So Balinski might improve things, but by no means will eliminate either gerrymandering or "congressmen for life."
Because of proportionality, the Greens will be able to get elected too. There could in principle actually be districts that are safe for Greens (i.e. essentially guaranteed to elect a Green). That is because the Greens will (by proportionality) win, say 5% (or whatever) of congress. The most-Green district in the country will then be essentially guaranteed to elect a Green, even though it (say) contains only 10% Greens.
Ivan Ryan comments: Locals might be pissed at a third-party which sets up its base in their district so it can consistently get 5% of the vote – the result being that 95% of the residents of the district are represented by a wacko-fringe party that they don't support.
Now the Dems and Repubs could (and probably will?) conspire to hurt the Greens as follows. Although the Dems and Repubs will be unable to prevent 10% Greens from being elected if the voters are 10% Green, they can keep shifting the Green district-boundaries with the goal of preventing any Green from being reelected. Or, they could try to make all the districts safe for everybody, so nobody ever gets defeated. This won't entirely work (if the country shifts from 60-40 R/D to 40-60, say, then there will be a 20% change) but it will work in the sense that all the established party members will be safe everywhere. Further, if, say, the Dems arrange to make their seatholders all have districts packed with Democratic voters, we don't see how that will hurt the Democratic party.
So, sorry Balinski – there is huge incentive to continue to gerrymander. Indeed, the ultimate limit would be that every district is 100% politically monolithic. In that case, yes we'd get proportionality, but every congressman would be a congressman for life (the opposite of democracy) and the country would be totally divided+polarized into utterly predictable red and blue chunks.
Indeed, Balinski, throughout his paper about "eliminating gerrymandering in the USA," appeared to be completely unfamiliar with the fact that a large number (probably "most") of the seats in the USA are "bipartisan gerrymandered," and indeed unaware even of the existence of the concept of "bipartisan gerrymandering." He also appeared unaware that the USA's districts are drawn not federally, but rather locally (i.e. state by state). For next time, I would recommend Balinski acquaint himself with the basics of an area before he revolutionizes it.
The idea would be: first to elect 1 MP per district in an old-fashioned (non-PR) manner, i.e. with single-winner elections within each district independently. (Preferably the score voting single-winner voting system would be used, not "first past the post.") Then we would add "top-up" MPs to try to bring the parliament back toward proportionality. (There would be some pre-specified number W of seats in parliament, of which W-T were single-MP-in-district seats and the remaining T were top-up seats.) The number of top-up seats from each party could be obtained using Sainte-Laguë based on each party's vote counts, but now starting the Sainte-Laguë seat-addition procedure not from zero initial seat-counts for each party, but rather from the party seat-counts arising from the initial W-T "one MP per district" elections.
Sainte-Laguë seat-addition procedure: Let V_{j} denote the vote-count for party j, and S_{j} denote its current seat count. Find the party with the greatest V/(2S+1) value. Give that party one additional seat (incrementing its S). Keep doing this, one seat at a time, until all the seats in parliament have been labeled with party-occupancy labels. [d'Hondt is the same except that the formula V/(2S+1) is replaced by V/(S+1).]
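The seat-addition procedure just described translates directly into code; here is a minimal sketch (Python chosen for illustration; the party names and vote totals are invented):

```python
def allocate(votes, seats_to_add, initial=None, method="sainte-lague"):
    """Award seats one at a time, each to the party with the largest quotient:
    V/(2S+1) for Sainte-Lague, V/(S+1) for d'Hondt. `initial` lets the count
    start from pre-existing seat totals, as in the top-up usage."""
    seats = dict(initial) if initial else {p: 0 for p in votes}
    for _ in range(seats_to_add):
        if method == "sainte-lague":
            winner = max(votes, key=lambda p: votes[p] / (2 * seats[p] + 1))
        else:  # d'Hondt
            winner = max(votes, key=lambda p: votes[p] / (seats[p] + 1))
        seats[winner] += 1
    return seats

votes = {"Red": 53000, "Blue": 31000, "Green": 16000}
print(allocate(votes, 10))                    # Sainte-Lague from zero seats
print(allocate(votes, 10, method="dhondt"))   # d'Hondt favors larger parties
```

On these numbers Sainte-Laguë yields {Red: 5, Blue: 3, Green: 2} while d'Hondt yields {Red: 6, Blue: 3, Green: 1}, illustrating d'Hondt's tilt toward larger parties.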
With each party's deserved-seat count known for the top-up seats, the plan then would be to employ the linear program here to choose the T top-up MPs optimally while meeting the party-occupancy demands. This task would, in fact, be trivially simple, i.e., no linear programming would be needed! That is because, for a party deserving S more seats, we would simply elect its S highest-scoring as-yet-unelected candidates. This essentially would be the (already well known and already used, e.g. in Sweden) Sainte-Laguë method of "party lists" but used as a top-up method, with the list ordering within each party simply being descending order by average scores received by their candidates from the voters in their districts.
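Assuming each party's top-up quota is already in hand and no per-district cap is imposed, the top-up selection reduces to a sort. A minimal sketch (all candidate names, parties, and scores invented):

```python
# Hypothetical pool of not-yet-elected candidates: (name, party, average score).
pool = [
    ("Ann",  "Red",   7.4),
    ("Bob",  "Red",   6.1),
    ("Cath", "Blue",  8.0),
    ("Dave", "Blue",  5.5),
    ("Eve",  "Green", 6.8),
]
topup_quota = {"Red": 1, "Blue": 2}   # extra seats each party deserves

elected = []
for party, n in topup_quota.items():
    # For each party, seat its n highest-scoring as-yet-unelected candidates.
    members = sorted((c for c in pool if c[1] == party),
                     key=lambda c: c[2], reverse=True)
    elected.extend(members[:n])

print([name for name, _, _ in elected])
```

This is exactly the "party list ordered by descending average score" rule described above; no optimization machinery is required until per-district caps are added.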
If we wanted to impose the additional demand that no district could have more than one top-up MP, though, then we would need not merely linear programming but integer programming. Linear programming is computationally efficient; integer programming in general is not, and too-large integer programs can be infeasible to solve. (Anyhow, quite likely such an integer-programming-based top-up would not even be desirable, if the "targeted killing" threat were considered worse than the problem of occasionally having districts with several MPs.)
Here is a list of advantages and disadvantages of the resulting system:
Michel Balinski: Fair Majority Voting (or how to eliminate gerrymandering), American Mathematical Monthly 115,2 (Feb. 2008) 97-113.
Steven J. Brams & Peter C. Fishburn: Proportional representation in variable-size legislatures, Social Choice and Welfare 1,3 (Oct. 1984) 211-229.
George B. Dantzig: Linear programming and extensions, Princeton Univ. Press 1963.
A. J. Hoffman & David Gale: Appendix within the paper by I. Heller & C. B. Tompkins: An Extension of a Theorem of Dantzig's, pp. 247-254 in H. W. Kuhn & A. W. Tucker (eds.): Linear Inequalities and Related Systems, Annals of Mathematics Studies #38, Princeton University Press 1956.