RangeVoting.org - answer to puzzle 94: The "YN model"

Puzzle #94: The "YN model" – a simple voting model in which range voting behaves optimally while many competing voting systems can behave pessimally

Puzzle:
The "YN-model" is a simple model of voting (with a fair amount of realism) in which range voting performs optimally with "honest voters." There are m independent "issues" and 2^m "candidates"; each candidate announces a clear yes or no stance on each issue, and by some benificent miracle we have a full democratic marketplace, i.e, every possible issue-stance is represented. Hence (in the case m=4) we can name the 16 candidates YYYY ("yes" on all 4 issues), YYYN, YYNY,..., NNNN. Suppose, for each voter, the utility of electing any given candidate is just the number of issues on which that candidate agrees with the voter. Finally, we suppose without loss of generality that all m issues each would win if put to a single-issue yes/no vote, therefore the objectively best candidate is YYYY. The "without loss of generality" is because we could just rename the issues to make "yes" become "no" for that issue, e.g. the issue "raise taxes?" becomes "do not raise taxes," so with such canonical renaming we can always cause the best candidate to be named YYYY.

Prove that range voting, with "honest voters" (who always score their favorite m, their least-favorite 0, and everybody else linearly-interpolated between according to their utility) will always elect the best candidate in the YN model (i.e, here, YYYY).
But show that plurality voting can unambiguously elect the worst candidate (here NNNN) while causing the best candidate (YYYY) to have zero votes.
Hence Instant Runoff (and Coombs), in such an election, would fail to elect YYYY. Can IRV ever elect the worst candidate NNNN?
Show that Condorcet Cycles can happen in the YN model.
Show that all Condorcet methods (with honest voters) can elect the worst candidate NNNN and can mis-order the finishers exactly backwards so that candidates with more Ns always are majority-preferred over those with fewer!
Show that Borda voting (with honest voters) both can elect the worst candidate NNNN and order the candidates exactly backwards.
Show that Approval voting (with honest voters who approve the top F percent of the available candidates and disapprove the rest, where F is any fixed value with 0<F≤50%) can elect the worst candidate NNNN. (And range voting can also do this with strategic voting.)
(Partially open) What about Coombs voting? Can it elect the worst candidate NNNN?
What about Copeland voting? Can it elect the worst candidate NNNN?

Comment: Most analyses of voting system "properties" ignore the fact that "issues" exist and candidates have stances on them. The YN model is approximately the simplest and most natural possible voting model which doesn't ignore that. It turns out that honest range voting is always optimal in the YN model and is "self consistent." But most other commonly proposed voting methods turn out to be both "self contradictory" and non-optimal (indeed they can even behave pessimally) in the YN model.

Answers

a. With honest range voters, "Y" (on the first issue) can be regarded as getting a 1-vote or a 0-vote (for those voters who support or oppose issue #1), and ditto for each other issue, as a contribution to each candidate's range score. This view is mathematically equivalent thanks to our postulate the honest range voters linearly interpolate their range votes (between the best and the worst candidates getting the max and min range votes) based on utilities. Since we have postulated that Y beats N on each issue, the candidate YYYY beats every other candidate. In other words, range voting always delivers the best possible winner in the YN model.

b. Say 71% of the voters each have ideologies consisting of 71% Ys (all such ideologies equinumerous) while 29% have the NNN...NN ideology. Then 50.4%=71%² of the voters prefer Y on any given issue, so by majority vote, Y would win on each issue individually. But NNN..NN is the plurality-vote winner with 29% of the vote, while YYY..YY gets zero top-rank votes, and indeed every candidate besides NNN..NN gets (in the m→∞ limit) zero percent of the votes – sort of the ultimate in "vote splitting."

As a very concrete example of this kind of idea with m=4 (this election data was created artificially by S.J.Brams for the purpose of solving this problem), let the plurality votes for each candidate be as follows in a 31-voter, 4-issue, 16-candidate election

candidate	votes	candidate	votes
YYYY	0	YNYY	4
YYYN	4	YNYN	1
YYNY	4	YNNY	1
YYNN	1	YNNN	1
NYYY	4	NNYY	1
NYYN	1	NNYN	1
NYNY	1	NNNY	1
NYNN	1	NNNN	5

Then on each issue Y beats N by 16 to 15. (For example, just look at the first letter of each candidate's name to see Y on the first issue wins by 16:15 over N.) But nevertheless YYYY gets zero votes and NNNN wins a plurality election with 5 votes.

c. Instant runoff never elects a candidate with zero top-rank votes (since it always eliminates that candidate in the first round). Also Coombs eliminates the candidate (here YYYY) with the most bottom-rank votes, hence cannot elect YYYY since it eliminates it immediately. Unfortunately, IRV advocate Rob Richie once pronounced this an "advantage" of IRV. However, in the present example, it is clearly a disadvantage, because the clearly best candidate YYYY gets zero top-rank votes.

Who is elected in this election by IRV? In this example IRV experiences many difficulties with tied round results. If we pretend/assume that the votes somehow get transferred equally in all tied-directions (and start with the above example multiplied by 10), then we'd arrive, at an intermediate stage in the IRV process, at this situation

candidate	votes
NYYY	55
YNYY	55
YYNY	55
YYYN	55
NNNN	90

and then the tie would be broken randomly with the result that one of the four Y³ candidates would win.

More detail: Abd ul-Rahman Lomax manually constructed the full honest-vote rank-orderings in the Brams example (we assume each voter rank-orders the candidates in descending order of the number of issues on which they agree with that voter. There could be ties in which case Lomax broke them according to his personal whim):

#voters	Their honest vote
0	YYYY>YYYN>YYNY>YNYY>NYYY>YYNN>YNYN>YNNY>NYYN>NYNY>NNYY>YNNN>NYNN>NNYN>NNNY>NNNN
4	YYYN>YYYY>YYNN>YNYN>NYYN>YYNY>YNYY>YNNN>NYYY>NYNN>NNYN>YNNY>NYNY>NNYY>NNNN>NNNY
4	YYNY>YYNN>YYYY>YNNY>NYNY>YYYN>YNNN>YNYY>NYNN>NYYY>NNNY>YNYN>NYYN>NNNN>NNYY>NNYN
1	YYNN>YYNY>YYYN>YNNN>NYNN>YYYY>YNNY>YNYN>NYNY>NYYN>NNNN>YNYY>NYYY>NNNY>NNYN>NNYY
4	YNYY>YNYN>YNNY>YYYY>NNYY>YNNN>YYYN>YYNY>NNYN>NNNY>NYYY>YYNN>NNNN>NYYN>NYNY>NYNN
1	YNYN>YNYY>YNNN>YYYN>NNYN>YNNY>YYYY>YYNN>NNYY>NNNN>NYYN>YYNY>NNNY>NYYY>NYNN>NYNY
1	YNNY>YNNN>YNYY>YYNY>NNNY>YNYN>YYNN>YYYY>NNNN>NNYY>NYNY>YYYN>NNYN>NYNN>NYYY>NYYN
1	YNNN>YNNY>YNYN>YYNN>NNNN>YNYY>YYNY>YYYN>NNNY>NNYN>NYNN>YYYY>NNYY>NYNY>NYYN>NYYY
4	NYYY>NYYN>NYNY>NNYY>YYYY>NYNN>NNYN>NNNY>YYYN>YYNY>YNYY>NNNN>YYNN>YNYN>YNNY>YNNN
1	NYYN>NYYY>NYNN>NNYN>YYYN>NYNY>NNYY>NNNN>YYYY>YYNN>YNYN>NNNY>YYNY>YNYY>YNNN>YNNY
1	NYNY>NYNN>NYYY>NNNY>YYNY>NYYN>NNNN>NNYY>YYNN>YYYY>YNNY>NNYN>YYYN>YNNN>YNYY>YNYN
1	NYNN>NYNY>NYYN>NNNN>YYNN>NYYY>NNNY>NNYN>YYNY>YYYN>YNNN>NNYY>YYYY>YNNY>YNYN>YNYY
1	NNYY>NNYN>NNNY>NYYY>YNYY>NNNN>NYYN>NYNY>YNYN>YNNY>YYYY>NYNN>YNNN>YYYN>YYNY>YYNN
1	NNYN>NNYY>NNNN>NYYN>YNYN>NNNY>NYYY>NYNN>YNYY>YNNN>YYYN>NYNY>YNNY>YYYY>YYNN>YYNY
1	NNNY>NNNN>NNYY>NYNY>YNNY>NNYN>NYNN>NYYY>YNNN>YNYY>YYNY>NYYN>YNYN>YYNN>YYYY>YYYN
5	NNNN>NNNY>NNYN>NYNN>YNNN>NNYY>NYNY>NYYN>YNNY>YNYN>YYNN>NYYY>YNYY>YYNY>YYYN>YYYY

and then upon working through this election with Rob LeGrand's ranked-ballot calculator we confirm that YYNY indeed wins with IRV. NNYY won with Coombs' method. Also with honest Bucklin voting YYYY wins (although if the Bucklin voters vote "strategically," Lomax says NNNN wins). YYYY also wins with honest Borda and honest Condorcet voting using either Tideman ranked pairs, Schulze beatpaths, Nanson, Simpson-Kramer min-max, Copeland, and basic Condorcet.

It also is possible (in other situations) for IRV to elect the worst candidate NNN...NN. The idea in answer (e) will usually do that if m is large enough, and we throw in enough NNN...NN voters to make sure that NNN...NN does not get eliminated in an early round for having zero top-rank votes, and we randomly pre-perturb the vote-counts slightly to break ties. (And a fully explicit example with 83 voters will be given in answer "e" below.)

d. The YN model with m=6 can exhibit a Condorcet cycle among the three candidates named
YYNNNN, NNYYNN, and NNNNYY.
This will happen if there are three equinumerous types of voters whose ideologies are
YYYNNN, NNYYYN, and YNNNYY.

e. If 70% of the voters each have ideologies consisting of 70% Ns (all such ideologies equinumerous) while 30% have the YYY...YY ideology, then the worst candidate NNN...NN is majority-preferred over the best candidate YYY...YY. (Note 49%=70%² of the voters prefer N on any given issue, so by 51-49 majority vote, Y would win on each issue individually.)

At first I thought this meant NNN...NN was the Condorcet winner, but actually, that is not the case; NNN...NNY (with just a single Y) pairwise-beats NNN...NNN! Still, I think NNN...NNN wins with the Simpson-Kramer minmax Condorcet method? (And which others?)

As a very-concrete instantiation of approximately that idea (got by taking various rows from Lomax's table and changing the numbers), with these 67 voters

#voters	Their honest vote
24	YYYY>YYYN>YYNY>YNYY>NYYY>YYNN>YNYN>YNNY>NYYN>NYNY>NNYY>YNNN>NYNN>NNYN>NNNY>NNNN
10	YNNN>YNNY>YNYN>YYNN>NNNN>YNYY>YYNY>YYYN>NNNY>NNYN>NYNN>YYYY>NNYY>NYNY>NYYN>NYYY
10	NYNN>NYNY>NYYN>NNNN>YYNN>NYYY>NNNY>NNYN>YYNY>YYYN>YNNN>NNYY>YYYY>YNNY>YNYN>YNYY
10	NNYN>NNYY>NNNN>NYYN>YNYN>NNNY>NYYY>NYNN>YNYY>YNNN>YYYN>NYNY>YNNY>YYYY>YYNN>YYNY
10	NNNY>NNNN>NNYY>NYNY>YNNY>NNYN>NYNN>NYYY>YNNN>YNYY>YYNY>NYYN>YNYN>YYNN>YYYY>YYYN
3	NNNN>NNNY>NNYN>NYNN>YNNN>NNYY>NYNY>NYYN>YNNY>YNYN>YYNN>NYYY>YNYY>YYNY>YYYN>YYYY

we find that Y wins on issue #1 by 34-to-33 majority (and ditto for any other issue). But NNNN is the election winner using either Basic Condorcet, Schulze beatpaths, Tideman Ranked Pairs, Simpson-Kramer min-max, Nanson-Baldwin, or Bucklin, while NNNY wins with IRV and NYYY with Black and Borda. Coombs delivers the best winner YYYY in this case.

And with these 83 voters

#voters	Their honest vote
32	YYYY>YYYN>YYNY>YNYY>NYYY>YYNN>YNYN>YNNY>NYYN>NYNY>NNYY>YNNN>NYNN>NNYN>NNNY>NNNN
10	YNNN>YNNY>YNYN>YYNN>NNNN>YNYY>YYNY>YYYN>NNNY>NNYN>NYNN>YYYY>NNYY>NYNY>NYYN>NYYY
10	NYNN>NYNY>NYYN>NNNN>YYNN>NYYY>NNNY>NNYN>YYNY>YYYN>YNNN>NNYY>YYYY>YNNY>YNYN>YNYY
10	NNYN>NNYY>NNNN>NYYN>YNYN>NNNY>NYYY>NYNN>YNYY>YNNN>YYYN>NYNY>YNNY>YYYY>YYNN>YYNY
10	NNNY>NNNN>NNYY>NYNY>YNNY>NNYN>NYNN>NYYY>YNNN>YNYY>YYNY>NYYN>YNYN>YYNN>YYYY>YYYN
11	NNNN>NNNY>NNYN>NYNN>YNNN>NNYY>NYNY>NYYN>YNNY>YNYN>YYNN>NYYY>YNYY>YYNY>YYYN>YYYY

we find that Y wins on issue #1 by 42-to-41 majority (and ditto for any other issue). But NNNN is the election winner using either Basic Condorcet, Schulze beatpaths, Tideman Ranked Pairs, Nanson-Baldwin, Simpson-Kramer min-max, Bucklin, or IRV. (But NYYY wins with Black and Borda and YYYN wins with Copeland. Again Coombs elects YYYY.)

Later note: see answer i for a complete solution to the Condorcet-method problem, which works for every Condorcet method, not just the named-methods we mentioned above.

f. The same example that led off answer (e) also yields that NNN...NN is the Borda winner (if m is large enough). To be ultra-concrete about it, with m=4 take 21 YYYY voters versus 10 each for YNNN, NYNN, NNYN, NNNY (and use Borda score expectation value among random honest orderings of the candidates); then the unique Borda winner is NNNN even though any individual issue is won by Y by a 31-to-30 majority.

[Our more-powerful "backwards Condorcet" example in (i) will also yield a backwards Borda ordering because of the fact that a candidate's Borda score is the sum of his pairwise-victory margins, up to an overall additive constant.]

g. The same example in (e) also yields that NNN...NN is the Approval Voting winner (if m is large enough) for any fixed value of F with 0<F≤50%. Also, of course, the Brams m=4 example for plurality voting is also valid for approval voting if the voters only approve their top choice, i.e. the top 1/16 of the candidates. In that example Lomax also notes that voters approving only their favorite if he is a "frontrunner" but approving both their favorite and successive candidates worse than him until they reach a "frontrunner"... will sadly still elect NNNN if we regard "frontrunners" as those with 4 or more plurality votes. Further, in the 61-voter concrete m=4 example in answer "f", if voters approve the top 5 candidates, then NNNN wins with 40 approvals; no rival gets more than 30. Also we note that range voting can elect NNNN if the voters are strategic exaggerators rather than "honest," i.e. if we assume they vote in "approval style." Also, if voters have different strengths of feeling, e.g. if the NNNN voters were extra passionate (this differs from the utilities assumed in the problem statement), then range can elect NNNN, albeit in that case this might be "good behavior," not a "malfunction."

h. In the election example that began answer b, Coombs begins by eliminating YYY..YY because it has the most bottom rankings (29% of the voters rank it bottom, far more than for any other candidate). Coombs (assuming random tie-breaking in who-to-eliminate decisions) then continues by eliminating candidates with fraction-F of "Y"s in their names, where F gradually decreases from 100% down to 71% and continues down toward 0 (it gets arbitrarily near to 0 if m is sufficiently large). Eventually we reach a situation where only a few candidates remain, all of them with almost-all "N"s in their name, and one wins. This almost solves the problem because the winner will have an arbitrarily large fraction (below 100%) of "N"s in its name – but does not completely solve the problem because we have not proved it is NNN..NN itself. If m is sufficiently large, then NNN..NN will be ranked bottom – in a head-to-head 2-candidate race between NNN..NN and NNN..NNY with just a single Y, or any very small fraction of Ys, in the latter name – by at least 50.4% of the voters, hence will be eliminated. That proves NNN..NN cannot be the Coombs winner. So this election almost, but not quite, solves the problem.

Incidentally, in this election note that Coombs eliminates almost all candidates in exactly backwards order, i.e. starting by eliminating the best candidate YYY..YY, and thus "outputs a rank ordering" of the candidates which is almost exactly the reverse of the order that "should" be output!

i. Let 51% of the voters have ideologies which are 49% Y. The remaining 49% of the voters have ideologies which are 60% Y. (All such ideologies equinumerous.) Then Y beats N on every individual issue. Now compare a candidate with fraction-P of Ys in its name, versus a candidate with fraction-Q of Ys in its name, P<Q. The former candidate is preferred arbitrarily-near-unanimously within the 51%-bloc of voters, hence by an overall majority, if 0.49(1-P)+0.51P<0.49(1-Q)+0.51Q [since these formulae give the expected number of letters-changed between the voter and the candidate for a random voter in the 51% bloc, and the distribution becomes arbitrarily sharply peaked about its expectation value in the m→∞ limit] i.e. if P<Q. Therefore, a candidate with fewer Ys in its name, is always majority-preferred over one with more Ys in its name! [Actually, the same analysis also works for the 49%-voter-bloc too, but in reverse, so that we see that the majority preference here is always a 51-49 majority to within ε.] Hence NNN..NN is the Condorcet winner and there are no cycles. This proves that not only Copeland, but in fact every Condorcet voting method, will elect the worst candidate NNN..NN and will order the finishers in exactly the reverse of the "right" ordering. Indeed every voting method based purely on the "pairwise matrix" will get everything totally backwards.

Final comment. This answer is almost perfect for theoretical purposes, i.e. it almost totally settles the worst-case behavior of all the most-common voting system proposals in the YN model. However, because fairly large m values seem needed to make the examples work (?), and because we are talking about worst case behavior which is rare (?), these pathologies all may have little practical interest.

It therefore is of interest also to examine the YN model for random voters. That will be the subject of a different puzzle.

Return to puzzles

Return to main page