Puzzle #95: The "random YN model"

Puzzle:
The "YN-model" answers to the previous puzzle have been attacked as not very relevant to real life. That is, the election examples constructed to demonstrate that, e.g. Plurality voting can elect the worst winner NNNN, are (speculate the critics) unlikely to arise. To address that, now consider random voters. That is, every possible issue-stance (from among the 2m) is equally likely and we pick some large number V (far exceeding 2m) of such random voters. We than canonize them by reversing the signs of some of the issues so that for each issue considered alone, "Y" is the majority winner. (This canonical-renaming is purely for convenience and has no genuine effect.)

  1. "Honest" range voting (defined in previous puzzle) always elects the best candidate YYY...YY.
  2. Show that, with honest plurality voting, a "bad" candidate (over 50% "N"s in his name) will be elected with chance greater than some absolute positive constant C. (And what is C?)
  3. Show that, with honest approval voting where voters approve the top F-fraction of the candidates (for any fixed F with 0≤F<50%) a "bad" candidate (over 50% "N"s in his name) will be elected with chance greater than C.
  4. Show the same thing for Borda voting.
  5. STILL OPEN: What about with Condorcet, Coombs, and IRV voting?

Answers

Warning: We only give sketchy answers, which really should be redone more rigorously.

a. Same answer as preceding puzzle.

b. Two useful lemmas:
(i) each issue for each voter (after the canonical renaming) can be regarded as a slightly Y-biased coin toss and all issues involve independent coins. The bias is slight because we have postulated that the number V of voters far exceeds 2m; in this limit the bias tends to zero. Specifically each bias (as a number between 0 and ½) typically is proportional to (2m/V)1/2.
(ii) using extreme-value statistics [E.J.Gumbel: Statistics of Extremes. Columbia University Press 1958], the gap between the most-popular and second-most popular vote-counts (which typically has order 2-3m/2V1/2 or less, times log terms) is almost certainly tiny when m→∞ compared to the minimum margin (of order V1/2/m typically) by which each issue wins over its negation.

So consider the most numerous kind of voter before the canonical renaming. It is almost irrelevant what it is in the sense that if enough of that voter-type were obliterated to make it become second-most popular, then the same canonical renaming would happen, with probability→1 in the large-V/2m and large-m limits. Indeed, you could even obliterate votes for the winner while adding votes for any desired loser L – enough to make L win – and the canonization would still be (almost certainly) unaffected. Hence in this m→∞ limit C→1/2. Therefore for m finite, 0<C≤1/2.

c. With approval voting, with any fixed F with 0≤F<50%, the fraction of candidates approved by any given voter is exponentially(m) tiny and there are exponentially(m) many candidates no two of whom are approved by the same voter. To see the former, simply use the properties of the binomial distribution. To see the latter, use a "binary error correcting code with wordlength m bits and Hamming distance≥Fm" as the set of candidates; then standard existence results, such as the "Gilbert-Varshamov bound," for such codes show that "linear" ones exist which have exponential(m) large cardinality. Then the same argument we just gave for plurality voting will show that who the most-approved among the codeword-candidates is, is almost irrelevant; even if enough of its approvers were obliterated to bring it into second place (or even if we made an arbitrary codeword-candidate win), the canonization would almost certainly not change.

Now to consider the full set of all 2m candidates, note the above argument was also true for any "coset" of that linear code. There are an exponential(m) number of such cosets but even for the one containing the overall approval-vote winner, the argument still works even if weakened by any log-of-exponential (i.e. polynomial in m) factor, which is the sort of factor that happens in extreme-value statistics.

d. With Borda, the typical winning margin (in Borda score) between the top and second candidates T and S, will be of order 2-m/2V1/2 which again is almost certainly tiny when m→∞ compared to the minimum margin (of order V1/2/m typically) by which each issue wins over its negation. So if we add just enough random voters who each prefer S>T to change the Borda winner, that almost certainly will not be enough to change any issue-majority and hence the canonization will remain unchanged. (Indeed, even if "S" is not the second-placer, but in fact an arbitrary candidate, this should still be true with probability→1. Each random S>T vote we add alters the Borda margin by an additive amount of order 2m.) In other words, the identity of the Borda winner becomes independent of the issue-majorities in the large-m limit and hence a ≥50% "bad" winner again should be elected with probability→1/2.

e. I do not know the answer. Computer simulations should shed light. I conjecture that all these voting methods will elect "bad" candidates with probability≥C (for some positive constant C) in the large-V followed by large-m limit.

One can argue (again using the same technique) that the pairwise victory (or defeat) of candidate A versus B again becomes independent of the issue majorities in the large-m limit. This leads me to conjecture that in this limit (with high probability) no Condorcet "beats all" winner will exist. In that case the question, for any given Condorcet voting method, would rest on the behavior of its "fallback" procedure employed when no beats-all winner exists. I conjecture the Simpson-Kramer "min-max" procedure and Black (Borda as fallback) procedures both should yield winners independent of the issue-majorities in the large-m limit.


Return to puzzles

Return to main page