By Warren D. Smith Jan 2007
The same difficulty arises in two different guises in democracy:
In both cases, the problem is that you cannot have 3.7 congressmen – it is necessary to round to integers. Actually, by giving congressmen unequal vote-weights in the legislature, "3.7 congressmen" would be possible in some sense. But we shall ignore that.
Let P_{k} be the (known) population of state k, and let S_{k} be its number of congressmen, where the total number S=∑_{k}S_{k} of congressmen is fixed by US law to be S=435.
The question is: what should the S_{k} be?
Let P=∑_{k} P_{k} be the total population.
Start with S_{k} = ⌊P_{k} · S/P⌋
(rounding every ratio down).
Then if the total number of seats is too small, start awarding extra seats
to the states pre-sorted in decreasing order by rounded-off fractional parts, until
the correct total number of seats is achieved.
Hamilton's method has a very appealing "quota property":
Each state gets either its "lower quota"
The USA abandoned the Hamilton/Vinton method after the "Alabama paradox" arose. That is: With fixed population proportions, Hamilton's method can cause a state to lose a seat as a consequence of an increase in the total number of seats. I.e. it violates "house size monotonicity." While that may not matter now that the US House has its size fixed at 435 seats... Hamilton also violates population-pair monotonicity, i.e. it can cause state A to lose seats relative to state B even though A's population expands and B's contracts! Here are some examples of that:
State | Population | #seats(4) | #seats(5) |
---|---|---|---|
A | 5 | 2 | 3 |
B | 3 | 1 | 2 |
C | 1 | 1 | 0 |
The Alabama paradox was discovered in 1880, when it was found that increasing the total number of seats would decrease Alabama's number from 8 to 7. It is expected to occur about once every 8 apportionments in a model involving exponentially distributed state populations.
State | Population(I) | #seats(I) | Population(II) | #seats(II) | Pop.Change% |
---|---|---|---|---|---|
A | 554 | 5 | 566 | 6 | +2.2% |
B | 290 | 3 | 270 | 3 | -6.9% |
C | 156 | 2 | 164 | 1 | +5.1% |
total | 1000 | 10 | 1000 | 10 | 0% |
Balinski & Young [Fair Representation: Meeting the Ideal of One Person, One Vote (2nd edition), Brookings Institution Press 2001] pointed out that, in the year 1900, Hamilton/Vinton gave Virginia 10 seats and Maine 3, whereas in 1901, Virginia would have gotten 9 seats and Maine 4, despite the fact that the Virginia/Maine population ratio was 2.67 in 1900 and 2.68 in 1901. [1901 populations based on linear interpolation between 1900 and 1910].
House monotonicity: if population of whole country goes up (preserving proportions) no state can lose seats. [No longer matters much for US congress since total #seats now fixed at 435.]
Population-pair monotonicity: If population of state A increases but state B decreases, then A should not lose seats while B stays the same or gains seats. More generally, if A's percentage population change exceeds B's, then A should not lose seats while B stays the same or gains seats.
Quota: Always should give a state either A or A+1 seats (where the ideal ratio lies between).
Theorem (Balinski & Young): All "divisor methods" (and, essentially, only divisor methods) are both House and population-pair monotone; but they all disobey quota. (However, these disobeyals can be rare. Webster, Dean, and the Huntington-Hill method have never violated quota on any US states dataset so far, and according to some vaguely-described computer experiments by B&Y would be expected to do so only with probability 0.00061, 0.00154, and 0.000286 respectively, which in the case of Webster would be once every 1600 apportionments, i.e. once every 16000 years. My own computer experiments agree Webster is by far the least likely of the 5 classic divisor methods to suffer a quota violation, and also agrees Hill and Dean are in second and third places respectively, but estimates the quota-violation probability for Webster as 0.0016, which far exceeds B&Y's estimate but still is 6300 years between violations. Meanwhile Hamilton exhibits a monotonicity paradox about once every 18 apportionments in B&Y's same model. B&Y also showed that divisor methods avoid paradoxical results when new states are added.) Webster also satisfies quota if there are ≤3 states (but the others among the classic five divisor methods only do so if there are ≤2 states).
Meanwhile, Hamilton satisfies quota but disobeys both monotonicity properties. That leads to the question of whether an apportionment method exists that satisfies all three properties. The answer is "no" – the last two properties are incompatible – but "yes" if we only ask for the first and third properties (new method found by Balinski & Young in 1974).
Here's Balinski & Young's 1974 apportionment method which is both house-monotone and obeys quota: Let P_{k} denote the population, and S_{k} the number of seats assigned so far, to state k. We initially assign all states 0 seats: S_{k}=0. We now repeatedly award a new seat to whichever state maximizes P_{k}/(1+S_{k}) subject to the constraint that adding this new seat not violate the "upper quota" condition, i.e. seats are not awarded that cause S_{k}≤⌈P_{k}∑S_{k}/∑P_{k}⌉ to be violated. We just keep awarding new seats in this way, never violating quota (which B&Y proved trivially can always be done), until the correct total number ∑S_{k} results. Although Balinski & Young initially recommended this method, ultimately they rejected their own invention because it fails population-pair monotonicity. Also G.D.Birkhoff in 1976, and Cyril Carter in 1982, both claimed based on empirical numerical experiments, that this method was "biased" in favor of large-population states. (Birkhoff and Carter's claims were not very convincing because the number of their experiments were very small.) Jonathan W. Still [SIAM J. Applied Math. 37,2 (1979) 401-418] then found an infinite class – in fact the entire class – of house-monotonic and quota-obeying apportionment methods (Balinski & Young's method is just one member of Still's class). All of Still's methods are of the following form:
Why is it impossible for any method to satisfy both quota and population-pair monotonicity? Consider the following 18-seat apportionment problem in which both Webster and Huntington-Hill give the same seat-counts – but both violate quota (violation shown in red) – actually every "divisor method" violates quota in this scenario:
State | Population | Quota | #seats(Webster&Hill, use divisor=8.43) |
---|---|---|---|
A | 13 | 1.625 | 2 |
B | 13 | 1.625 | 2 |
C | 13 | 1.625 | 2 |
D | 105 | 13.125 | 12 |
Total | 144 | 18 | 18 |
Now consider any method which tries to satisfy quota by means of A, B, or C giving up a seat to D. We claim that whichever one you pick (since they are all the same), you can use it to construct a population-pair paradox: say you picked A. Now move the people around so as to decrease A's and D's populations and increase B's and C's (with A having the largest decrease, percentagewise). Then the obvious apportionment will yield a population "paradox" since D would give up a seat to A despite A's faster population decrease. To exhibit that in detail:
State | Population | Quota | #seats(Any quota-obeying method) |
---|---|---|---|
A | 130 | 1.625 | 1 |
B | 130 | 1.625 | 2 |
C | 130 | 1.625 | 2 |
D | 1050 | 13.125 | 13 |
Total | 1440 | 18 | 18 |
and then after 60 people move (decreasing D's population by 5.0% and A's by 5.4%):
State | Population | Quota | #seats(Hamilton, Hill, Webster, Dean, Adams) |
---|---|---|---|
A | 123 | 1.5375 | 2 |
B | 160 | 2 | 2 |
C | 160 | 2 | 2 |
D | 997 | 12.4625 | 12 |
Total | 1440 | 18 | 18 |
...which is a population-pair paradox in the sense that D lost a seat to A despite A's largest-percentage population decrease. This does not actually prove that every quota-obeying method must exhibit such a paradox – for that see theorem 6.1 of B&Y's book – but it at least makes it plausible.
Splitting & combining incentives:
Jefferson's method (which is equivalent to d'Hondt's method)
"encourages mergers" because by merging into one,
two states can gain seats but
never lose them.
Theorem (see Balinski & Young thms 9.1 & 9.2)
Jefferson is indeed the unique population-pair-monotonic
apportionment method
that offers such a combining-incentive (and as a free bonus, Jefferson also always obeys
the "lower quota" condition).
(B&Y proposition 6.4):
Adams is exactly the opposite: with Adams, states can gain seats by splitting,
but never lose seats (except once the number of states exceed the number of seats,
Adams becomes undefined), and it is the unique population-pair monotone method
that thus-offers splitting-incentive (and as a free bonus, Adams
also always obeys the "upper quota" condition).
Bias: Vaguely speaking, if a method "tends to favor small-population states" (or large-population ones) it is "biased." There are, however, many different possible formulas for assessing such "bias" and "favortism."
Theorem: Hamilton, Webster, Dean, and Hill all are "limit unbiased." That is, if the state populations are selected independently at random from some continuous and bounded probability density, the number of states is held fixed, and the total #seats is taken to infinity, then (almost surely) the bias (measured as a correlation coefficient) will tend to 0. But Jefferson and Adams are not limit unbiased; they favor large- and small-population states, respectively.
Near the quota: A method is "near the quota" if no transfer of a seat from one state to another can bring both states nearer to their exact ratios. (Birkhoff 1976 instead called this "binary consistency.") Balinski & Young (their theorem 6.2) proved Webster to be the unique population-pair-monotone method that is always "near the quota." Further, they proved this regardless of whether "nearer" is measured either in terms of additive or relative difference. But Ernst (his theorem 4.3) countered with an analogous theorem indicating that Huntington-Hill was "near the ideal" in the sense that for it, no transfer of a seat from one state to another can diminish certain other measures of departure from ideality. Hamilton/Vinton also is "near the quota," but (Birkhoff 1976 pointed out) Balinski & Young's quota-obeying method is not.
Uniformity: Balinski & Young [Proc. Nat'l. Acad. Sci. USA 73 (1980) 684-686] defined an apportionment method to be "uniform" if when we restrict to a state-subset, the apportionment within this sub"country" still is a valid apportionment under the same apportionment rule. They showed every method which is both uniform and population-pair monotone, must be a "divisor method." They also showed no method exists that obeys both uniformity and quota.
A "divisor method" gives a state S_{k} = r(P_{k}/Q) seats where r is the appropriate rounding function for that method, and Q is a "divisor," i.e. a number chosen to make the total number ∑_{k} S_{k} of seats come out right. (Here r is a monotonic-increasing integer-valued function of the real numbers.)
ADAMS' METHOD: r(x) = ⌈x⌉ [most favorable to small-pop states]
DEAN'S METHOD: r(x) = if x > hmean then ⌈x⌉ else ⌊x⌋ where hmean=2/(1/⌈x⌉ + 1/⌊x⌋)=2⌈x⌉⌊x⌋/(⌈x⌉+⌊x⌋)
HUNTINGTON-HILL'S METHOD: r(x) = if x > gmean then ⌈x⌉ else ⌊x⌋ where gmean=√(⌈x⌉⌊x⌋)
WEBSTER'S METHOD: r(x) = ⌊x+½⌋;
equivalently
r(x) = if x > amean then ⌈x⌉ else ⌊x⌋ where
JEFFERSON'S METHOD: r(x) = ⌊x⌋ [most favorable to large-pop states]
Note Hill, Dean, and Adams "naturally" enforce a one-seat minimum. E.V.Huntington in 1928 [The Apportionment of Representatives in Congress, Transactions Amer. Math'l. Soc. 30,1 (Jan 1928) 85-110 (pdf); this method had however also been considered earlier by Sainte-Laguë in 1910] showed that these "classic five" were the only divisor methods in which no transfer of a seat from one state to another can decrease some "measure of inequity" where this measure was required by Huntington to be an absolute value of the difference of two rational functions of the two state's populations and seat counts, with degree≤1 numerator or denominator in each, and such that equal seat/population ratio would zero this difference. However, Huntington did not show (contrary to misleading claims in B&Y's book) this for more-general "measures of inequity" such as more-complicated rational functions or non-rational functions. Therefore I deny the idea that these classic five methods are somehow uniquely favored and all other divisor methods should be excluded from consideration.
Although it is not a "divisor method," Hamilton/Vinton can be thought of in a similar way to those methods if we use an additive rather than multiplicative, "divisor." That is, you find a real value Δ such that if it is added to all the ideal (non-integer) seat-counts, then they are rounded off to nearest integers, you get the correct total number of seats.
Unfortunately, the divisor methods can fail – that is, a divisor that leads to the right total number of seats can simply fail to exist. Hamilton's method (if rephrased as a pseudo-divisor method as above) also can fail. For a simple example, consider 5 seats to be divided among 2 states with exactly equal populations. Any apportionment will necessarily be unfair (e.g. 3 and 2) and the divisor methods simply cannot do that. But with some random tiebreaking in such circumstances, we are ok; and in practice exact population ties are extremely rare and have never occurred, so this has not been an issue. [In our computer simulations no such failure ever occurred.]
J.Q.Adams, T.Jefferson, A.Hamilton, and D.Webster were famous statesmen in early US history and devised the methods named after them. Edward V. Huntington, Joseph Hill, and James Dean were American mathematicians. The Europeans later encountered the same mathematical problem but in guise II, and their mathematicians and politicians invented their own solutions. Some of the European solutions were equivalent to the American ones:
Jefferson = d'Hondt (idea also known to Henry Droop): After all the votes have been tallied, successive quotients are calculated for each list. The formula for the quotient is V/(s+1), where:
Webster = Sainte-Laguë = "method of odd numbers": like the d'Hondt method, but with a different denominator in the quotient formula, namely 2s+1 in place of s+1. (Incidentally, Webster could instead be defined using s+½ as the denominator. Rescaling a denominator sequence by any positive constant scale factor, does not change the system.)
Other denominator sequences have been suggested:
Estonia (non-proportional) | 1 | 2^{0.9}≈1.866 | 3^{0.9}≈2.688 | 4^{0.9}≈3.482 | 5^{0.9}≈4.257 | 6^{0.9}≈5.016 |
---|---|---|---|---|---|---|
d'Hondt=Jefferson(s+1) | 1 | 2 | 3 | 4 | 5 | 6 |
Sainte-Laguë=Webster=Odd numbers(2s+1) (used in Latvia) | 1 | 3 | 5 | 7 | 9 | 11 |
Modified Sainte-Laguë (Norway, Sweden, Nepal) | 1.4 | 3 | 5 | 7 | 9 | 11 |
Danish | 1 | 4 | 7 | 10 | 13 | 16 |
Imperiali (used in Ecuador) | 2 | 3 | 4 | 5 | 6 | 7 |
Adams=Smallest divisor | 0+ | 1 | 2 | 3 | 4 | 5 |
Huntington-Hill=Equal proportions | 0+ | √2≈1.414 | √6≈2.449 | √12≈3.464 | √20≈4.472 | √30≈5.477 |
Dean=Harmonic mean | 0+ | 4/3≈1.333 | 12/5≈2.400 | 24/7≈3.429 | 40/9≈4.444 | 60/11≈5.455 |
Macau (non-proportional) | 1 | 2 | 4 | 8 | 16 | 32 |
New(d) based on r(x)=⌈x-d⌉ rounding function | d | d+1 | d+2 | d+3 | d+4 | d+5 |
Of these, Imperiali is very favoring to large-population states (strictly more than Jefferson). Modified Sainte-Laguë favors large states and disfavors small ones slightly versus ordinary Sainte-Laguë; it differs from it only for states (or parties, in the party-list usage) having ≤1 congressman. Danish favors small states. The Estonia and Macau methods actually are nonproportional and favor large and small states respectively. Every divisor method can be rewritten as an algorithm of d'Hondt type; you simply make the d'Hondtian denominator sequence be the same as (or any positive constant multiple of) the sequence of "rounding thresholds" for the divisor method.
Let S_{k} be the number of seats for state k and P_{k} be its population.
For all j,k, Adams minimizes |S_{j} - S_{k} P_{j}/P_{k}|.
Dean minimizes |P_{j}/S_{j} - P_{k}/S_{k}|
Huntington-Hill minimizes |S_{j} P_{k}/(S_{k} P_{j}) - 1|
Webster minimizes |S_{j}/P_{j} - S_{k}/P_{k}|
Jefferson minimizes |S_{j} P_{k}/P_{j} - S_{k}|
Among all possible ways to assign non-negative integers to the S_{k} summing to the desired value S,
Lemma 7.1 (p.117), lemma 6.1 (p.94) and theorem 6.1 (p.95) of the book Evaluation and Optimization of Electoral Systems, by Pietro Grilli di Cortona, Cecilia Manzi, Aline Pennisi, Federica Ricca, and Bruno Simeone [SIAM Monographs on Discrete Mathematics and Its Applications, Philadelphia, 1999] are powerful tools for devising and proving more global optimality theorems. Assuming they are correct and I've used them correctly (which is not certain so please regard the following as preliminary), we can prove the following new results:
In 2016 Toby Pereira conjectured something more complicated than, but equivalent to, the following – and I confirmed it in the nonrigorous sense that my computer failed to find a counterexample – the following new optimality property of Jefferson=d'Hondt:
Then I realized this conjecture is implied by the statement #3 just above it, in the case d=1-. Indeed, given that S=∑_{k}S_{k}, P=∑_{k}P_{k}, and the P_{k} are regarded as fixed, more generally it implies that
both are globally minimized by the divisor method arising from the rounding function r(x)=⌈x-d⌉ where d is an arbitrary constant with 0≤d≤1.
Balinski & Young in their 1982 book Fair representation, and Sainte-Laguë 1910, both provided proofs of the Owens 1921 theorem about Webster. When I looked them up I immediately saw that their arguments generalize immediately to prove this general-d theorem. The resulting proofs have the same validity (or lack thereof) as their original proofs.
Geometric interpretation of Webster: The vector V of ideal (albeit not necessarily integer) seat-counts for the N states is an N-vector constrained to lie on the hyperplane with sum=#seats. We wish to replace it with a lattice point, i.e. all-integer N-vector, which also lies on this hyperplane. Webster chooses the hyperplane lattice point nearest to the line that passes through both V and the origin 0.
Geometric interpretation of Hamilton/Vinton: It simply finds the closest hyperplane lattice point to V (actually not only in the usual L_{2} distance, but also in any L_{q} distance for any q≥1).
Here's another method (the "Hilbert method" to put a name on it): find the nearest hyperplane lattice point, but "nearest" in the Hilbert projective metric rather than the usual Euclidean metric. This metric's distance function is
giving the Hilbert distance between two N-vectors X and Y with all-positive entries.
In other words, for all j,k, Hilbert minimizes
New claim: the Hilbert and Huntington-Hill methods are equivalent. (Incidentally, it is better than this, in the sense that it is not only the max that matters; if we remove that max, then the next max also gets minimized by Huntington-Hill's apportionment of the remaining seats, and so on recursively.)
Based on computer numerical experiments, Webster seems to be, on average over a large number of different randomized situations, the "least biased" of the above divisor methods... and we have a new method which can argue to be even less biased... but all of the methods discussed above, including Webster and the new one, are "biased" in some situations. That is, we can always set up an artificial scenario in which the "small states" always get more than their fair share of seats and the "big states" always get less (or the reverse) with any of these methods.
However, it is possible to devise a truly unbiased method by using randomized rounding.
Randomized rounding refers to the idea of rounding a number like 3.7 upward to 4 with probability 0.7 and downward to 3 with probability 0.3. (More generally you round a number with fractional part x, 0<x<1 upward with probability x.) Note that this preserves the expected value (here 3.7) of the original number, i.e, in that sense is exactly unbiased. We can choose the quantity we wish to be unbiased, round it in (an appropriate version of) this manner for every state, and then if the right total number of seats results, we quit (otherwise we do it again with new randomness).
This idea will truly be unbiased – but unfortunately it is randomized, not deterministic.
Another idea, suggested to me by Raphfrk, is to make corrections over time. That is, we keep track for each state of its historical "bias budget." If that state has historically had too many seats, we round it down next time; if it has historically had too few seats, we round it up next time. There are various ways to do that, but the point is that averaged over time (and assuming the state-population changes are not insanely wild) such a technique guarantees exact unbiasedness.