Cunningham transposition limits, fantasy and logic

In another thread, somebody suggested that the 3rd edition (Redman) rulebook wanted color transpositions kept to under 100 points because, according to a study conducted by George Cunningham, having the white pieces in a tournament game was worth a little under 100 rating points.

I don’t dispute the results of the Cunningham study, but I reject it as a reason for keeping transpositions under 100 points.

Color transpositions are made simply to avoid unfairness in the assignment of colors. If the color unfairness is serious (e.g. two more blacks than whites), then a larger transposition should be allowed, no matter how many rating points the white pieces are worth.

But, let’s follow that logic to its extreme, and see what kind of changes would come about in pairing procedures.

First of all, if the number of rating points in a transposition is supposed to compensate for the color difference, there would be a rule that says a player due white but receiving black should get there only through a downward transposition. Likewise, a player due black but receiving white should get there only through an upward transposition.

Further, why limit this logic to players being transposed? Players receiving their “natural” pairings should also be compensated for color, no?

Therefore, for those who like the 3rd edition logic, I now propose Cunningham Pairings as an alternative to the present system:

  1. First, before actually making any pairings, decide which players will receive the white and black pieces. In general, not everybody will receive their due colors. If, for example, 10 players are due white and 6 due black, two players due white will necessarily receive black. Assign the black pieces to the 6 due black, and to 2 of those due white. Those two should be the two least due white. For example, BBW is less due white than BWB, who in turn is less due white than WBB. Or, after four rounds, BWWB is less due white than WBWB. (See 5th edition 29E4, points 1 through 5).

  2. Once each player’s colors have been assigned, temporarily add 50 points to each player assigned white, and subtract 50 points from each player assigned black. Then arrange the players in order according to these adjusted ratings, and pair top half vs bottom half, pairing each player in the top half against the first available player in the bottom half who has been assigned the opposite color.

If this creates serious problems (e.g. opponents have already faced each other), undo everything and start over at step 1. Re-assign the colors slightly differently than before, still assigning the “wrong” colors to the players least due their “correct” colors whenever possible. Then re-do step 2 as well (some players who had 50 points added may now have 50 points subtracted, and vice versa).
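Step 2 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: colors have already been assigned in step 1, the group has an even number of players with equal numbers of whites and blacks, and the check for earlier meetings is left out. The function name and the sample field are made up for the example.

```python
from typing import Dict, List, Tuple

COLOR_BONUS = 50  # step 2: +50 if assigned white, -50 if assigned black


def cunningham_step2(players: Dict[str, Tuple[int, str]]) -> List[Tuple[str, str]]:
    """players maps a name to (rating, assigned color), color being "W" or "B".

    Returns (white, black) pairs. Raises ValueError when a top-half player
    has no opposite-color opponent left, which means going back to step 1.
    """
    def adjusted(name: str) -> int:
        rating, color = players[name]
        return rating + COLOR_BONUS if color == "W" else rating - COLOR_BONUS

    # Rank by adjusted rating, then split into top half and bottom half.
    order = sorted(players, key=adjusted, reverse=True)
    half = len(order) // 2
    top, bottom = order[:half], order[half:]

    pairs = []
    for p in top:
        p_color = players[p][1]
        # First available bottom-half player assigned the opposite color.
        opponent = next((q for q in bottom if players[q][1] != p_color), None)
        if opponent is None:
            raise ValueError("no opposite-color opponent left; redo step 1")
        bottom.remove(opponent)
        pairs.append((p, opponent) if p_color == "W" else (opponent, p))
    return pairs


# Hypothetical four-player group: name -> (rating, assigned color).
field = {"A": (2000, "W"), "B": (1980, "B"), "C": (1900, "B"), "D": (1850, "W")}
pairs = cunningham_step2(field)
print(pairs)  # [('A', 'C'), ('D', 'B')]
```

Note that D (1850, assigned white) ranks ahead of C (1900, assigned black) once the 50 points are applied, which is exactly the effect the adjustment is meant to have.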

This method would more fully support the Cunningham theories about rating point differences vs colors. Does anybody like it?

Bill Smythe

What I think would probably be a better pairing system, although too complicated for making pairings manually, would be a system that tried to give each player in a score group the same average rating of opposition, adjusted for color. If a player got an unusually difficult pairing in one round it would be compensated for with easier pairings later, and vice versa.

In the first round, assign the bye and pair the top half against the bottom half as in the current system (although it’s not clear that the lowest rated player should get the bye - I’ll save that thought for another day). In subsequent rounds pair each score group as follows:

For each player, compute the player’s average effective opposition (AEO); that is, the average rating of the player’s opponents adjusted for color in accordance with the Cunningham scale (or a more accurate one in light of later research). This is the average of the effective opponent rating (EOR) for each round. The EOR is the opponent’s rating adjusted for color. For opponents rated under 2100 subtract 80 points if the player was white (and therefore the opponent was black) and add 80 points if the player was black. For opponents rated 2100-2399 subtract 60 points if the player was white and add 60 points if the player was black. For opponents rated 2400 and higher, subtract 40 points if the player was white and add 40 points if the player was black.
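The color scale just described is easy to write down. Here is a minimal Python sketch of EOR and AEO using the 80/60/40 point figures from this post; the function names are mine, and the "round 0" idea is not included.

```python
from typing import List, Tuple


def color_adjustment(opponent_rating: int) -> int:
    """Points the white pieces are assumed to be worth at this rating level."""
    if opponent_rating < 2100:
        return 80
    if opponent_rating < 2400:
        return 60
    return 40


def eor(opponent_rating: int, player_color: str) -> int:
    """Effective opponent rating, seen from the player who had player_color."""
    adj = color_adjustment(opponent_rating)
    # Having white makes the opponent effectively weaker; having black, stronger.
    return opponent_rating - adj if player_color == "W" else opponent_rating + adj


def aeo(games: List[Tuple[int, str]]) -> float:
    """Average effective opposition over (opponent rating, own color) games."""
    return sum(eor(rating, color) for rating, color in games) / len(games)


print(eor(2024, "W"))  # 1944
print(eor(2124, "B"))  # 2184
print(eor(2515, "W"))  # 2475
print(aeo([(2024, "W"), (2124, "B")]))  # 2064.0
```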

In a double round robin, theoretically the fairest pairing system, the average rating of opposition is not equal because the highest rated player doesn’t have to play himself. In a Swiss, the highest rated player shouldn’t be penalized for having a high rating by having to play higher rated players than his opponents. Therefore, when calculating AEO, assume each player faced himself as an opponent in round 0, with no color adjustment.

Now calculate the ideal effective opponent rating (IEOR) for each player. If each player in the score group plays an opponent with that player’s IEOR, at the end of the round each player’s AEO will be the same.

If the score group has an odd number of players, iterate over all “reasonable” possibilities for the odd man, in descending order of desirability. The odd man plays someone in the next lower score group who is as close as possible to the player’s IEOR. Normally the odd man will be the lowest player in the score group but a different player can be dropped to improve colors. An additional complication is that the opponent’s IEOR also has to be considered. This part of the system needs some work.

Create two lists in descending order: players from highest rated to lowest rated, and players from lowest AEO to highest AEO. Now comes the tricky part: as closely as possible, pair each player in the first column with the player at the same position in the second column. Obviously a player can’t play himself, and shouldn’t play someone he’s already played. Switches to fix colors are allowed but make a pairing less desirable when evaluating a pairing. Switches are also allowed because the order of the first list can change slightly when color is considered.

Evaluate each pairing by summing the absolute value of the differences between each player’s EOR and IEOR, including the difference for the odd man. In theory it shouldn’t matter whether a player gets his due color each round but in practice players expect to get equal numbers of white and black and they expect their color to alternate, so assign an additional negative score (TBD) for players who don’t get their due color. The pairing actually chosen should be the one with the lowest negative score.

Unfortunately this algorithm is, potentially at least, order N squared, but it should be possible to develop heuristics to reduce the number of pairings which have to be looked at in order to find a pairing which is reasonably close to the optimal one. If the number of players is small enough it might be possible to look at all possible pairings.
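For a small score group, looking at every possible pairing is straightforward. The Python sketch below enumerates all complete pairings of an even-sized group and picks the cheapest under a caller-supplied cost function. The cost table in the example is purely hypothetical; a real one would come from the EOR/IEOR differences plus whatever color penalty is settled on. A group of N players has (N-1) x (N-3) x ... x 1 complete pairings (3 for four players, 15 for six, 105 for eight), so exhaustive search only suits small groups.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Pair = Tuple[str, str]


def all_pairings(players: Sequence[str]) -> Iterator[List[Pair]]:
    """Yield every way to split an even-sized group into pairs."""
    if not players:
        yield []
        return
    first, rest = players[0], players[1:]
    for i, opponent in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in all_pairings(remaining):
            yield [(first, opponent)] + tail


def best_pairing(players: Sequence[str], cost: Callable[[str, str], float]) -> List[Pair]:
    """Return the complete pairing with the lowest total cost."""
    return min(all_pairings(players),
               key=lambda pairing: sum(cost(a, b) for a, b in pairing))


# Hypothetical per-game costs, for illustration only.
costs = {frozenset("AB"): 5, frozenset("CD"): 7, frozenset("AC"): 1,
         frozenset("BD"): 2, frozenset("AD"): 9, frozenset("BC"): 9}
best = best_pairing(list("ABCD"), lambda a, b: costs[frozenset((a, b))])

print(len(list(all_pairings(list("ABCD")))))    # 3
print(len(list(all_pairings(list("ABCDEF")))))  # 15
print(best)  # [('A', 'C'), ('B', 'D')]
```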

This is just a first stab at an idea I’ve been thinking about and I haven’t tried applying it yet to an actual tournament to see what the pairings would have been. This is, after all, a fantasy topic.

I think Bill Smythe’s “Cunningham” system and my “IEOR” (ideal effective opponent rating) system would probably produce better pairings than the current USCF system but aren’t practical to use because it would be difficult to make pairings manually. Maybe some day all pairings will be made by computer and both TDs and players will trust the computers enough that there will be no need for TDs to check the pairings before posting them, or for TDs to explain pairings to players. Until that happens we’re better off with the system that we have.

Correction: the second list should be from highest IEOR (ideal effective opponent rating) to lowest IEOR. I was thinking that this amounted to the same thing, and maybe it would if everyone played the same number of games, but I’m working through an example where a player got a full point bye in the first round. This player has the lowest rating and the lowest AEO (average effective opposition) but doesn’t have the highest IEOR.

You guys have way too much time on your hands.

– Hal Terrie

As a result of working through an example of the “IEOR” pairing system, I’ve decided to make a change to the system: I’m scrapping the rule which assumes that each player played against himself in “round 0” when calculating the AEO.

I’m not sure how to treat players who get full point byes. Perhaps the player should be assumed to have won against an opponent rated 100, the minimum possible rating, but this would distort the system too much. Since players who get full point byes don’t usually win prizes I’m not going to worry about giving them extra-tough opponents. I’ll treat full point byes as unplayed games which don’t affect the AEO.

The example I’m working through to test out “IEOR” pairings is the 2-day schedule of the Open section of the 80th Massachusetts Open, which was held over Memorial Day weekend. I’ll use initials to identify the players.

Results after round 1. AEO is average effective opposition, which in this case is the rating of each player’s first round opponent, adjusted for color. There are 10 players in the section but one of them took a half point bye in the first round so the lowest rated player got a full point bye. “MM” scored an upset win against “AP”.

                 Color
Player  Rating   History   Score  Opponents  AEO

IF       2515      B         1        GX    2184
MB       2139      W         1        EM    1944
MM       2119      B         1        AP    2390
TR       1994      -         1        -        -
LT       2237      -        0.5       -        -
DP       2189      B        0.5       PC    2149
PC       2069      W        0.5       DP    2129
AP       2330      W         0        MM    2059
GX       2124      W         0        IF    2475
EM       2024      B         0        MB    2199

There are four players in the 1 point score group, so there is no odd man. To compute the IEOR (ideal effective opponent rating) for each player we first have to find the average IEOR for the score group. Multiply each player’s AEO by the number of games played, sum across all players in the score group, add the rating of each player in the score group (because they’ll play each other in the next round), divide by the number of games played plus the number of players. This works out to be about 2184 if I did the math correctly (I stored the exact number in my calculator). Now work out the IEOR for each player, so that if the player’s opponent has an EOR equal to the IEOR the player’s AEO after two rounds will be 2184.

Player     Rating     IEOR

IF          2515      2183
MB          2139      2423
MM          2119      1977
TR          1994      2184

Since MM played a 2330 in round 1 he should get an easier opponent in round 2, while MB played a 2024 in round 1 and is due to get a more difficult opponent in round 2.
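The IEOR arithmetic for this score group can be checked with a short Python sketch that follows the recipe described above: total effective opposition so far, plus the ratings of the players in the group, divided by games played plus number of players. The function name and data layout are my own; TR's bye is entered as zero games played, so his AEO entry is just a placeholder.

```python
from typing import Dict, Tuple


def ideal_eor(group: Dict[str, Tuple[int, float, int]]) -> Tuple[float, Dict[str, float]]:
    """group maps name -> (rating, AEO so far, games played).

    Returns the target average for the group and each player's IEOR.
    """
    games_played = sum(games for _, _, games in group.values())
    opposition_so_far = sum(aeo * games for _, aeo, games in group.values())
    ratings = sum(rating for rating, _, _ in group.values())
    target = (opposition_so_far + ratings) / (games_played + len(group))
    # IEOR is the next-round EOR that brings the player's AEO up (or down) to target.
    ieor = {name: target * (games + 1) - aeo * games
            for name, (_, aeo, games) in group.items()}
    return target, ieor


# The 1 point score group after round 1 (TR had a full point bye).
group = {"IF": (2515, 2184, 1), "MB": (2139, 1944, 1),
         "MM": (2119, 2390, 1), "TR": (1994, 0, 0)}
target, ieor = ideal_eor(group)
rounded = {name: round(value) for name, value in ieor.items()}

print(round(target))  # 2184
print(rounded)        # {'IF': 2183, 'MB': 2423, 'MM': 1977, 'TR': 2184}
```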

Now I create two lists for the 1 point score group: players in descending order of rating and players in descending order of IEOR. For each pair of opponents, figure out which player would be white and which player would be black and calculate the negative score: the sum of the absolute values of differences between the EOR and the IEOR for the two players.

Pairing             Score

IF (W) vs. MB (B)    236
MB (B) vs. TR (W)    454
MM (B) vs. IF (W)    702
TR (B) vs. MM (W)     68

The pairings with the lowest negative score appear to be:

Pairing             Score

IF (W) vs. MB (B)    236
MM (W) vs. TR (B)     68

Total negative score: 304

The pairings actually chosen by SwissSys in the tournament, using current USCF pairing rules, were:

Pairing             Score

IF (W) vs. TR (B)    640
MM (W) vs. MB (B)    346

Total negative score: 986
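Both totals can be reproduced with a few lines of Python. This sketch uses the ratings and the rounded IEOR figures from the tables in this post and the 80/60/40 color scale. It leaves out the additional penalty for players who miss their due color, since that value is still to be determined. Function names are mine.

```python
from typing import List, Tuple

RATING = {"IF": 2515, "MB": 2139, "MM": 2119, "TR": 1994}
IEOR = {"IF": 2183, "MB": 2423, "MM": 1977, "TR": 2184}  # rounded, from the IEOR table


def eor(opponent_rating: int, player_color: str) -> int:
    """Opponent's rating adjusted for the color the player has."""
    if opponent_rating < 2100:
        adj = 80
    elif opponent_rating < 2400:
        adj = 60
    else:
        adj = 40
    return opponent_rating - adj if player_color == "W" else opponent_rating + adj


def negative_score(white: str, black: str) -> int:
    """Sum of |EOR - IEOR| for the two players in one game."""
    white_side = abs(eor(RATING[black], "W") - IEOR[white])
    black_side = abs(eor(RATING[white], "B") - IEOR[black])
    return white_side + black_side


def total(pairing: List[Tuple[str, str]]) -> int:
    return sum(negative_score(white, black) for white, black in pairing)


ieor_choice = [("IF", "MB"), ("MM", "TR")]      # (white, black)
swisssys_choice = [("IF", "TR"), ("MM", "MB")]  # (white, black)

print(total(ieor_choice))      # 304
print(total(swisssys_choice))  # 986
```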

After getting a difficult opponent in round 1, MM deserved to get an easier opponent in round 2, but instead he was paired up. And MB got a relatively easy opponent in round 1, so according to the IEOR system’s logic he should have gotten a difficult one in round 2.

Actually, I’m not sure why SwissSys chose to make a 125 point transposition instead of a 20 point interchange. It probably has a bias against interchanges. In any case, the current pairing rules don’t give MM any credit at all for having played a 2330 in round 1. The IEOR system tries to give him easier opponents in later rounds. The same goes for any players who have to face the formidable IF.

Actually, Hal, I have important things to do and shouldn’t be wasting time on this, but Bill Smythe goaded me into posting an idea I’ve been kicking around for a while.

Well, OK. My purpose in starting this thread was to point out the fallacy in jumping from “the white pieces are worth a little less than 100 points” to “transpositions should be limited to 100 points”. For the benefit of those who still wanted to make this questionable jump, I extended the absurdity to its logical conclusion.

Apparently, so far, only one sucker has jumped in.

Bill Smythe

I don’t think it’s a fallacy. As I see it, FIDE pairings put too much emphasis on color equalization, whereas USCF pairings properly consider rating as well as due color. The USCF pairing system could be improved by attempting to equalize the average strength of opposition among players in a score group, although this would be at the expense of added complexity and isn’t feasible at the present time.

One thing I like about the FIDE pairing system is that it keeps track of which players have been dropped to a lower score group or raised to a higher one, so the same player isn’t dropped or raised every round. I’ve taken this a step further in my “IEOR” system by keeping track of each player’s average opposition rating, adjusted for color, and trying to keep this the same for all players in a score group.

It wouldn’t be added complexity; in fact, it would be simpler. But it would be at odds with the whole theory of ratings-based Swisses.

The whole point of a ratings-based Swiss is to find a set of pairings which maximizes the rating differences between players in a score group. This is why we use “top-half” versus “bottom-half” (THVBH) pairings. THVBH is a heuristic for finding the two sub-groups (“halves”) within a score group with the maximum rating difference. THVBH does this pretty well, though not necessarily optimally. The “natural” pairings, which pit the players in the top half against those in the bottom half in rating order, are a further heuristic to distribute the rating difference between the top half and the bottom half evenly over the games in the score group, that is, not to “waste” the rating difference between the groups on one “overkill” game. The point of this is that the highest-rated players win and advance with a perfect score, while the lower-rated players are defeated. This is so that the highest-rated players don’t knock each other out too early, and so that if lower-rated players advance it is by defeating high-rated players, not other low-rated players.

If it was a good thing to “equalize the strength of opposition among players in a score group”, as you want, it could be done simply. For example, 1 versus 2 pairings, or random pairings, within a score group would both produce far more “equalization of strength” than THVBH pairings. And 1 versus 2 pairings and random pairings are both simpler than THVBH pairings, not more complex.

How would 1 versus 2 pairings equalize the strength of opposition? Player 1 would face the highest rated opponent each round, so his average opposition rating would be much higher than the average opposition of the other players in the top score group, assuming he won all his games. Random pairings would probably produce a wide diversity of average opposition strengths, although they definitely would be simpler than any other system.

Relative strength, not absolute strength. 1 versus 2 is a heuristic for minimizing rating differences between opponents. A player is likely to meet players close in strength to himself. That is not the same as minimizing the variance of opponent ratings within the pairings of a score group, which is your aim. But it is a non-goal of Swisses that the players within a score group face the same average strength opposition. Swisses, and THVBH pairings, are trying to have the higher-rated players advance with perfect scores and to have the lower-rated players lose and be removed from contention (at least temporarily). A Swiss is intended to be an elimination tournament without any absolute eliminations. A ratings-based Swiss modifies this to make it more likely that low-rated players will be “eliminated” than high-rated players.
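To make the contrast concrete, here is a small Python sketch of the two heuristics applied to one score group. The eight ratings are hypothetical and evenly spaced so the effect is easy to see; color and earlier meetings are ignored.

```python
from typing import List, Tuple


def top_half_vs_bottom_half(ratings: List[int]) -> List[Tuple[int, int]]:
    """Natural THVBH pairings: 1 vs n/2+1, 2 vs n/2+2, and so on."""
    ordered = sorted(ratings, reverse=True)
    half = len(ordered) // 2
    return list(zip(ordered[:half], ordered[half:]))


def one_vs_two(ratings: List[int]) -> List[Tuple[int, int]]:
    """1 vs 2, 3 vs 4, and so on."""
    ordered = sorted(ratings, reverse=True)
    return list(zip(ordered[0::2], ordered[1::2]))


def gaps(pairs: List[Tuple[int, int]]) -> List[int]:
    """Rating difference in each game."""
    return [higher - lower for higher, lower in pairs]


group = [2400, 2300, 2200, 2100, 2000, 1900, 1800, 1700]  # hypothetical ratings
print(gaps(top_half_vs_bottom_half(group)))  # [400, 400, 400, 400]
print(gaps(one_vs_two(group)))               # [100, 100, 100, 100]
```

THVBH spreads a large gap evenly over every board, while 1 versus 2 keeps every game close.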

Which is fine if your goal is just to give people games against opponents with similar ratings, but it’s not really suitable if people are playing for prizes.

It seems to me that it should be a goal, whether it is one in the current system or not.

Even under the USCF system, I keep track of upfloats and downfloats when I use pairing cards. I write an up- or down-arrow next to the opponent number whenever a player is paired up or down into another score group. That way, I can avoid pairing the same player up (or down) twice in a row.

Not that I would necessarily avoid consecutive upfloats or downfloats. If a player is paired up into a higher score group, he is probably playing a lower-rated player in that group, so the player will probably win. The way I look at it, if a paired-up player wins, or a paired-down player loses, the pairing has vindicated itself, so there is no need to avoid a second upfloat or downfloat. But it’s still nice to see that it happened.

Bill Smythe

My point is that it is currently a goal of ratings-based Swisses that the high-rated players be paired against the low-rated players so that high-rated players win and the low-rated players lose. If you are not going to do this in your Swiss, you might as well not run a ratings-based Swiss at all, and just use random pairings within score groups.

I don’t see the logic in your statement. Why should a pairing system favor one set of players over another?

That’s a rhetorical question - I don’t want to spend time arguing about this.

On the contrary. Ratings help the goals of 1-vs-2 pairings just as much as ratings help the (different) goals of half-vs-half pairings.

Bill Smythe

Ratings-based Swiss pairings also generally help with colors because, in general, higher-rated players win and lower-rated players lose. After the higher-rated players alternate colors in the first round, they tend to play each other in subsequent rounds, with about half of them due the opposite color as a result. Of course, each upset that occurs will “upset” the perfect pairings for the next round. Still, if the higher-rated player “won” the Round 1 coin toss, for example, then because the higher-rated players will likely win, players 2, 4, 6 and 8 would be due White in Round 2, while players 1, 3, 5 and 7 would be due Black. If the pairings were instead random, approximately half of those players would have received the opposite color from the one they would normally have been assigned in Round 1 (because the colors for the higher-rated players were not alternated on every board), and they would then be due the same color as their opponent in Round 2.

Ratings-based pairings should, in general, help keep the colors balanced more than random pairings (which does not mean, of course, that difficult pairing problems can't still occur).