Point-count pairings?

Nah, I don’t think so. All the factors you mention (colors, scoregroups, last round $ pairings, transpositions) are important. To choose the best pairings, you’d need some kind of point-count system (remember Point Count Chess?) in which each set of proposed pairings would be compared against “raw pairings” to determine desirability.

By “raw pairings” I mean VERY raw – crude pairings determined only by score and rating. In the raw pairings, colors would be ignored, and players could even play the same opponents twice. The purpose of raw pairings would be simply to provide a standard – albeit a low one – against which other sets of pairings could be measured.

Compared to raw pairings, each player in the actual pairings would have a “transposition score” – the rating difference between his actual opponent and his “raw” opponent.

Each pairing in a proposed set of pairings could be assigned an “undesirability score” as follows:

(a) The transposition score of the less-transposed of the two players in the pairing, PLUS

(b1) 80 points if there is a color alternation problem, OR

(b2) 200 points if there is a color equalization problem, PLUS

(c) Additional points if the two players are in different score groups, PLUS

(d) Zillions of additional points if the players have already met in a previous round.

Note that (a) is consistent with 29E5c, “Evaluating transpositions”, and that (b1) and (b2) are consistent with the 80-point rule and 200-point rule (29E5a and 29E5b) respectively.

Ignoring possibilities (c) and (d) for the moment, let’s look at a sample score group consisting of four players, none of whom have played each other:

2000 WBWB Anne 1900 BWBW Bob 1600 WBWB Charlie 1570 BWBW David
The “raw” pairings would be:

[code]Anne vs Charlie (80 undesirability points because of b1)
Bob vs David (80 undesirability points because of b1)

Total undesirability points: 160[/code]
If you transpose Charlie and David (as any TD or program would do), you’d have:

[code]Anne vs David (30 undesirability points, Anne’s transposition score)
Bob vs Charlie (30 undesirability points, Bob’s transposition score)

Total undesirability points: 60[/code]
So in this case the transposition would be made. (Of course you’d give Anne and Charlie the white pieces.)

Change David’s rating to 1350, and now the transposed pairings would work out as follows:

[code]Anne vs David (100 undesirability points, David’s transposition score)
Bob vs Charlie (100 undesirability points, Charlie’s transposition score)

Total undesirability points: 200[/code]
This is worse than the raw pairings’ 160, so in this case the raw pairings would be used instead.

So here we have the beginnings of a “point count” system that could be used by a TD to determine the desirability of a transposition. Odds are WinTD already does something along these lines.

More later?

Bill Smythe

What if a point-count system is applied to the following situation, where there are six players in a score group?

2000 WB vs 1500 WB
1900 BW vs 1450 BW
1800 WB vs 1430 WB

The above are the “raw” pairings – crude pairings determined only by score and rating, ignoring colors, and allowing players to play the same opponent twice. (But let’s assume none of these six players has already played any of the others.)

The colors are bad on all three boards. They can be improved on two of the three by transposing the 1450 with either (A) the 1500 or (B) the 1430. Which is better? A is a 50-point transposition, B a 20-. So B is preferable.

In point-count terms:

Raw pairings:

2000 WB vs 1500 WB     80 undesirability points (bad color alternation)
1900 BW vs 1450 BW     80 undesirability points (bad color alternation)
1800 WB vs 1430 WB     80 undesirability points (bad color alternation)

Total:                         240 undesirability points

Pairings A:

2000 WB vs 1450 BW     50 undesirability points (50-point transposition)
1900 BW vs 1500 WB     50 undesirability points (50-point transposition)
1800 WB vs 1430 WB     80 undesirability points (bad color alternation)

Total:                          180 undesirability points

Pairings B:

2000 WB vs 1500 WB     80 undesirability points (bad color alternation)
1900 BW vs 1430 WB     20 undesirability points (20-point transposition)
1800 WB vs 1450 BW     20 undesirability points (20-point transposition)

Total:                          120 undesirability points

So the nod goes to pairings B over the other two.

(Note: The above pairings do not show white on the left, black on the right. They show the higher-rated player on the left. Of course, in making these pairings we would, for example, give the white pieces to the 1430 and black to the 1900.)

Bill Smythe

Now let’s look at a more complicated (and perhaps slightly controversial) situation. Let’s say there are six players in the score group:

2000 WB vs 1500 BW
1900 BW vs 1480 WB
1800 WB vs 1410 BW

These are the “raw” pairings. They look perfect, right? Oops, well, no – let’s say the 2000 has already played the 1500.

Should the 1500 be transposed with (A) the 1480, or (B) the 1410?

Some would argue that (B) is illegal, being more than an 80-point transposition when a smaller transposition is available. This argument would say that, since color alternation is less important than a 90-point difference, we should choose the pairing with bad color alternation over the one with a 90-point transposition.

Others would argue that, once you do (A), which is necessary to prevent two players from being paired again, but which involves two bad colors, you can then make further transpositions for color. After the 1500 is switched with the 1480 to avoid the same opponent twice, the 1480 can then be switched with the 1410 (70 points), and finally with the 1500 (20 points in the opposite direction – net transposition 50 points). So we end up with pairings (B) after all.

Let’s let point-count pairings settle this:

Raw pairings:

2000 WB vs 1500 BW     50000 undesirability points (same opponent twice)
1900 BW vs 1480 WB     0 undesirability points
1800 WB vs 1410 BW     0 undesirability points

Total:                           50000 undesirability points

Pairings A:

2000 WB vs 1480 WB     100 undesirability points (colors 80, transposition 20)
1900 BW vs 1500 BW     100 undesirability points (colors 80, transposition 20)
1800 WB vs 1410 BW     0 undesirability points

Total:                           200 undesirability points

Pairings B:

2000 WB vs 1410 BW     90 undesirability points (transposition)
1900 BW vs 1480 WB     0 undesirability points
1800 WB vs 1500 BW     90 undesirability points (transposition)

Total:                           180 undesirability points

So pairings B turn out to be preferable, despite the 90-point transposition.

One way to look at this is that the 90-point transposition was made only partly because of color alternation. The other part of it (20 points) was to avoid the same opponent twice.

Bill Smythe

Bill, if you’ve got that much spare time on your hands, I’ve got a project you can help test out. :slight_smile:

Hmm. Mike, are you writing a pairings program or something? :slight_smile:

Bill Smythe

Nope, but I’ve got lots of other tasks on the worklist, one of which is getting close to going into live beta testing.

E-mail me off the forums if you’ve got time in the next few days.
mnolan@uschess.org

Anyway, to continue the main thrust of this conversation …

The full concept of “undesirability score” would have to run something like this:

The undesirability score of a proposed set of pairings would simply be the sum of the undesirability scores of all the individual pairings in the set.

The undesirability score of an individual pairing would be:


(a) The transposition score of the less-transposed of the two players in the pairing, PLUS

(b1) 80 points if there is a color alternation problem, OR

(b2) 200 points if there is a color equalization problem, OR

(b3) 2500 points for Very Bad Colors, PLUS

(c) Additional points if the two players are in different score groups (3000 for a half-point difference, 8000 for a full point, 18000 for 1.5 points, etc), PLUS

(d) 50000 points if the players have already been paired against each other.


Some notes on the above:


(a) The transposition score of a player in a proposed set of pairings is simply the rating difference (always taken as positive) between the player’s actual opponent and his “would-have-been” opponent in the “raw” (extremely crude) pairings.

(a) (continued) The transposition score of a pairing is the smaller of the transposition scores of the two players. This is in keeping with 29E3c, Evaluating transpositions.

(b1) This must be 80 points to be consistent with 29E3a, the 80-point rule.

(b2) This must be 200 points to be consistent with 29E3b, the 200-point rule.

(b3) This must be large enough to be larger than any rating difference likely to be encountered, to be consistent with 29E3f, Colors in a series, which states that score groups should not be broken apart merely because of color problems, no matter how severe. Very Bad Colors means three in a row the same color, or assigning the wrong color to a player who is already out of balance by two or more, such as giving W to a player with WWBW.

(c) This must be larger than (b3) to ensure that score outranks color as a pairing criterion. Further, the penalty for a full-point score switch must be more than twice as great as for a half-point score switch. Otherwise, a TD (or a program using a point-count algorithm) might be tempted to pair a 4-0 against a 3-1, leaving the 3.5 group intact, rather than pairing a 4-0 against a 3.5, and a 3.5 against a 3-1.

(d) This must be huge, in order to massively discourage pairing players against the same opponents twice.


Bill Smythe

My head hurts!!! :laughing:

Aspirin will help! :bulb:

Only if Bill holds the aspirin firmly between his palms, so he can’t type. :slight_smile:

Mike Nolan

Well, guys, instead of holding aspirin between your knees, I was hoping you’d mention a pairing situation you’ve been in, either as a player or as a TD, where you were wondering whether a different set of pairings would have been better. Then we can run it through the point-count analysis. (I’ll do that last part myself, if you don’t want to.)

Anyone?

Bill Smythe

Additional considerations:

Eventually it is better to pair two opponents a second time rather than going far down multiple score groups (I’ve done it in 30-player 10-round speed tournaments). Thus a 2.0 difference might be 40,000 points and a 2.5 difference might be 85,000 points. The farther you are into the tournament the more acceptible re-pairing two players might be, so you may have an sliding penalty depending on how many rounds have been played (possibly the penalty you listed with a 10% reduction to the 50,000 points for each round after four rounds after you last paired them together).

For individual/team tournaments, there are team considerations as well, which should be large for pairing teammates, but less than pairing out of the score group (or possibly more than 0.5 points out of the score group). 2000 or 7000 points might be a good number.

In the final round, color alternation can optionally be minimized (though three in a row or a wildly lopsided distribution would still be very bad colors). This could become useful in a tournament with an odd number of rounds.

This one is very questionable, but is included in the discussion because sometimes it does a person good just to be able to vent even if there is no plan to actually take the action mentioned:
If there are players who are chronic complainers about justifiable pairings, they can get a 0.001% adjustment to any point count involving them (only coming into play when there are multiple possibilities with the same total point count). It can be an increase for lower rated players (decrease otherwise) if you want to reduce the number of times they complain, or it can be a decrease for lower rated player (increase otherwise) if they have been annoying you so much that you feel they should finally have a reason to complain. You could even announce in advance that unjustified complainers have an automatic decrease/increase (more for the more vocal) and are thus more likely to have adverse switches involving them when there are multiple equally legal possibilities.

Love it! Love it! :laughing: :smiling_imp: :laughing: :smiling_imp: :laughing: :smiling_imp: :laughing: Heh heh.

Bill Smythe

Bill,

I found extremely interesting what you have posted. The method is very clear and maybe can be codified easily, at least if one has a general purpose algorithm for maximum weighted matching. This algorithm are useful to solve similar problems in graph theory. Snjólfur Ólafsson [ Weighted matchings in chess tournaments, J. Opl. Res. Soc. 41, 1 (1990), 17-24 ] claims to have been able to reproduce the swiss FIDE pairing (Lim variant). This method is so flexible that can generate each swiss pairing just weighting in an appropriate manner each pair or as you say, assigning them an “undesidarable score” (u.s.).

From point of view of the u.s. the FIDE swiss pairing has some disadvantage because has no clear rule as USCF for transposition (80 point rule, 200 point rule ecc…). So the u.s. for FIDE swiss must be determined by mean of trial ed error as Olafsson did in his paper.

You have determined the u.s. in several situations. Please let me ask you some more questions (if they appear silly please consider that I’m not an expert of USCF system).

  1. have you already written a code that realize your method? (I would like to know if the u.s. you have choosen have shown their effectiveness in real situation, in particular I’m referring to (b3) (c) and (d) )

  2. You speak about transposition. What about interchanges? Are they automatically included in your list?

  3. If I have understood you treat the pairing as a whole and not scoregroup by scoregroup. This could produce a pairing that a TD hardly can understand although it can be better than what he can prepare. This could even produce some perplexity in some purist that would like a pairing in principle close to the rule: group by group, top half vs bottom half and then color improvement. Any way, it seems to me (but perhaps I overlook something) that treating the pairing as a whole make the problem more difficult because of the lack of a fixed “raw opponent” for each player in presence of an odd man (odd group). If the raw opponent is not fixed once for all we loose a valid reference point to which refer the transposition. Moreover it is important even his position with respect the coloumn to evaluate correctly the interchange. See question 3.

  4. If the group is odd one player should drop. How your method work in this case? In other words, which is the raw opponent of the uneven player to whom refer the u.s. during transposition?

I like very much your method and I would like to try to code it. So far I’ve written a code that tries to simulate what the TD would do, but I think the future belong to this mathematical method in that the solution it found cannot be improved by nobody even by the most “chronic complainers”.

Thank you very much for your attention,
Luigi Forlano

No, but I’d bet dollars to donuts (which, come to think of it, is only about an even bet these days) that the WinTD pairing program already does something essentially along these lines. Probably SwisSys too.

Somewhere in these forums there have been discussions about real-world situations, where the TD overruled the pairings made by the programs. Usually, in such situations, the pairings devised by the TDs were (in my opinion) inferior to those produced by the programs. That’s why I started this thread in the first place.

This is one area that would need further attention. (So far, my point-count suggestions should be viewed as a conversation-starter, not as a complete package.) In the case of interchanges, one should ask, for example, should there be an additional undesirability score attached to an interchange, or should it just be the same as the score for a transposition of the same size?

That’s true – and it’s why TDs should be wary of tinkering with pairings produced by a program.

This one is easy. The raw (extremely crude) pairings are produced by a simple algorithm. Just pair top half vs bottom half, ignoring colors, and ignoring opponents who have already played each other. If there is an odd number, just pair the lowest in one group against the highest in the next group, again ignoring colors and already-played opponents.

Please don’t try to program any of this just yet – it’s a work in progress. Furthermore, you’d just be re-inventing the WinTD wheel. I mainly intended point-count pairings to be a way of resolving “which is the better pairing” questions when they come up in this forum.

Bill Smythe

OK, I apologize for the fact that this thread is covered with so much dust that it’s likely to cause an outburst of sneezing.

I found this ancient thread searching the forum for something else that linked to this thread. I thought it was quite interesting. I’m just wondering, Bill, if you ever did anything more with it?

That’s basically how WinTD does the pairings, except that it’s way–way–way more complicated. (If one’s head hurts by reading what Bill wrote, the guts of the WinTD scoring function would cause one’s head to fall off). The major difficulty with handling pairings by (approximately) minimizing an undesirability function is that we have, in effect, a set of lexicographic preferences: don’t dupe pairings (unless necessary), don’t split score groups (unless necessary), don’t pair teammates (unless necessary), etc. Mathematically, there’s no way to represent those with a simple function that always works. In order for an undesirability function to never dupe a pairing if unavoidable, the cost of a duplicated pairing must be high enough relative to the cost of pairing out-of-score-group that any collection of the latter will always be smaller. To make sure that happens, the relative costs adapt to the size of the overall tournament and the size of the largest score group. It took a lot of tweaking to get the pairing algorithm to where it is now.

Thank you, Ken and Tom, for reviving this thread. I was surprised to see it pop up among currently active threads.

I never really carried it further (obviously, Tom Doan goes much further in WinTD) but at one point I was going to suggest varying, for example, the 80-point limit depending on how bad the non-alternation is. Perhaps WBWB vs WBWB could indeed be 80 points, while WBWB vs BWWB might be only 50. (In this case the second player is “almost” due black anyway.)

In fact, a table could no doubt be constructed that would assign an undesirability score to every possible pair of color sequences, in such a way that no two different pairs of color sequences would have identical undesirability scores. Then we’d have an algorithm!

Bill Smythe