Junior Elo Ratings Article

I think computers have the potential to produce far more accurate ratings than Elo-based rating systems do.

Keep in mind that with both US Chess and FIDE ratings, all that the computer knows is that X played Y and the result was Z. Nothing about how or why that result occurred. Just the final result.
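
For reference, the standard Elo update really does use nothing beyond the two ratings and the result. A minimal sketch in Python (the K-factor of 32 is an assumption for illustration; US Chess and FIDE apply their own K rules and additional adjustments):

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Expected score for player A against player B under the Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))

def updated_rating(r_a: float, r_b: float, score_a: float, k: float = 32.0) -> float:
    """A's new rating after one game; score_a is 1 (win), 0.5 (draw), or 0 (loss)."""
    return r_a + k * (score_a - expected_score(r_a, r_b))

# Example: a 1651-rated player beats an 1800-rated player.
print(round(updated_rating(1651, 1800, 1.0)))  # 1673 with K = 32
```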

Computers have the potential to analyze every move a player makes and come up with a more accurate, probably multi-dimensional, assessment of the player’s skills. Opening book knowledge, middle game tactical ability, middle game strategic ability, and endgame knowledge: those are four skills that could be measured, largely independently of each other, and there may be others.
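
Purely as an illustration of what such a multi-dimensional assessment might look like, here is a hypothetical sketch; the dimension names and the per-dimension adjustments are my own assumptions, not any existing system:

```python
from dataclasses import dataclass

@dataclass
class SkillProfile:
    """Hypothetical per-skill ratings in place of a single overall number."""
    opening: float = 1500.0
    middlegame_tactics: float = 1500.0
    middlegame_strategy: float = 1500.0
    endgame: float = 1500.0

    def nudge(self, dimension: str, delta: float) -> None:
        """Adjust one dimension based on move-by-move analysis of that phase."""
        setattr(self, dimension, getattr(self, dimension) + delta)

profile = SkillProfile()
profile.nudge("endgame", -12.0)             # engine analysis flags endgame errors
profile.nudge("middlegame_tactics", +8.0)   # tactics held up well this game
print(profile)
```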

Back when I was actively playing, my peak rating was 1651. However, I had far too many games where I had a winnable middle game against A players and Experts, but couldn’t handle transforming that advantage into a winnable endgame and then actually winning that endgame. Hence the old chess saw that the hardest thing to do is to win a won game. A former Nebraska player, who went on to become a Master, once called me ‘the most dangerous C player in Nebraska’.

“If only…”. Woulda, coulda, shoulda… all sorts of excuses and what might have beens are possible with regard to individual games. I really doubt that computers will be able to handle the multivariate ways a game could have gone if only the stars aligned properly. What a waste of computer time, too.

The original Elo work was designed to be only an approximation of performance. To expect perfection from the rating system, and to expect it to truly measure a player’s “strength,” is a big ask given how variable human performance is in a variety of sports and games on a day-to-day, even game-to-game, basis. I am satisfied with the approximation. I know that I can do better tomorrow, even if body, mind, horoscope, age, biorhythms, and the flutter of the wings of a butterfly in the Amazon may work against me.

Don’t forget hormones, blood sugar, enzymes and probably a dozen other factors that influence your mind and body.

(I’ve often thought that blood sugar readings done just before the round might produce better pairings in beginner youth events than ratings from several months ago.)

The value of a multi-dimensional computerized assessment of your chess skills is probably more in how it helps you plan what to study, just like medical lab tests can help your doctor plan your medical care.

I did not forget those other elements that affect mind and body. Sleep is also an important factor, one the sleepless silicon monsters do not have to worry about. If I have my Dunkin or Panera coffee before a round, things generally go pretty well. Some dark chocolate and a banana are also useful and unlikely to set off WADA alarms. I am less optimistic about the ability of a multidimensional computer assessment to determine a player’s strength or to help with planning study. The complexity of that is beyond the capacity of today’s machines. But it is not beyond the capability of humans. New research has just begun to scratch the surface of what we are truly capable of doing, both when calm and under stress.

Chess has been used as a model to examine and test general decision-making, political decision-making, and economic decision-making (see Herbert Simon and “satisficing”), as well as for the design of the tree-like thinking used in programming. It is clear that cross-disciplinary influences have affected the way we now look at chess. Computers have led us to think in more streamlined ways, at least where our cognitive capacities tend that way. Whether computers have enhanced our creativity is still an open question. We have more information to process, so it makes sense to use a device to help with that processing. Computers do not engage in the same type of pattern recognition that we do. Their evaluative systems depend on mathematics. Players depend on more than that to make finer distinctions of compensation and overall evaluation of positions.

I recall a conversation with some computer chess people over numbers and values. I have had similar discussions with kids over the value of a Queen. Books vary. Computer people will give you all sorts of numbers, like 9.74 or 9.200. I usually tell the kids the Queen is worth 10. They will argue, having read the books, that the Queen is worth 9. I usually pull out books by Lasker, Tarrasch, and Franklin K. Young and let them read what those authors posited as the value. Suitably confused, they next ask, “So, what is the real value?” I tell them that after reading the pages on the values of the pieces, they should forget about all that and just play chess to develop their own sense of the way to play. When I play, the Queen is often more a 10 than theirs is. Over time and with experience they will be able to determine the value of each piece in every position. Humans can do that. Computers still have trouble and make mistakes because they are not as flexible as we are in both remembering and forgetting what is important.

Computers are able to follow false assumptions and bad analysis with greater intensity and seeming purpose than we can. We sense something is wrong and can often pull back from the brink. A computer will send the rover over the cliff to a catastrophic end because all of the math and the sensors tell it to. We usually will walk up to the crevasse and say, “Hmm. There must be a better path than this one.” Finding combinations and strategic plans engages not only our thinking but also our emotions. There is a tendency to undervalue our emotions when we are thinking, but they are part of the evaluative process in pretty much everything we do. That may be one of our evolutionary edges, at least against non-sentient silicon beasties. An emotionless AI sets off chills at how stupid, ruthless, or indifferent it may act toward everything around it.

I agree, and I addressed this in relation to ChessBase covering tournaments and providing the percentage of good moves (in the engine’s view) made by each player. There are alternative ways to measure chess-playing strength other than win, loss, or draw.
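
A bare-bones version of that move-match metric, assuming the engine’s preferred move for each position has already been computed (the function and data layout here are illustrative, not ChessBase’s actual method):

```python
def move_match_percentage(played_moves: list[str], engine_moves: list[str]) -> float:
    """Percentage of a player's moves that coincide with the engine's first choice."""
    if not played_moves:
        return 0.0
    matches = sum(1 for played, best in zip(played_moves, engine_moves) if played == best)
    return 100.0 * matches / len(played_moves)

# Example: 28 of 40 moves match the engine's top choice.
print(move_match_percentage(["e4"] * 28 + ["a3"] * 12, ["e4"] * 40))  # 70.0
```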

We all know the actual value of a piece is contingent on the specific position. Given the right pawn structure, a knight can be worth more than a rook, a queen and knight can combine more effectively than a queen and bishop, and two bishops are generally stronger than a bishop and knight. We tend to regard this sort of thing as exceptional, but there seems no reason a computer couldn’t construct a much more exhaustive collection of positional characteristics (pawn structures, piece configurations, king locations, etc.) for assigning an expected value to a given piece even before engaging in concrete calculation. It may be that what we call the value of a piece is just a convenient heuristic given human limitations, analogous to what’s happened to older positional “laws” regarding knights-before-bishops or open files.
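
A crude sketch of that idea, where a piece’s nominal value is adjusted by positional features before any concrete calculation; the feature names and weights below are invented for illustration, not taken from any engine:

```python
# Hypothetical baseline values and positional adjustments, in pawns.
BASE_VALUES = {"P": 1.0, "N": 3.0, "B": 3.0, "R": 5.0, "Q": 9.0}

FEATURE_ADJUSTMENTS = {
    "knight_on_protected_outpost": ("N", +0.75),
    "bishop_blocked_by_own_pawns": ("B", -0.50),
    "rook_on_open_file": ("R", +0.50),
}

def contextual_value(piece: str, features: list[str]) -> float:
    """Expected value of a piece given positional features, before calculation."""
    value = BASE_VALUES[piece]
    for feature in features:
        target, delta = FEATURE_ADJUSTMENTS.get(feature, (None, 0.0))
        if target == piece:
            value += delta
    return value

# A knight on a protected outpost against a bad bishop:
print(contextual_value("N", ["knight_on_protected_outpost"]))  # 3.75
print(contextual_value("B", ["bishop_blocked_by_own_pawns"]))  # 2.5
```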

I think there is a need to quantify the value of the pieces, but only in the context of teaching chess to beginners. I mean, sorta like inside their first few lessons. Clearly many chess players continue to use piece values throughout their chess-playing lives, but it’s more to keep a mental scorecard on trading pieces in positions.

For example, one could look at a position that seems ripe for plundering but full of pitfalls, and mentally decide whether to go for a serious advantage by, hypothetically, trading the queen for, say, a rook and bishop or a rook and knight, or to keep playing conservatively and see if the position improves without going for a sudden attack.

Obviously I’m not talking about a specific tactic where giving up or trading down the queen can be calculated with certainty to win immediately, or at least to leave your opponent clearly worse, but rather about deciding on more strategic grounds, where you can only judge that not having the queen would give you a speculative advantage for offensive reasons.
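
On the scorecard arithmetic behind the earlier queen-for-rook-and-minor example, using the common 1/3/3/5/9 scale (which, as discussed above, is only a rough heuristic):

```python
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}

def trade_balance(given_up: list[str], received: list[str]) -> int:
    """Nominal material change for one side after a trade, in pawn units."""
    return sum(PIECE_VALUES[p] for p in received) - sum(PIECE_VALUES[p] for p in given_up)

print(trade_balance(["Q"], ["R", "B"]))  # -1: nominally a pawn down
print(trade_balance(["Q"], ["R", "N"]))  # -1: so the call rests on positional compensation
```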