In my last post, I described the new model. Now we will look at more results, though this time I will try to stay away from all the technical jargon.
If you perform better than your expected score, then your rating goes up. But is the expected score formula accurate? The model can tell us. Here is the graph for a 2650 player facing opponents from 2600 to 2900. When a 2650 plays another 2650, the expected score is 0.5. Not surprising: when you play someone with the same rating, you have equal chances. The model and Elo's formula agree on that. But then the two lines diverge. On the far left, we have a 2650 playing a 2600. Elo's formula ("theory") says that the 2650's expected score is about 0.57. But in the data, the 2650 scores slightly worse, roughly 0.55. This means that in real life, a 2650 will tend to lose rating points when playing weaker players. On the other side of the graph, the pattern reverses. When facing stronger opponents, 2650s perform better than their rating predicts and gain points.
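For readers who want to check the "theory" numbers, here is a minimal sketch of the logistic form of Elo's expected score, plus the per-game rating change that follows from it. The function names are my own, and the K-factor of 10 is an assumption (it is the value FIDE uses for established top players). FIDE's official conversion table may differ slightly from the pure logistic formula, so treat the output as approximate.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Logistic form of Elo's expected score for `rating` against `opponent`."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def rating_change(actual: float, expected: float, k: float = 10.0) -> float:
    """Per-game rating change; k=10 is an assumed K-factor for elite players."""
    return k * (actual - expected)


theory = expected_score(2650, 2600)   # what the formula predicts
observed = 0.55                       # rough average score quoted in the post

print(round(theory, 3))                            # 0.571
print(round(rating_change(observed, theory), 2))   # -0.21
```

Under these assumptions, a 2650 who scores 0.55 on average against 2600s gives up roughly a fifth of a rating point per game.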

The next graph is for a 2700. It is a similar story. 2700s underperform when facing weaker opponents. However, they do better than expected against stronger players.
It's also true for 2750s:
What does this mean for the forecasts? The new model is based on the data, so it gives more weight to the underdogs. The old model was based on Elo's formula, so it overestimated the favorites. We can see this when we revisit the Carlsen-Nepo match. Carlsen was the higher-rated player, so the original model gave him better chances than the new one does.
Old model
Carlsen wins the match: 82.86%
Nepo wins the match: 9.115%
Drawn match: 8.025%
New model
Carlsen wins the match: 75.535%
Nepo wins the match: 13.56%
Drawn match: 10.905%
A word of caution about interpreting the results: the model was based on games with at least one 2700 player. Will a 2000-rated amateur gain points if they play up a section? I don't know. The model was designed for elite tournaments, not amateurs.
I don't have a clear-cut answer to the question in the title. Are 2700s overrated? Yes, but only when playing weaker opponents. When playing stronger opponents, they are underrated. Should FIDE fix this problem by adjusting the expected score formula? It might be more complicated than that, because changing the formula would feed back into the ratings themselves. There are many rating systems that are superior to Elo; it would be better to switch to one of them instead.