Wednesday, November 24, 2021

Carlsen vs. Nepomniachtchi 2021 - and a new statistical model!

The match will be 14 games instead of 12. This is an improvement. In longer matches, there is a better chance that the stronger player wins (see my articles in ChessBase and Chess Life for more information). However, it is still too short. In "Reforming the Candidates Cycle," I ran simulations and recommended 24-game matches.
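To illustrate why match length matters, here is a minimal sketch that computes exact match-outcome probabilities for a given number of games. It is not the simulation from "Reforming the Candidates Cycle." It assumes every game is independent with the same win, draw, and loss probabilities, and it ignores colors and tiebreaks. The function name `match_outcome_probs` and the per-game numbers are hypothetical, chosen only for illustration.

```python
def match_outcome_probs(p_win, p_draw, n_games):
    """Return (P(A wins match), P(match tied), P(B wins match)).

    Assumes n_games independent games, each with the same
    win/draw/loss probabilities from player A's point of view.
    """
    p_loss = 1.0 - p_win - p_draw
    # dist maps (A's wins minus B's wins) -> probability so far
    dist = {0: 1.0}
    for _ in range(n_games):
        nxt = {}
        for diff, prob in dist.items():
            for step, p in ((1, p_win), (0, p_draw), (-1, p_loss)):
                nxt[diff + step] = nxt.get(diff + step, 0.0) + prob * p
        dist = nxt
    a_wins = sum(p for d, p in dist.items() if d > 0)
    tied = dist.get(0, 0.0)
    b_wins = sum(p for d, p in dist.items() if d < 0)
    return a_wins, tied, b_wins


# Hypothetical per-game probabilities for the stronger player (A)
for n in (12, 14, 24):
    a, tied, b = match_outcome_probs(0.25, 0.60, n)
    print(n, round(a, 3), round(tied, 3), round(b, 3))
```

Under these assumptions, the stronger player's chance of winning the match rises as games are added, while the chances of a tie or an upset shrink.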

Carlsen's rating is 73 points higher. This gives him a big edge. The forecast from the traditional model:

Carlsen wins: 82.86%

Nepomniachtchi wins: 9.115%

Match Drawn: 8.025%


However, the traditional model is due for an update, because it sometimes assumes a draw rate that is too low. To build it, I took games from my database and plugged them into a statistical model (see the methodology section for more details). Most of those games didn't come from elite tournaments, and a battle between 2400s doesn't tell us much about what will happen in the world championship. The model did correct for the ratings and acknowledged that stronger players draw more often, but the adjustment wasn't fully satisfactory. It expects the draw rate between Carlsen and Nepo to be just 50.1569%! At that rate, we could get more decisive games in this championship than in the last three combined!
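To see how a draw rate feeds into per-game probabilities, here is a minimal sketch. It assumes the "expected score formula" is the standard Elo logistic curve and that the expected score equals P(win) plus half of P(draw). It also ignores who has White. The function names are hypothetical; the 73-point gap and the 50.1569% draw rate are the figures quoted in this post.

```python
def elo_expected_score(rating_diff):
    """Standard Elo expected score for the higher-rated player."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


def win_draw_loss(rating_diff, draw_rate):
    """Split an expected score into (P(win), P(draw), P(loss)).

    Uses expected = P(win) + 0.5 * P(draw), so a higher draw rate
    takes equal probability away from both wins and losses.
    """
    expected = elo_expected_score(rating_diff)
    p_win = expected - draw_rate / 2.0
    p_loss = 1.0 - expected - draw_rate / 2.0
    if p_win < 0.0 or p_loss < 0.0:
        raise ValueError("draw rate is too high for this rating gap")
    return p_win, draw_rate, p_loss


# Rating gap and draw rate taken from the text; colors are ignored here
p_win, p_draw, p_loss = win_draw_loss(73, 0.501569)
print(p_win, p_draw, p_loss)
```

Because the expected score is fixed by the rating gap, raising the draw rate lowers both players' chances of winning an individual game by the same amount.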


Now for the paragraph that most readers will want to skip. I built a new model using games where both players were 2700+. Rapid, blitz, and online games were dropped. An ordered logit estimated the probabilities of a win, loss, or draw. The independent variables were a second-degree polynomial of white's rating, black's rating, and year (higher-order terms were insignificant). Then I compared the model's draw rate to the actual draw rate in a subsample where both players were 2750+. In this group of about 4000 games, the model expected 57.4% of them to be drawn; the actual draw rate was 58.4%. Not bad. Earlier I tried a larger database with both players 2600+ or 2500+, but in those samples, the predicted draw rate for 2750+ was too low. I tried various ways to put more weight on games with stronger players, but the predicted draw rate barely budged. The other main difference is that the traditional model uses the expected score formula to generate the probabilities of a win or loss, while the new model obtains these probabilities from the ordered logit.
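For readers who did not skip, here is a minimal sketch of how an ordered logit turns a single latent index into loss, draw, and win probabilities. It is not the fitted model. The function names, cutpoints, and index value are hypothetical, and the sketch takes the index as given rather than building it from the rating and year polynomial.

```python
import math


def logistic(x):
    """Logistic CDF."""
    return 1.0 / (1.0 + math.exp(-x))


def ordered_logit_probs(index, cut_low, cut_high):
    """Return (P(loss), P(draw), P(win)) from White's point of view.

    index is the latent score (a function of the regressors);
    cut_low < cut_high are the two estimated cutpoints.
    """
    if cut_low >= cut_high:
        raise ValueError("cut_low must be below cut_high")
    p_loss = logistic(cut_low - index)
    p_draw = logistic(cut_high - index) - p_loss
    p_win = 1.0 - logistic(cut_high - index)
    return p_loss, p_draw, p_win


# Hypothetical values, for illustration only
print(ordered_logit_probs(0.4, -1.2, 1.2))
```

The gap between the two cutpoints governs the draw rate: a wider gap leaves more probability in the middle category, and a higher index shifts probability from losses toward wins.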


At the end of the day, the new model's predictions are quite similar:

Carlsen wins: 81.34%

Nepomniachtchi wins: 9.395%

Match Drawn: 9.265%


I will be updating the new model's forecast after each game.
