Cheating in Online Chess

Humans are notoriously unreliable at detecting cheating in chess from the moves alone. This presents a significant problem when games are played online and event organisers are responsible for finding the games that have been played unfairly. While many possible approaches have been suggested, event organisers must follow a rigorous method that produces reliable results. Dr Kenneth Regan is a world-leading expert in this field, having published a number of papers on the subject, served on the FIDE Anti-Cheating Committee, and been called as an expert witness in several international cases (Regan, 2020). It is Dr Regan's methodology that the ECF, like many governing bodies, uses. This article will attempt to explain how this approach works without delving too deep into the complexities of the process.


First we amass a very large collection of chess games, on the order of tens or hundreds of millions, played in over-the-board events that we can presume to have a very low occurrence of computer assistance. These games are played by players across the spectrum of possible ratings. We then run lengthy computer analysis on each move of each game where a player would reasonably have to think for themselves, discounting the first n moves (typically n is 8, or discounting any theoretically understood opening moves). We look at how frequently a move matches the computer's first preference, and where it differs, we note the difference in computer evaluation between the top choice and the move played (Biswas, Haworth and Regan, 2015), typically measured in centipawns (hundredths of a pawn's value). Using these two data points, we can build a model of how accurately a player of a given rating is likely to play and the distribution we can expect of occurrences of more or less accurate play.
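As a rough illustration of these two measurements, here is a minimal sketch in Python. The data format and function names are invented for this example; Dr Regan's actual implementation uses considerably more sophisticated modelling.

    # Illustrative sketch of the two per-game measurements described above.
    # Each analysed move is a pair (engine's evaluation of its top choice,
    # evaluation of the move actually played), in centipawns from the
    # player's perspective. Names and data format are invented.

    def game_statistics(moves, skip_first=8):
        """Return (engine-match rate, average centipawn loss) for one game."""
        scored = moves[skip_first:]  # discount the opening moves
        matches = 0
        total_loss = 0
        for best_eval, played_eval in scored:
            loss = best_eval - played_eval  # 0 when the top choice was played
            if loss == 0:
                matches += 1  # simplification: equal evaluation counts as a match
            total_loss += loss
        return matches / len(scored), total_loss / len(scored)

    # A toy game: after the 8 discounted moves, 3 of 5 scored moves match.
    example = [(0, 0)] * 8 + [(30, 30), (25, 10), (40, 40), (55, 35), (20, 20)]
    rate, acpl = game_statistics(example)
    print(f"match rate {rate:.0%}, average loss {acpl:.0f} centipawns")
    # match rate 60%, average loss 7 centipawns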


When this data is collected from our set of games, we can see that for any given rating, a player's performance falls into a normal distribution, with some games falling below and others above the expected engine-matching performance:

Fig 1. An example of a typical normal distribution.


With this data in hand, we can determine statistically significant probabilities that a player would perform at a given level above or below their rating (DiFatta, Haworth and Regan, 2009; Regan, Macieja and Haworth, 2011). The value used is referred to as the Z-value, shown on the x-axis in Figure 1, which acts like a standard deviation across the curve. In normal over-the-board conditions, a player can expect their Z-value to sit between -1 and 1 in just over two thirds of their play, between -2 and 2 just over 95% of the time, and between -3 and 3 in 99.7% of all games played. An excellent online resource that delves deeper into Z-values and lets you calculate the odds of specific Z-value probabilities can be accessed here.
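These percentages are the standard 68-95-99.7 rule for a normal distribution, and can be verified directly. A short sketch using SciPy's standard normal distribution (any statistics library providing the normal CDF would give the same figures):

    from scipy.stats import norm

    # P(-z < Z < z) for a standard normal distribution
    for z in (1, 2, 3):
        p = norm.cdf(z) - norm.cdf(-z)
        print(f"P(|Z| < {z}) = {p:.4f}")

    # P(|Z| < 1) = 0.6827   (just over two thirds)
    # P(|Z| < 2) = 0.9545   (just over 95%)
    # P(|Z| < 3) = 0.9973   (99.7%)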


When a player plays in an online event, the same analysis is performed on their games, resulting in a Z-value for that event. Typically, as in cheat detection at over-the-board events, the cut-off for suspect play is when a player's performance produces a Z-value greater than 4 (highlighted in Figure 2), which, with honest play, we would expect to occur less than once in 30,000 events.

Fig 2. The area in red represents a Z-value greater than 4, equating to 0.0032%.
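The once-in-30,000 figure follows from the upper tail of the same standard normal distribution, and can be checked the same way:

    from scipy.stats import norm

    p = norm.sf(4)  # upper-tail probability P(Z > 4)
    print(f"P(Z > 4) = {p:.6f}")               # 0.000032, i.e. about 0.0032%
    print(f"about one in {1 / p:,.0f} events")  # i.e. less often than once in 30,000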


Where possible, this analysis is run several times during an event using the accumulated games, and again at the end of the event with all games. The decision of when these analyses are performed is at the discretion of the tournament organiser. Due to the nature of these tests and this method, it is extremely unlikely that you will ever be flagged as having used computer assistance in an event as the result of one particularly accurate game.


Frequently Asked Questions

Cheating Concerns

How does this method of detection work when a player is getting advice from another player?


The player will not be caught with this method of detection, unless the assisting player happens to be using a chess engine themselves. This is still a form of cheating and cannot be condoned in typical online chess events.


How does this method of detection work when a player is getting advice from a book or database?


The player will not be caught with this method of detection, unless the book were to contain extensive novel computer analysis that both players happen to follow for a significant, and unrealistic, majority of the game. This is still a form of cheating and cannot be condoned in typical online chess events.


What if I memorised some opening preparation made with engine assistance and use that in an online game?


As with over-the-board chess, computer-assisted preparation is allowed. This method will not flag a player as cheating for such preparation. The game of chess has a typical branching factor of about 35 (Burmeister and Wiles, 1995), which means that for any given position there will be around 35 legal moves available to the player. Thus after two moves by both players a typical game could reach around one and a half million positions, nearly two billion after three, and over two trillion after four. Even factoring in that only a fraction of legal moves will be considered by a competent player, the rate of growth is still exponential, and honest preparation will not consistently last long enough after a known line to give a statistically significant result.
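The growth quoted above is easy to verify, assuming a flat branching factor of 35 per ply (a single move by one player), which is of course a simplification:

    BRANCHING = 35  # typical number of legal moves in a chess position

    for full_moves in (2, 3, 4):
        plies = 2 * full_moves  # "a move by both players" is two plies
        positions = BRANCHING ** plies
        print(f"after {full_moves} moves each: ~{positions:,} positions")

    # after 2 moves each: ~1,500,625 positions          (about 1.5 million)
    # after 3 moves each: ~1,838,265,625 positions      (nearly 2 billion)
    # after 4 moves each: ~2,251,875,390,625 positions  (over 2 trillion)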


How does this method of detection handle honest improvement in a player?


When working with Elo ratings, this method takes into consideration the rating before and after the event, as well as the tournament performance rating (TPR). With ECF grades changing more slowly, this could be seen as a concern, as the grade may not change as rapidly. Fortunately, the modelling of player performances is not significantly affected by the kind of change in play caused by honest improvement. While a higher-rated player may have a distribution that shows a higher correlation with computer play than a lower-rated player overall, the increase is only a fraction of the increase in playing strength, and the overlap between the distributions is significant. That is, a higher-rated player being held to the standards of a lower-rated player's distribution would not produce a Z-value that would trigger accusations of cheating. Consider a typical club player swapping their online account with a much stronger player for a period of time: the account would not be banned if the stronger player is playing honestly.
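To illustrate, with purely invented numbers (none of these values come from the cited papers): even when a much stronger player's honest engine-match rate is measured against a weaker player's distribution, the resulting Z-value remains far below the cut-off of 4.

    # All numbers below are invented for illustration only.
    weaker_mean = 0.54    # expected engine-match rate at the lower rating
    weaker_sd = 0.04      # spread of event-level match rates at that rating
    stronger_mean = 0.58  # a much stronger player's honest match rate

    z = (stronger_mean - weaker_mean) / weaker_sd
    print(f"Z = {z:.1f}")  # Z = 1.0, well within normal variation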


I've just analysed one of my games and it appears to be unusually accurate. Am I going to be accused of being a cheat?


Some games will naturally be closer to engine play, particularly shorter games with more forcing moves, or endgames with known best-case solutions. This is to be expected from the detection method's accuracy distribution. Equally, you are likely to have the occasional game that is exceptionally inaccurate. You are unlikely to be flagged as a cheat as the result of one game.


I have been banned from an event, what can I do?


Most events will entertain appeals, but the outlook is not good. The boundaries for dishonest play are set very high, and it is extremely unlikely that honest play will reach that level. If a mistake has been made, you should appeal to the event organiser or ask your team captain to do so on your behalf. The ECF has set out restrictions for players caught cheating, and it is most probable that you will have to adhere to those before returning to online chess.

Other Approaches

Why don't we just insist that all games be played with video conferencing software to help detect cheats?


Aside from the problem that junior chess in particular would fall foul of the guidance from the UK Council for Internet Safety (UKCIS), there is also no guarantee that the use of a webcam will prevent cheating. At present the verdict seems to be that, outside of a small number of elite events, the inclusion of video conferencing software creates significantly more problems than it solves in cheat detection.

Website Specific

This section covers questions pertaining to the online platforms used by online chess events, e.g. chess.com, chess24, lichess, etc.


Is Dr Regan's method how online platforms do their cheat detection?


Most online platforms keep their particular cheat detection methods private, but they are likely to be similar to the approach described above. While it may seem unfair, the more that is known about a particular site's methods, the easier it becomes for cheats to operate around those restrictions.


Why do online events specify that games be played as rated on the hosting platform?


Automated cheat detection protocols on a hosting platform are typically only applied when games are played as rated. Ensuring that a game is rated provides an extra level of fair play protection for participants. Most online events will specify that if a game challenge is sent as something other than rated, the opponent should reject it, send a message to explain that the game should be rated, and if necessary, send a correct challenge of their own.


My account on a hosting platform has been closed or flagged for using computer assistance, what can I do?


Most hosting platforms offer an appeal service, and while it is very unlikely, it is possible that your appeal will see the decision overturned. Most event organisers and clubs are happy to reserve judgement until after the resolution of that appeal. Events may have to progress with the decision as it stands prior to the appeal's resolution due to time restrictions.


My opponent's account on a hosting platform has been closed, does this mean that they were cheating?


While it is possible that this is the case, most hosting platforms are required to offer users the option to close their account, so an account closure does not necessarily mean it was for cheating. Speak with your team captain or event organiser if you need to know, though they may not be able to disclose that information.


Citations

T. Biswas, G. Haworth and K. Regan, "A Comparative Review of Skill Assessment: Performance, Prediction and Profiling", in the proceedings of the 14th Advances in Computer Games conference, Leiden, Netherlands, July 2015.


J. Burmeister and J. Wiles, "The Challenge of Go as a Domain for AI Research: A Comparison Between Go and Chess", in the proceedings of the Third Australian and New Zealand Conference on Intelligent Information Systems (ANZIIS-95), November 1995, IEEE, pp. 181-186.


G. DiFatta, G. Haworth and K. Regan, "Skill Rating by Bayesian Inference", in the proceedings of the 2009 IEEE Symposium on Computational Intelligence and Data Mining (CIDM'09), Nashville, TN, March 30--April 2, 2009, IEEE, pp. 89--94.


K. Regan, B. Macieja and G. Haworth, "Understanding Distributions of Chess Performances", in the proceedings of the 13th ICGA conference on Advances in Computer Games, Tilburg, Netherlands, November 2011.


K. Regan, T. Biswas and J. Zhou, "Human and Computer Preferences at Chess", in the proceedings of the 8th Multidisciplinary Workshop on Advances in Preference Handling (MPREFS 2014), associated to AAAI 2014, Quebec City, July 2014.


University at Buffalo, "Kenneth Regan: Faculty Expert in Chess", 2020. [Online]. Available at: http://www.buffalo.edu/news/experts/ken-regan-faculty-expert-chess.html [Accessed 1 July 2020].


K. Regan, "Open Letter from Kenneth W. Regan on cheating at chess", Association of Chess Professionals (ACP), 2020. [Online]. Available at: https://www.chessprofessionals.org/content/open-letter-kenneth-w-regan-cheating-chess [Accessed 1 July 2020].