Information scientists develop method to detect doping cases using AI
Searching for a needle in a haystack or tilting at windmills are both good metaphors for the challenge of detecting doped athletes. With thousands of athletes competing at major sporting events such as the Olympic Games, World Championships or in professional leagues such as football, it can take a laboratory weeks to analyse urine samples to determine whether any of those competing had taken performance-enhancing drugs. 'At the moment, the samples are all analysed manually,' said Wolfgang Maaß, Professor of Information Systems for the Service Industry at Saarland University and Scientific Director of the Smart Service Engineering research department at the German Research Center for Artificial Intelligence (DFKI).
Given the sheer number of athletes at major events such as the Olympic Games – there are around 10,500 in Paris – and how time-consuming current testing methods are, it's not hard to see that many cheaters are simply slipping through the net. Only a fraction of the urine samples can be analysed in the laboratory. As we know from the doping scandal at the 2014 Winter Olympics in Sochi, some athletes who cheat try to swap their own urine samples for 'clean' samples provided by someone else.
Until now, DNA analysis has been the only reliable method to identify whether samples have been swapped. 'But that's both expensive and time-consuming,' explained Wolfgang Maaß. It is simply not possible to analyse the DNA of every single sample. Maaß and colleagues from the DFKI, the German Sport University Cologne and the World Anti-Doping Agency (WADA) decided to pool their expertise to search for a simpler, more viable solution. 'This problem is practically crying out for machine analysis,' said Maaß.
To address the issue, they developed software that uses artificial intelligence to analyse urine sample data both quickly and cost-effectively. 'Doping tests measure the concentrations and ratios of various steroids, which are then checked for plausibility,' explained Wolfgang Maaß. This provides a biochemical fingerprint that the Saarbrücken AI software can use to reliably flag up any anomalies.
The machine-learning program only needs the data from three urine samples provided by each athlete over the course of their sporting career. As the natural steroid profile of one athlete may be very different to that of another, the program learns which concentrations of specific substances are typical for that particular athlete. For each sample, seven characteristics such as steroid concentrations and their ratios are determined in the biochemical laboratory. And rather like a child doing a spot-the-difference picture puzzle, the software searches for deviations from the usual pattern.
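The spot-the-difference idea can be illustrated with a deliberately simple sketch. This is not the researchers' SACNN model. It swaps in a plain per-feature z-score check to show the principle of comparing a new sample against an athlete's own baseline. The function names, the threshold of 4.0 and all measurement values are invented for illustration.

```python
from statistics import mean, stdev

# The article mentions seven characteristics per sample; everything else
# here (names, threshold, numbers) is a hypothetical illustration.
N_FEATURES = 7


def build_baseline(samples):
    """Return a (mean, standard deviation) pair per feature for one athlete."""
    if len(samples) < 3:
        raise ValueError("need at least three reference samples")
    columns = list(zip(*samples))  # regroup values feature by feature
    if len(columns) != N_FEATURES:
        raise ValueError("each sample must have seven features")
    return [(mean(col), stdev(col)) for col in columns]


def deviation_score(baseline, sample, min_spread=1e-6):
    """Largest absolute z-score of the sample across all features."""
    return max(
        abs(value - m) / max(s, min_spread)  # guard against zero spread
        for (m, s), value in zip(baseline, sample)
    )


def is_inconsistent(baseline, sample, threshold=4.0):
    """Flag a sample that deviates strongly from the athlete's usual pattern."""
    return deviation_score(baseline, sample) > threshold


# Invented reference profile: three earlier samples from the same athlete.
reference = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
    [1.1, 2.1, 3.1, 4.1, 5.1, 6.1, 7.1],
    [0.9, 1.9, 2.9, 3.9, 4.9, 5.9, 6.9],
]
baseline = build_baseline(reference)

matching = [1.05, 2.0, 2.95, 4.0, 5.05, 6.0, 7.0]  # close to the usual pattern
suspicious = [1.0, 2.0, 3.0, 8.0, 5.0, 6.0, 7.0]   # fourth feature far off

print(is_inconsistent(baseline, matching))
print(is_inconsistent(baseline, suspicious))
```

With only three reference samples, a standard deviation is a very rough estimate. That is one reason a learned model is a more plausible choice in practice than this hand-written rule.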
'If you compare the three or more "pictures" with the measurement data from the individual urine samples, the software will find the ones where everything matches,' said Wolfgang Maaß, explaining in simple terms how the computer program works. This leaves a residual number of samples where the 'pictures' do not match up, i.e. where inconsistencies have been detected. 'The small number of remaining cases can then be examined in more detail by biochemists in the laboratory using DNA analysis. If an athlete has taken a performance-enhancing substance and that substance can be detected in urine, then our software can help to identify that athlete with a high degree of certainty,' said Wolfgang Maaß.
Rather than detecting doping offenders directly, the software is designed to identify clean athletes with 99 percent confidence to ensure that innocent people are not wrongly accused. While this can mean that a small number of doping offenders go undetected, positive doping cases in which athletes have taken banned substances in order to go higher, farther or faster are identified with a very high degree of certainty. 'Anyone who has doped can almost certainly be found among these remaining cases, which can then be investigated in more detail using DNA testing,' explained Wolfgang Maaß.
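One possible reading of the 99 percent figure is that the decision threshold is set so that 99 percent of samples known to be clean are cleared, and only the remainder goes on to DNA testing. The sketch below illustrates that reading only. It is an assumption about how such a threshold could be calibrated, not a description of the published method, and the function names, anomaly scores and sample IDs are hypothetical.

```python
import math


def calibrate_threshold(clean_scores, clean_pass_rate=0.99):
    """Pick the smallest cut-off that clears the requested share of clean samples.

    clean_scores are anomaly scores of samples known to be clean
    (hypothetical input; higher means more unusual).
    """
    if not clean_scores:
        raise ValueError("need at least one clean score")
    if not 0 < clean_pass_rate <= 1:
        raise ValueError("clean_pass_rate must be in (0, 1]")
    ordered = sorted(clean_scores)
    # Number of clean samples that must fall at or below the cut-off.
    k = math.ceil(clean_pass_rate * len(ordered))
    return ordered[k - 1]


def triage(scores, threshold):
    """Split sample IDs into cleared samples and samples needing follow-up."""
    cleared = [sid for sid, score in scores.items() if score <= threshold]
    follow_up = [sid for sid, score in scores.items() if score > threshold]
    return cleared, follow_up


# Invented anomaly scores for 100 known-clean samples.
clean_scores = list(range(1, 101))
threshold = calibrate_threshold(clean_scores)

# Invented new samples: "A" looks ordinary, "B" is far outside the clean range.
cleared, follow_up = triage({"A": 3, "B": 150}, threshold)
print(cleared, follow_up)
```

Setting the cut-off this way trades a few missed cases for very few false alarms, which matches the article's point that the remaining samples are a small set handed over for DNA analysis.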
Original publication
Maxx Richard Rahman, Lotfy Abdel Khaliq, Thomas Piper, Hans Geyer, Tristan Equey, Norbert Baume, Reid Aikin, Wolfgang Maass. "SACNN: Self Attention-based Convolutional Neural Network for Fraudulent Behaviour Detection in Sports." International Joint Conference on Artificial Intelligence (IJCAI), 2024.