A new method of training AI systems could make them safer from hackers
New York, July 11, 2020
One of the greatest unresolved shortcomings of deep learning is its vulnerability to adversarial attacks. When seemingly random or imperceptible perturbations are added to the input of an AI system, things can go completely wrong. For example, stickers strategically placed on a stop sign can trick a self-driving car into reading it as a 45-mile-per-hour speed-limit sign, while stickers on a road can trick a Tesla into swerving into the wrong lane.
Most adversarial-attack research focuses on image recognition systems, but deep-learning-based image reconstruction systems are also vulnerable. This is particularly worrying in healthcare, where such systems are often used to reconstruct medical images, such as CT or MRI scans, from raw measurement data. A targeted adversarial attack could cause such a system to reconstruct a tumor in a scan where none is present.
Bo Li (named to this year's Innovators Under 35 list by MIT Technology Review) and her colleagues at the University of Illinois at Urbana-Champaign now propose a new method of training such deep-learning systems to be more robust, and thus more trustworthy, in safety-critical scenarios. They pit the neural network responsible for image reconstruction against a second neural network responsible for generating adversarial examples, in a manner similar to GAN algorithms. Over iterative rounds, the adversarial network tries to trick the reconstruction network into producing things that are not part of the original data, or ground truth. The reconstruction network continually adapts to avoid being fooled, making it safer to deploy in the real world.
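The adversarial training loop described above can be sketched in a few lines. The toy below is a minimal illustration under stated assumptions, not the researchers' actual architecture: the "reconstructor" is just a linear map learned from simulated measurements, and the "attacker" is a single gradient-sign perturbation of the input that tries to maximize reconstruction error. Everything here (the linear model, the FGSM-style attack, the parameter names) is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated measurement process: y = A x. The reconstructor learns W so that
# x_hat = W y approximates the ground-truth signal x.
n, m = 8, 16
A = rng.normal(size=(m, n))
W = rng.normal(size=(n, m)) * 0.01

def reconstruct(W, y):
    return W @ y

def attack(W, y, x, eps=0.1):
    """FGSM-style attacker (an assumption, not the paper's method):
    one gradient-sign step on the input y to increase reconstruction error."""
    grad_y = W.T @ (W @ y - x)          # d(0.5 * ||W y - x||^2) / d y
    return y + eps * np.sign(grad_y)

lr = 0.01
for step in range(500):
    x = rng.normal(size=n)              # ground-truth signal
    y = A @ x                           # clean measurement
    y_adv = attack(W, y, x)             # attacker tries to fool the reconstructor
    grad_W = np.outer(W @ y_adv - x, y_adv)  # loss gradient on the adversarial input
    W -= lr * grad_W                    # reconstructor adapts to resist the attack

# After training, the reconstructor should handle clean inputs well.
x = rng.normal(size=n)
err = np.linalg.norm(reconstruct(W, A @ x) - x) / np.linalg.norm(x)
print(f"relative reconstruction error: {err:.3f}")
```

The key design point mirrors the article: instead of training only on clean measurements, every update is computed on an input the attacker has already perturbed, so the reconstructor learns to be accurate under worst-case interference rather than just on average.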
When the researchers tested their newly trained neural network on two popular image data sets, it reconstructed the ground truth better than other neural networks that had been made "robust" with different methods. The results are still not perfect, however, which shows the method needs further refinement. The work will be presented next week at the International Conference on Machine Learning.


