Isn't the general method of prevention just to train a bigger model for longer, so that all these niche edge case exploits get found and addressed during policy exploration?
If there's an exploit that's sufficiently rare and unpredictable, then that seems like the only way (and indeed it should be a sufficient way, if done right) to address it.
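To make the intuition concrete, here is a toy, purely illustrative sketch (none of these names come from any real system): the policy is a lookup table over discrete states, "training longer" means visiting more states, and an "exploit" is any state where the policy hasn't learned the right response. The point is just that exhaustive exploration during training is what starves the adversary of findable edge cases.

```python
import random

# Hypothetical toy world: 1000 discrete states, each with a known best action.
STATES = list(range(1000))
BEST_ACTION = {s: s % 3 for s in STATES}  # stand-in for ground truth

def train_policy(num_rollouts, rng):
    """More rollouts -> more states visited -> fewer unpatched edge cases."""
    policy = {}
    for _ in range(num_rollouts):
        s = rng.choice(STATES)       # exploration lands on a random state
        policy[s] = BEST_ACTION[s]   # training fixes behaviour at visited states
    return policy

def find_exploit(policy, probe_budget, rng):
    """Adversary probes states, hunting for one the policy mishandles."""
    for _ in range(probe_budget):
        s = rng.choice(STATES)
        if policy.get(s) != BEST_ACTION[s]:
            return s                 # an edge case the bot can't handle
    return None                      # no exploit found within budget

rng = random.Random(0)
small = train_policy(500, rng)      # under-trained: most states never visited
big = train_policy(20_000, rng)     # "bigger, longer": near-exhaustive coverage
```

With 500 rollouts over 1000 states, roughly 60% of states are never visited, so a modest probe budget finds an exploit almost surely; with 20,000 rollouts, expected coverage is about 1 - e^-20, and the same adversary comes up empty. Nothing here involves common sense, only coverage.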
It worked in adversarial board games (Go, chess) and in poker: we now have effectively unexploitable bots for these games.
It hasn't worked yet in StarCraft because both the strategy space and the action space are vastly larger. The networks are too small relative to that space, and humans can still put the bot into situations it can't handle.
My guess is that StarCraft will end up like these other games once hardware (and the rest of the stack) advances another 5-10 years, and we'll have an effectively unexploitable bot there too. The main reason I think so is that, unlike with self-driving, self-play gives us unlimited training data, so we can make the models arbitrarily good.
The bot still won't have an ounce of common sense beyond what it's trained to do. It's just that it will have been so exhaustively exposed to every nook of the search space that a human won't be able to find any exploits.