Isn't the general method of prevention just to train a bigger model for longer, so that all these niche edge case exploits get found and addressed during policy exploration?
If there's an exploit that's sufficiently rare and unpredictable, then that seems like the only way (and indeed it should be a sufficient way, if done right) to address it.
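To make the intuition concrete, here is a toy, purely illustrative sketch (none of these names come from any real system): the policy is a lookup table over discrete states, "training longer" means visiting more states, and an "exploit" is any state where the policy hasn't learned the right response. The point is just that exhaustive exploration during training is what starves the adversary of findable edge cases.

```python
import random

# Hypothetical toy world: 1000 discrete states, each with a known best action.
STATES = list(range(1000))
BEST_ACTION = {s: s % 3 for s in STATES}  # stand-in for ground truth

def train_policy(num_rollouts, rng):
    """More rollouts -> more states visited -> fewer unpatched edge cases."""
    policy = {}
    for _ in range(num_rollouts):
        s = rng.choice(STATES)       # exploration lands on a random state
        policy[s] = BEST_ACTION[s]   # training fixes behaviour at visited states
    return policy

def find_exploit(policy, probe_budget, rng):
    """Adversary probes states, hunting for one the policy mishandles."""
    for _ in range(probe_budget):
        s = rng.choice(STATES)
        if policy.get(s) != BEST_ACTION[s]:
            return s                 # an edge case the bot can't handle
    return None                      # no exploit found within budget

rng = random.Random(0)
small = train_policy(500, rng)      # under-trained: most states never visited
big = train_policy(20_000, rng)     # "bigger, longer": near-exhaustive coverage
```

With 500 rollouts over 1000 states, roughly 60% of states are never visited, so a modest probe budget finds an exploit almost surely; with 20,000 rollouts, expected coverage is about 1 - e^-20, and the same adversary comes up empty. Nothing here involves common sense, only coverage.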
It worked in adversarial board games (Go, chess) and in poker: we now have effectively unexploitable bots for these games.
It hasn't worked yet in StarCraft because both the strategy space and the action space are vastly larger. The networks are too small relative to that space, and humans can still put the bot into situations it can't handle.
My guess is that StarCraft will end up like these other games once hardware (and the rest of the stack) advances another 5-10 years, and we'll have an effectively unexploitable bot there too. The main reason I think so is that, unlike with self-driving, self-play gives us unlimited training data, so we can make the models arbitrarily good.
The bot still won't have an ounce of common sense beyond what it's trained to do. It's just that it will have been so exhaustively exposed to every nook of the search space that a human won't be able to find any exploits.