
I follow RL from the sidelines (I have dabbled with it myself), and have seen some of the cool videos the article also lists. I think one of the key points the article makes (and a bit of a personal nitpick of mine) is this:

> Thus far, every attempt at training a Trackmania-playing program has trained the program on one map at a time. As a result, no matter how well the network did on one track, it would have to be retrained - probably significantly retrained

This is a crucial aspect when talking about RL. Most of the Trackmania AI attempts focus on one track at a time, which is not really a problem, since the goal there is to outperform the best human racers on a given track.

However, it is this nuance that a lot of more business-oriented users don't get when being sold on some fancy new RL project. In the real world (think self-driving cars), we typically want agents that generalize far better.

Most of the RL techniques we have do rather well in these kinds of constrained environments (in a sense, they eventually start overfitting to the given environment), but making them behave well in more varied environments is way harder. A lot of beginner RL tutorials also fail to make this explicit, and will e.g. show how to train an agent to find the exit in a maze without ever trying it on a newly generated maze :).
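
To make the maze point concrete, here is a minimal sketch of what such a tutorial usually leaves out. Everything in it is made up for illustration: the two 3x3 mazes, the function names, and the hyperparameters. It uses plain tabular Q-learning with a uniformly random behaviour policy (which is fine, since Q-learning is off-policy), keyed on grid position, and then runs the greedy policy on both the training maze and a second maze it never saw.

```python
import random
from collections import defaultdict

# Two hypothetical 3x3 mazes: S = start, G = exit, # = wall, . = free cell.
TRAIN_MAZE = ["S..",
              "##.",
              "G.."]
NEW_MAZE = ["S#.",
            ".#.",
            "G.."]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def find(maze, symbol):
    """Return the (row, col) of the first cell containing `symbol`."""
    for r, row in enumerate(maze):
        c = row.find(symbol)
        if c != -1:
            return (r, c)
    raise ValueError(f"{symbol!r} not found in maze")


def step(maze, state, action):
    """Move one cell; hitting a wall or the border leaves the agent in place."""
    r, c = state[0] + action[0], state[1] + action[1]
    if 0 <= r < len(maze) and 0 <= c < len(maze[0]) and maze[r][c] != "#":
        state = (r, c)
    done = maze[state[0]][state[1]] == "G"
    return state, (1.0 if done else 0.0), done


def train(maze, episodes=500, max_steps=200, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning; the table is keyed on position only."""
    rng = random.Random(seed)
    q = defaultdict(lambda: [0.0] * len(ACTIONS))
    start = find(maze, "S")
    for _ in range(episodes):
        state = start
        for _ in range(max_steps):
            a = rng.randrange(len(ACTIONS))  # uniformly random exploration
            nxt, reward, done = step(maze, state, ACTIONS[a])
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
    return q


def solves(maze, q, max_steps=50):
    """Follow the greedy policy and report whether it reaches the exit."""
    state = find(maze, "S")
    for _ in range(max_steps):
        a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        state, _, done = step(maze, state, ACTIONS[a])
        if done:
            return True
    return False


q_table = train(TRAIN_MAZE)
print("training maze solved:", solves(TRAIN_MAZE, q_table))
print("new maze solved:", solves(NEW_MAZE, q_table))
```

With these two layouts the agent should solve the training maze and then get stuck on the new one: it has memorized "go right from the start", and in the new maze that cell is a wall, even though the exit is only two steps below the start. A position-keyed table cannot generalize by construction, so this is the extreme case, but it is exactly the check the tutorials tend to skip.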



By the end of the article, and in the subsequent article, they're no longer doing it one track at a time.


At first I thought you were talking about some Rocket League AI stuff haha



