
Just to be clear: the "scalable" part here assumes a dataset of human demonstrations is available, and that the system can then improve in a closed loop beyond its starting point? In other words, am I right in understanding that the mapping from abstract actions to concrete steps in the state space is provided at a low enough cost?


Like in Workflow-Guided Exploration (https://arxiv.org/pdf/1802.08802.pdf), for example: if you said "Forward the email", the bot needs to sample from a set of actions, which in that case would be DOM elements and the JS API calls needed to click the right button to forward the email. That would be a harder and more interesting thing to test. Can it learn to forward an email in Firefox on a desktop browser and then do the same on an iPhone?
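To make the question concrete, here's a minimal sketch of what I mean by the abstract-to-concrete mapping. All names and selectors below are made up for illustration; the point is just that the same abstract action needs a separate grounding table per interface, and that table is the expensive part:

```python
from dataclasses import dataclass

@dataclass
class ConcreteAction:
    kind: str    # e.g. "click" on desktop, "tap" on mobile
    target: str  # e.g. a DOM selector or an accessibility identifier

# Hypothetical per-interface grounding tables: the same abstract
# action maps to different concrete steps on each interface.
GROUNDING = {
    "firefox_desktop": {
        "forward_email": [
            ConcreteAction("click", "button[aria-label='Forward']"),
        ],
    },
    "iphone_mail": {
        "forward_email": [
            ConcreteAction("tap", "ReplyMenu"),
            ConcreteAction("tap", "ForwardItem"),
        ],
    },
}

def ground(abstract_action: str, interface: str) -> list:
    """Map one abstract action to concrete steps for one interface."""
    return GROUNDING[interface][abstract_action]

print(ground("forward_email", "iphone_mail"))
```

If the tables have to be hand-written per interface, the "scalable" claim rests on how cheap they are to produce; if the agent can learn them, that's the generalization question below.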

It feels like, once the bot has learned to open one drawer, it has a (world?) model of what drawers in general look like, so it can open different drawers in any environment. But this might not generalize to web interfaces, and more specifically, to how those actions are performed on those interfaces.

Not to take away from this paper's scope and achievements.



