
Just to be clear: the "scalable" part here assumes a dataset of human demonstrations is available, and that the system can then improve in a closed loop beyond its starting point? In other words, am I right in understanding that the mapping from abstract actions to concrete steps in the state space is provided at a low enough cost?


Like in Workflow-Guided Exploration (https://arxiv.org/pdf/1802.08802.pdf), for example: if you said "Forward the email", the bot needs to sample from a set of actions, which in that case would be DOM elements and the JS API calls needed to click the right button to forward the email. That would be a harder and more interesting thing to test. Can it learn to forward an email in Firefox on a desktop browser and then do the same on an iPhone?
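To make the question concrete, here's a minimal sketch of what I mean by the abstract-to-concrete mapping. All names and selectors below are made up for illustration; the point is just that the same abstract action needs a separate grounding table per interface, and that table is the expensive part:

```python
from dataclasses import dataclass

@dataclass
class ConcreteAction:
    kind: str    # e.g. "click" on desktop, "tap" on mobile
    target: str  # e.g. a DOM selector or an accessibility identifier

# Hypothetical per-interface grounding tables: the same abstract
# action maps to different concrete steps on each interface.
GROUNDING = {
    "firefox_desktop": {
        "forward_email": [
            ConcreteAction("click", "button[aria-label='Forward']"),
        ],
    },
    "iphone_mail": {
        "forward_email": [
            ConcreteAction("tap", "ReplyMenu"),
            ConcreteAction("tap", "ForwardItem"),
        ],
    },
}

def ground(abstract_action: str, interface: str) -> list:
    """Map one abstract action to concrete steps for one interface."""
    return GROUNDING[interface][abstract_action]

print(ground("forward_email", "iphone_mail"))
```

If the tables have to be hand-written per interface, the "scalable" claim rests on how cheap they are to produce; if the agent can learn them, that's the generalization question below.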

It feels like, once the bot has learned to open one drawer, it has a (world?) model of what drawers in general look like, so it can open different drawers in any environment. But this might not generalize to web interfaces, and more specifically, to how those actions are performed on those interfaces.

Not to take away from this paper's scope and achievements.



