
How do you propose that punishment make the AIs "get better"? From their perspective they're already as good as they can be, based on their training.


Reinforcement learning trains a model to maximize some reward function. The suggestion is that real-world accountability could be translated into such a reward function.

Also, OP explicitly mentioned "online learning", which is a continuous training process after standard pre-training.

For what it's worth, I don't think this would work. Rewards from real-world consequences would arrive too sporadically to provide a useful training signal.
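
To make the sparsity concern concrete, here is a toy sketch rather than anything resembling a real LLM training setup. It assumes a two-action bandit with a softmax policy and a REINFORCE-style update applied online, one interaction at a time. The names `train_online` and `feedback_rate`, and the reward scheme itself (1 for the "good" action, 0 otherwise), are made up for illustration; `feedback_rate` stands in for how often a real-world consequence is ever reported back.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_online(feedback_rate, steps=2000, lr=0.1, seed=0):
    """Run online policy-gradient updates on a two-action bandit.

    Action 1 is the hypothetical "good" behavior (reward 1), action 0 is
    not (reward 0). A reward is only observed with probability
    `feedback_rate`; otherwise no update happens for that step.
    Returns the final probability of choosing the good action.
    """
    rng = random.Random(seed)
    prefs = [0.0, 0.0]  # start with no preference between the actions
    for _ in range(steps):
        probs = softmax(prefs)
        action = 0 if rng.random() < probs[0] else 1

        # Most real-world outcomes never make it back to the learner.
        if rng.random() >= feedback_rate:
            continue

        reward = 1.0 if action == 1 else 0.0
        # REINFORCE update: lr * reward * gradient of log pi(action).
        for k in range(2):
            indicator = 1.0 if k == action else 0.0
            prefs[k] += lr * reward * (indicator - probs[k])
    return softmax(prefs)[1]


if __name__ == "__main__":
    for rate in (1.0, 0.01, 0.0):
        print(f"feedback_rate={rate}: P(good action) = {train_online(rate):.3f}")
```

With feedback on every step the policy moves quickly toward the good action; with feedback on roughly 1% of steps it moves far less over the same number of interactions, and with no feedback it never moves at all. Real systems have vastly larger action spaces and noisier rewards, so this only illustrates the direction of the problem, not its size.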



