Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Great point about the meaningful datasets, this makes perfect sense. Esp. in regards to SFT and RLHF. Although I suppose it would be somewhat easier to do pretraining on really long context (books, I assume?)


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: