Hacker News

One bottleneck can be running the model only on the GPU in cases where the CPU is more efficient. But most bottlenecks are memory issues. GPUs do not necessarily have enough memory to hold everything, so you end up having to access "external memory", which slows down the forward pass a ton.
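
As a rough way to check the memory point above, here is a back-of-the-envelope sketch in Python. The names `weights_bytes` and `fits_on_gpu`, the 4-bytes-per-parameter (float32) default, and the example numbers are all assumptions for illustration, not something from this thread. It only counts weights plus an activation estimate you supply, and ignores framework overhead.

```python
def weights_bytes(num_params: int, bytes_per_param: int = 4) -> int:
    """Memory needed just to store the weights (float32 by default)."""
    return num_params * bytes_per_param


def fits_on_gpu(num_params: int, activation_bytes: int, gpu_bytes: int,
                bytes_per_param: int = 4) -> bool:
    """True if weights plus activations fit within the given GPU memory budget."""
    needed = weights_bytes(num_params, bytes_per_param) + activation_bytes
    return needed <= gpu_bytes


GIB = 1024 ** 3

# Hypothetical example: 3 billion float32 parameters,
# 2 GiB of activations, and a GPU with 8 GiB of memory.
print(fits_on_gpu(3_000_000_000, 2 * GIB, 8 * GIB))  # prints False
```

When this comes back False, something has to live off the GPU, and every access to it pays the transfer cost.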


Also, in some cases, like small RNNs/LSTMs, CPUs can be faster.
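
To put "small" in perspective, here is a quick sketch that counts the weights of a single-layer LSTM. It assumes the textbook formulation with four gates and one bias vector per gate (some libraries store two bias vectors, which adds a little), and the sizes are made up. The function name `lstm_param_count` is hypothetical.

```python
def lstm_param_count(input_size: int, hidden_size: int) -> int:
    """Weights in one LSTM layer: four gates, each with input weights,
    recurrent weights, and one bias vector (textbook formulation)."""
    per_gate = hidden_size * (input_size + hidden_size) + hidden_size
    return 4 * per_gate


# Hypothetical small model: 64 input features, 128 hidden units.
params = lstm_param_count(input_size=64, hidden_size=128)
kib = params * 4 / 1024  # float32 = 4 bytes per weight
print(params, kib)  # prints 98816 386.0
```

A model of a few hundred KiB offers little work per time step, which may be part of why the CPU can keep up. That part is a guess, not a measurement.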



