
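Llama 3 70B takes about half the VRAM of Mixtral 8x22B, since all of Mixtral's experts have to sit in memory. But it needs almost twice the FLOPs and memory bandwidth per token, because Mixtral only activates a subset of its experts for each token. Yes, Llama's context is smaller, although that should be fixable in the near future. Another thing is that Llama is English-focused while Mixtral is more multilingual.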
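Rough back-of-envelope for the VRAM and compute ratios, as a sketch only. The parameter counts are assumptions taken from the publicly stated approximate figures (about 70B for Llama 3 70B, about 141B total and 39B active per token for Mixtral 8x22B), fp16 weights are assumed, and "2 FLOPs per active parameter per token" is the usual rule of thumb that ignores attention and KV-cache costs. The function names are just for illustration.

```python
# Approximate public parameter counts (assumptions, not exact figures).
LLAMA3_70B_PARAMS = 70e9       # dense: every parameter is used for every token
MIXTRAL_TOTAL_PARAMS = 141e9   # all experts must be resident in memory
MIXTRAL_ACTIVE_PARAMS = 39e9   # parameters actually used per token


def weight_gib(params: float, bytes_per_param: int = 2) -> float:
    """Memory for the weights alone, in GiB (fp16 by default)."""
    return params * bytes_per_param / 2**30


def flops_per_token(active_params: float) -> float:
    """Rule-of-thumb forward-pass cost: ~2 FLOPs per active parameter."""
    return 2 * active_params


# Memory scales with total parameters, compute with active parameters.
vram_ratio = weight_gib(LLAMA3_70B_PARAMS) / weight_gib(MIXTRAL_TOTAL_PARAMS)
compute_ratio = flops_per_token(LLAMA3_70B_PARAMS) / flops_per_token(MIXTRAL_ACTIVE_PARAMS)

print(f"Llama/Mixtral weight memory: {vram_ratio:.2f}x")
print(f"Llama/Mixtral FLOPs per token: {compute_ratio:.2f}x")
```

The same active-parameter ratio applies to memory bandwidth during single-stream decoding, since each token has to read the weights it uses. Quantizing both models to the same width changes the absolute numbers but not the ratios.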

