
Did you set that up following a guide or anything you could share?


The easiest way I know is to just use LM Studio: download the model and press play :). Optional but recommended: increase the context length to 262144 if you have the DRAM available. It will definitely get slower as the conversation gets longer, but (at least for me) the speed is still tolerable.
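
If you want a rough sense of whether you have the DRAM for a 262144-token context, here is a back-of-the-envelope sketch of the usual KV cache estimate for standard attention layers. The layer count, KV head count, and head dimension below are made-up placeholders, not the real config of any particular model, and hybrid architectures that only use standard attention in some layers will need less, so treat the result as an order-of-magnitude guess.

```python
def kv_cache_bytes(context_len, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Approximate KV cache size in bytes for standard attention layers.

    The leading factor of 2 accounts for storing both keys and values.
    bytes_per_elem=2 assumes an fp16/bf16 cache; a quantized cache would be smaller.
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


# Hypothetical architecture numbers, chosen only for illustration.
# Read the real values from the config of the model you actually run.
estimate = kv_cache_bytes(
    context_len=262144,
    n_layers=12,      # placeholder: layers that keep a KV cache
    n_kv_heads=2,     # placeholder
    head_dim=256,     # placeholder
)
print(f"~{estimate / 2**30:.1f} GiB of KV cache on top of the model weights")
```

Plug in the numbers from your model's config, and remember this comes on top of the memory the weights themselves need.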


not OP, but I got it running on my 4090 (and RAM) by following this guide: https://unsloth.ai/docs/models/qwen3-coder-next

I see around 30 tokens/s.



