What do you mean by draft model? And how would one disable it? Cheers | Hacker News

marcalc 11 months ago | on: Running Qwen3 on your macbook, using MLX, to vibe ...

What do you mean by draft model? And how would one disable it? Cheers

_neil 11 months ago

A draft model is something you have to enable explicitly, so it is off unless you turned it on. It is a smaller model that speculatively generates the next few tokens, which the main model then checks; in theory this speeds up generation.

Here are the LM Studio docs on it: https://lmstudio.ai/docs/app/advanced/speculative-decoding
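
To make the idea concrete, here is a rough toy sketch of greedy speculative decoding. It is not LM Studio's or MLX's actual API: `speculative_generate`, `plain_generate`, `toy_target`, and `toy_draft` are hypothetical names, and the "models" are plain functions that map a token list to the next token. It also assumes a real implementation scores all `k` drafted positions in a single forward pass of the big model, which is what `target_rounds` is counting.

```python
from typing import Callable, List, Tuple

Token = int
Model = Callable[[List[Token]], Token]  # greedy "next token" function (toy stand-in)


def plain_generate(target: Model, prompt: List[Token], n: int) -> List[Token]:
    """Baseline: the big model produces one token per call."""
    tokens = list(prompt)
    for _ in range(n):
        tokens.append(target(tokens))
    return tokens


def speculative_generate(
    target: Model, draft: Model, prompt: List[Token], n: int, k: int = 4
) -> Tuple[List[Token], int]:
    """Greedy speculative decoding; returns (tokens, number of verification rounds)."""
    tokens = list(prompt)
    target_rounds = 0
    while len(tokens) - len(prompt) < n:
        # 1. The small draft model cheaply guesses the next k tokens.
        proposal: List[Token] = []
        for _ in range(k):
            proposal.append(draft(tokens + proposal))

        # 2. The big model checks the guesses. Assumption: a real system does
        #    this in one batched pass, so we count it as a single round.
        target_rounds += 1
        accepted: List[Token] = []
        for guess in proposal:
            expected = target(tokens + accepted)
            accepted.append(expected)  # always keep the big model's token
            if guess != expected:
                break  # first wrong guess: discard the rest of the draft
        else:
            # Every guess matched, so the same pass also yields one extra token.
            accepted.append(target(tokens + accepted))
        tokens.extend(accepted)
    return tokens[: len(prompt) + n], target_rounds


def toy_target(ctx: List[Token]) -> Token:
    """Toy 'big' model: counts upward modulo 10."""
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[Token]) -> Token:
    """Toy 'small' model: agrees with the big one except right after a 2."""
    return 0 if ctx[-1] == 2 else (ctx[-1] + 1) % 10


if __name__ == "__main__":
    out, rounds = speculative_generate(toy_target, toy_draft, [0], n=12, k=4)
    print(out, rounds)
```

Because the big model's token is always the one kept, the output matches what the big model would have produced on its own; the draft only changes how many rounds the big model needs. When the draft guesses badly, most of its work is thrown away, which is why the speedup is "in theory".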