
I've been testing DeepSeek R1 on coding tasks, and it's really impressive. It scores 96.3% on HumanEval, which is great, but what stands out most is its math performance (97.3% on MATH-500) and reasoning (71.5% on GPQA). If you're working on algorithm-heavy tasks, this model could give you a solid edge.

On the downside, token generation is slower than other models (37.2 tokens/sec) and output is capped at 8K tokens, so it might not be the best fit for long generations. But if you're focused on solving complex problems or optimizing code, DeepSeek R1 holds its own. It's also very cost-effective compared to other models on the market.
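
To put the speed number in perspective, here's a rough back-of-the-envelope sketch of how long a maxed-out response would take at that rate. It assumes "8K" means 8192 tokens and that decoding runs at a steady 37.2 tokens/sec with no time-to-first-token delay; the function name is just for illustration.

```python
def seconds_to_generate(num_tokens: int, tokens_per_sec: float) -> float:
    """Wall-clock seconds to stream num_tokens at a steady decode rate."""
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    return num_tokens / tokens_per_sec


TOKENS_PER_SEC = 37.2      # throughput quoted in the post
MAX_OUTPUT_TOKENS = 8192   # assumption: "8K" read as 8192 tokens

if __name__ == "__main__":
    secs = seconds_to_generate(MAX_OUTPUT_TOKENS, TOKENS_PER_SEC)
    print(f"{secs:.0f} s (~{secs / 60:.1f} min) for a full-length response")
```

So a full-length answer is on the order of a few minutes, which is fine for a hard problem but noticeable in an interactive loop.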


