Hacker News | chillfox's comments

I don’t really consider myself an “AI enthusiast”, but I do use it.

So, agents tend to do better the more feedback they can get. Type checking is pretty good for catching a bunch of dumb mistakes automatically.

The point is that more hints for the agent are better most of the time.


So just like for humans...

Well, that's super disappointing :(

I have got xAI blocked in OpenRouter as I do not want to support any business controlled by Musk.


I used to work for a sports betting company that identified individuals who were a little too good. The key is to remember that they are addicts and will bet on events regardless of whether they have insider knowledge, so you have to account for this and identify not only the individuals with insider knowledge, but also which events they have that knowledge about and which they don't.

Do people here and the WSJ comprehend though that they are giving Polymarket free publicity to farm addicts?

This is like saying that "Super Size Me" is free publicity for McDonald's.

No such thing as bad publicity? I would bet that McDonald's saw an uptick in sales after the release of SSM.

every casino's business model is about get-in-the-door appeal in order to fish out a handful of very lucrative whales. polymarket included. so no they're not really comparable.

Meanwhile, I can’t get kimi k2.6 to edit a heredoc in a shell script without it fucking it up.

Surely "Claude Opus 4.7" and "Claude Opus Latest" should be the same, right?

Yeah, so often people just mention "Opus" or "GPT" without a version, and those get mapped to the "-latest" suffix.

I thought I'd keep these as a rating for model families rather than specific models. But tbh it's probably better to remove them, too confusing.


"tolerate" would be the better word to describe it

Affordable RAM!

I recently bought one for my k3s cluster, and it was the cheapest 16 GB of RAM I could get by a decent margin.


Not the one you were asking, but…

I have been using Opus (in Zed) to find the “in between” bugs. Bugs that kinda live in the space between microservices or between backend and frontend.

It takes a bit of preparation to get good results, but it can usually find the source of bugs in 1-2 hours (200k-300k context) that would take me a week to track down.

I create a folder, and then open up git worktrees in subfolders for every repo I think might be involved. I also create an empty report.md file. Then I give it a prompt that starts with “I need you to debug an issue”, followed by instructions for how to run tests in each repo, followed by @mentioning any specific files or folders I think are relevant (with a quick description of what they are), then the bug description. After that I tell it to debug the issue, make no code changes and write its findings to the report.md file.

This works incredibly well.
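
For anyone who wants to script that setup, here is a rough Python sketch of the steps described above. It is only an illustration: the function names, the section headings in the prompt, and the example repo names and test commands are all made up, and only the opening line, the "make no code changes" instruction and the report.md file come from the description. It assumes `git` is on the PATH and that each repo path is an existing local clone.

```python
import subprocess
from pathlib import Path


def worktree_command(repo: Path, dest: Path, ref: str = "HEAD") -> list[str]:
    """Build the `git worktree add` call that checks out `ref` of `repo` into `dest`."""
    # --detach avoids creating a new branch just for the debugging session.
    return ["git", "-C", str(repo), "worktree", "add", "--detach", str(dest), ref]


def setup_workspace(workspace: Path, repos: dict[str, Path]) -> None:
    """Create the folder, one worktree subfolder per repo, and an empty report.md."""
    workspace = workspace.resolve()  # absolute, because `git -C` changes directory
    workspace.mkdir(parents=True, exist_ok=True)
    (workspace / "report.md").touch()
    for name, repo in repos.items():
        subprocess.run(worktree_command(repo, workspace / name), check=True)


def build_prompt(
    test_instructions: dict[str, str],
    mentions: dict[str, str],
    bug_description: str,
) -> str:
    """Assemble the prompt in the order described: tests, @mentions, bug, task."""
    lines = ["I need you to debug an issue.", "", "How to run tests in each repo:"]
    lines += [f"- {repo}: {how}" for repo, how in test_instructions.items()]
    lines += ["", "Relevant files and folders:"]
    lines += [f"- @{path}: {what}" for path, what in mentions.items()]
    lines += [
        "",
        "Bug description:",
        bug_description,
        "",
        "Debug the issue, make no code changes, and write your findings to report.md.",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical repos and commands, purely for illustration.
    print(
        build_prompt(
            {"backend": "make test", "frontend": "npm test"},
            {"backend/api/": "HTTP handlers for orders"},
            "Orders placed in the UI sometimes never show up in the backend.",
        )
    )
```

The @mention syntax is whatever the editor's agent panel accepts, so the exact form of those lines may need adjusting.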


I have been eyeing off the ollama and minimax plans, but I just don’t know how to compare them. Ollama especially, I have no idea how much usage I could get out of a plan.

Also, just learned about opencode go from other comments here, so gotta look into that.


I put $20 on Mistral and Deepinfra several years ago, and it’s still there.
