I am trying to figure this out too... what I am seeing is that local models like the Qwen 3.5 family that fit on hardware like yours handle ambiguity poorly, but they are still capable of emitting complete apps.
PocketOS's website says "Service Disruption: We're currently experiencing a major outage caused by an infrastructure incident at one of our service providers. We are actively working with their team on recovery. Next update by 10:00a pst."
This is wrong. It was not an infra incident at their service provider.
As Jer says in the article, their own tooling initiated the outage. And now they're threatening to sue? "We've contacted legal counsel. We are documenting everything."
It is absolutely incredible that Jer had this outage due to bad AI infra, wrote the writeup with AI, and posted on Twitter and here on his own account.
As somebody at PocketOS instructed their AI in the article: "NEVER **ing GUESS!" with regards to access keys that can touch your production services. And use 3-2-1 backups.
Good luck to the rental car agencies as they are scrambling to resume operations.
It'll be entertaining if someone points at this thread as "the operator has no idea what they are doing, followed zero best practices for software engineering, and followed anti-patterns for agentic AI."
That also was really opaque to me re: API access. I initially thought that at $200/month I could get whatever I needed. I eventually set up an OpenAI API key with a few bucks of credit to try what I wanted to.
I don't think I've ever been on such a rollercoaster with a company's reputation in the developer space. I started in January on the $20 plan, essentially my first agentic AI programming. I quickly started hitting limits developing several apps at the same time. I went up to the $200 plan after seeing the value.
After seeing my own issues with 4.6 and the mega-post on Github about declining metrics in a decent dataset of claude chats by Stella Laurenzo at AMD (https://github.com/anthropics/claude-code/issues/42796), I downgraded to the $100 plan. Hallucinations. Laziness. Lack of thinking. The responses on those mega-threads from Anthropic rubbed me the wrong way in a "you're holding it wrong" kinda way.
In the past week, I downgraded back to the $20 plan because the Codex $20 plan on 5.4 was working so well for me.
Then throw in other oddball events like the source code leak, and the super positive Anthropic events like their interactions with the current administration. It's a wild ride.
I can't understand removing Claude Code from $20. I'm interested to see whether this is confirmed or not.
I'm a career engineer and I went from being one of their most outspoken proponents (at least within my circle) and now.... I'm not.
Same, loved them, told my team about them, got them to switch off of Cursor; now I'm telling them to swap to Codex.
Anthropic really pissed me off with their harness crap. They're well within their rights but their communication over it was enough to get me to swap. I don't need extra hurdles when there's a perfectly valid alternative right there. They don't have the advantage they think they do.
I think we are inevitably heading to using the cheap Chinese models like Kimi, GLM, and Minimax for the bulk of engineering tasks. Within 3-6 months they will be at Opus 4.6 level.
This was literally my task today, before reading this update: try out Qwen 9B locally with pi or opencode on my MacBook, albeit a bit memory-constrained at 18GB.
MiniMax has its own issues. Server overloads, API errors, and failure to adhere to even the system prompt. It can happily work for hours and get nothing done.
I ran OpenCode + GLM-5.1 for three weeks during my vacation. It’s okay. It thinks a lot more to get to a similar result as Claude. So it’s slower. It’s congested during peak hours. It has quirks as the context gets close to full.
But if you're stuck with no better model, it's better than local models or no model at all.
I have to say, OpenCode's OpenUI has taught me what modern TUIs can be like. Claude's TUI feels more like it's been grown than designed. I'm playing around with TUI widgets, trying to recreate and improve that experience.
> I have to say, OpenCode’s OpenUI has taught me what modern TUIs can be like. Claude’s TUI feels more like it’s been grown than designed.
Claude's TUI is not a TUI. It's the most WTF thing ever: the TUI is actually a GUI. The TUI ships a headless browser that, in real time, renders the entire screen, scrolls to the bottom, and converts that to text mode. There are several serious issues, and I'll mention two that utterly piss me off...
1. Insane "jumping" around, where the text "scrolls back" up and then scrolls back down to your prompt: at this point, having seen the crazy hack that TUI is, if you told me the text jumping around is because they're simulating mouse clicks on the scrollbar, I wouldn't be surprised. If I'm not mistaken, we've seen people "fixing" this by patching other programs (tmux?).
2. What you see in the TUI is not the output of the model. That is, to me, the most insane of it all. They're literally changing characters between their headlessly rendered GUI and the TUI.
> Claude’s TUI feels more like it’s been grown than designed.
"grown" or "hacked" are way too nice words for the monstrosity that Claude's TUI is.
Codex is described as a: "Lightweight coding agent that runs in your terminal". It's 95%+ Rust code. I wonder if the "lightweight" is a stab at the monstrosity that Claude's TUI is.
From this morning: I had a single Go file with like 100 LOC, I asked it to add debug prints, it thought for 5+ minutes, generated ~1M output tokens, and did not actually update my file.
Anthropic will kick and scream, as those are often distilled from their latest models and are cutting into their margins. Though it's not like their own hands are clean either; it's just a different type of stealing, an approved one :-)
Getting them running is easy (check out LM Studio or ask one for some recommendations). The real question is whether you have the hardware to make them run fast enough to be useful.
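To make the "getting them running" part concrete, here's a minimal sketch, assuming LM Studio (or another local runner) is already serving a model. Most of these expose an OpenAI-compatible endpoint, and port 1234 is LM Studio's default; the model ID and prompt are just placeholders.

    # pip install openai
    # Any OpenAI-compatible client can talk to a local server by
    # overriding base_url; LM Studio listens on localhost:1234 by default.
    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:1234/v1",  # local server, not OpenAI's cloud
        api_key="not-needed",                 # local servers generally ignore the key
    )

    resp = client.chat.completions.create(
        model="local-model",  # placeholder; use whatever ID your server reports
        messages=[{"role": "user", "content": "Refactor this function to be testable: ..."}],
    )
    print(resp.choices[0].message.content)

Whether the tokens come back fast enough to be usable for agentic work is the hardware question.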
This is possibly a hot take but recently I've been having about as much luck with Composer 2 in Cursor as I have with Opus 4.6 in Claude Code.
Opus is obviously the better model, but Cursor's "harness" is doing so much heavy lifting in terms of just magically supplying the broader context the model needs to understand the ramifications of its edits.
One thing I enjoy about the Cursor and Codex Mac apps is the embedded preview window. I know it's not as hardcore as the terminal/tmux, but it's hella convenient. But Cursor bugs me with the opacity around which model I'm using. It seems to be deliberately routing requests based on their perceived complexity. What draws you to Codex vs Cursor?
I think removing Claude Code from the $20 tier is a terrible idea, I never would've gone from nothing right into the $100/200 tier. The $20 plan let me get my feet wet and see how good it could be, and in less than a week I was on the $100 plan.
I think they need to at least have a 1-month introductory rate for the Max plan at $20, or devs who decide to try out agentic coding just won't go to Anthropic.
That leads to downstream impacts, like when a company is deciding which AI coding tools to provide and the feedback management hears is that everyone is already used to (e.g.) Codex; then Anthropic starts losing the enterprise side of things.
I suspect a lot of people are like me. They got into this at the $20/month level individually to check things out. I'm not stressing the limits, so I haven't moved up, but the moment I bump into a limit, I'll pull the trigger by default. Until then, I'm the sleeping dog, and you should let me lie.
Well, Anthropic decided to kick me. Now, I'm investing the time to figure out how to use the "open" and "Chinese" models assuming that Anthropic is about to screw me. Once I switch, Anthropic is going to have to demonstrate significant improvements over what I'm now using to get me to even consider them again.
I know of 4 companies that are already starting to stomp down on the AI whales burning "$1000 per day". There was the cost of the entire company's AI usage, and then there was the cost of about a half-dozen people who each individually dwarfed it.
So, we've established a hard upper ceiling for what AI can extract per user at roughly $100K and more realistically at $10K per year. Basically, if using the AI costs the same as a human salary, it's going to get pushback. I mean, the whole point was to get rid of those pesky human salaries, after all.
So, there are about 2 million-ish software jobs in the US? It's more than 1 million but a far cry from 10 million. So that pencils out at $20 billion per year total in the US? That means that if an AI company literally won every US software programmer, it would be worth at most $200 billion as a buyout (10x revenue).
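Spelling out that back-of-the-envelope math with the guesses above (none of these are real figures):

    # Back-of-the-envelope, using the numbers guessed above
    us_software_jobs = 2_000_000        # "about 2 million-ish"
    realistic_spend_per_year = 10_000   # ~$10K/user/year before pushback

    total_market = us_software_jobs * realistic_spend_per_year  # $20 billion/year
    buyout_ceiling = 10 * total_market                          # 10x revenue, ~$200 billion

    print(f"US market: ${total_market / 1e9:.0f}B per year")
    print(f"10x revenue buyout ceiling: ${buyout_ceiling / 1e9:.0f}B")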
Now how much investment have the AI companies taken? Yeah, roughly that. And investors are going to want quite a bit more than that back.
Even if they had zero delivery costs, the AI companies are cooked long term. The moment your number bumps into "All the X in the US/World", you've got a problem.
Short term? Greater fool theory applies. And there appear to be a lot of them.
And all this is before we start getting into people exploring the open models. Most people were like me; we started on something like Claude and just stayed put because it was straightforward. Now that we've been kicked, we'll start looking at the other options.
LLM monsters are deeply unprofitable, going by industry hearsay (which is the only thing we have, given the ultra-secrecy of the LLM corporations). The only two LLM companies that disclosed their finances without lying were two Chinese corporations, and they, unsurprisingly, were deeply in the red.
Remember the old saying about boiling a frog? LLM corporations need to get most of their users paying hundreds per month, asap. This is Anthropic turning up the temperature under the pot just a tiny little bit. Not the first time and not the last.
Tax breaks, fee reductions/waivers, direct monetary incentives, and shielding from "unfavorable" regulation (whether local, state, or national) are all subsidies. Hell, depending on the particulars, government contracts can be subsidies... there's more than one government engineering project out there that could reasonably be referred to as a "jobs program for PhDs", and still more that are corporate handouts.
Every business believed by its "home" government to be sufficiently important gets subsidies when it asks for them... regardless of what nation houses that government. If your claim is that the major players in the "AI" industry aren't getting subsidies from local, state, and national governments in the US, then my claim is that you are lying.
Matches my experience very well. All the goodwill earned from taking a stand against the DoD seemingly forgotten in a month. Coincidentally, I canceled my pro subscription and got set up with OpenCode and OpenRouter last night.
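For anyone curious what the OpenRouter side of that setup looks like: it exposes an OpenAI-compatible endpoint, so any agent or client that lets you override the base URL can use it. A rough sketch (the model choice is just an example; check OpenRouter's catalog for current slugs):

    # pip install openai
    # OpenRouter speaks the OpenAI chat-completions protocol, so the same
    # client works with a different base_url and an OpenRouter API key.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://openrouter.ai/api/v1",
        api_key="sk-or-...",  # your OpenRouter key
    )

    resp = client.chat.completions.create(
        model="openrouter/auto",  # example; or pick a specific model slug
        messages=[{"role": "user", "content": "Explain what this regex matches: ..."}],
    )
    print(resp.choices[0].message.content)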
I just installed it and continued about my business. I don't have a carefully tuned claude code setup. It has some skills, some CLAUDE.md files, and not much else.
> I can't understand removing Claude Code from $20
Not according to their webpage: "Claude Code is included in your Pro plan. Perfect for short coding sprints in small codebases with access to both Sonnet 4.6 and Opus 4.7." [1]
There are clear contradictions across their marketing site. As others have pointed out, it's being removed from some help articles and the pricing chart now shows it revoked. Confusing signals, but they seem to be changing all pages in this direction and haven't updated that one yet.
I've noticed that there are some people who feel that their Claude instance could be working on churning out multiple apps, and therefore, if it isn't, they are in some sense falling behind. It's the illusion of productivity raised to the level of a minor addiction.
I had a similar ride, but disagree with your conclusion. Opus 4.7 is so incredibly powerful from my experience, that nothing else really matters and I think at Anthropic they know it. People will pay a lot for access to this model.
Yea, I've seen a lot of whining online because it's more expensive, but from the interactions I've had I'd say it's well worth it. To me it feels like another step change, similar to when 4.5 was introduced. Definitely a different beast.
EDIT: it is also surprising to me that everyone seems to believe the people at Anthropic are simply incompetent and recklessly risking their good reputation, while very few consider the possible good reasons they might have for taking such drastic measures. And I don't think it's because of financial pressures in their case
I can’t say I’ve used it extensively enough to draw a conclusion, but it did seem similar to GPT 5.4 in Codex.
When I threw it at a difficult issue in an iOS app, it, like GPT, came up with wrongly guessed explanations. It only found the issue after I had it instrument the app and add extensive logging. Usually GPT 5.4 is the same.
Only that with GPT 5.4 it’s at least included in my subscription, while sending 3-4 messages to Opus 4.7 for this blew through my $20 plan limits and consumed $10 of extra usage on top. At that point I can’t help but bring up how much more expensive it is.
> Only that with GPT 5.4 it’s at least included in my subscription, while sending 3-4 messages to Opus 4.7 for this blew through my $20 plan limits and consumed $10 of extra usage on top. At that point I can’t help but bring up how much more expensive it is.
Rest assured OpenAI won’t want to leave that kind of money on the table…
Having too much usage and not enough GPUs is a form of financial pressure, no? Since you want to replace your less profitable customers with more profitable ones.
I've had completely the opposite experience. I've asked for it to research things and it's just told me to "paste xyz into google". Just now I revisited a chat that's 5 days old and asked it to check again (because what I was looking for might have changed), and it said "no".
It's funny how experiences can be so different. I wonder if this comes down to context. My interactions so far were fairly high-level and in some cases it having a strong opinion was actually super beneficial to the outcome. To me it seemed opinionated, but in a very good way. I can see how this could backfire though and have heard similar reports.
Incredible, powerful, but I couldn't believe how fast I hit the limits compared to how it was with Opus 4.6. They removed Opus 4.6 completely from CC. I would prefer it with the previous limits.
That's not how you keep your customers. None of these agents have a moat; I moved away from Cursor when they started doing what Anthropic is doing now, and I never went back, even though I had been a paying customer since the start.
Opus 4.7 may be incredible, but for how long? And they may have Mythos, but I feel like they will only put it out if pressed too much by their competitors. And again, for how long will they keep the advantage?
At the speed everything is advancing, I don't think it's such an advantage. They all catch up to each other pretty fast. That's why I prefer to pay Cursor and have access to all of them instead of being locked to a single one (even if that means losing some discounted credits). If they opened up Mythos today at a good price, that would be something, but that's not the case and it won't happen.
They need the devs on board for that to matter; I can get whatever I want done with lesser models already. It is quite literally about who is not gonna give me the shittiest experience, and at Anthropic it sure seems like they are determined to annoy everyone since they started gaining popularity.
FWIW their "Fin" agent on their web page tells me:
"Claude Code is not included in the standard Pro plan. It's only available with premium seats on Team and Enterprise plans.
However, Pro plans do include improved coding capabilities through our GitHub integration, which lets you connect code repositories directly to Claude. You also get access to code execution and file creation features in the Claude apps.
For the full Claude Code terminal experience, you'd need to upgrade to a Team or Enterprise plan with premium seats"
Yes, confirmed directly from Anthropic's website. Claude Code is included in the Pro plan, described as perfect for short coding sprints in small codebases with access to both Sonnet 4.6 and Opus 4.7.
Claude, 3:11 PM
You said: Is claude code included with the pro subscription
Claude responded: Yes, Claude Code is included with the Pro plan ($20/month). You can use it in the terminal, desktop app, VS Code, JetBrains, and on the web.
The main caveat is that Claude Code draws from the same usage limits as your regular Claude chat — so heavy coding sessions can eat into your quota faster. If you find yourself hitting limits often, the Max plans ($100–$200/month) offer significantly more headroom.
The $20/month, which is now $17, should give you a hint? Models have a knowledge cut-off and will not reflect up to date information unless you trigger a web search.
Do you understand how LLMs work, and that they are always behind in their knowledge? Unless Claude does a network call to check its own website, it will give you outdated information. It's a prediction model, it's not magic.
That, and they have tool use issues.... https://www.reddit.com/r/LocalLLM/comments/1smzw6s/qwen35_a3...
I would check out the model mentioned in that thread: the GGUF unsloth/qwen3.5-35b-a3b at Q4_K_M.
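If you want to try it without LM Studio, something like the following should work as a rough sketch via llama-cpp-python. The repo name and filename pattern are taken from the thread above and are not verified, and the context size is just a guess for a memory-constrained machine.

    # pip install llama-cpp-python huggingface_hub
    # Rough sketch: download a Q4_K_M GGUF from Hugging Face and chat with it.
    from llama_cpp import Llama

    llm = Llama.from_pretrained(
        repo_id="unsloth/qwen3.5-35b-a3b",  # repo named in the linked thread (unverified)
        filename="*Q4_K_M.gguf",            # the quant mentioned above
        n_ctx=8192,                         # small context to fit limited RAM
    )

    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Write a unit test for a Fibonacci function."}]
    )
    print(out["choices"][0]["message"]["content"])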