Hacker News | malisper's comments

> Isn't concurrency also limited by your machines disk speed for writes, what difference does it make if you write sequentially vs concurrently? Why does concurrency even matter for databases?

For a simplified example, having three processes reading blocks X, Y, Z in parallel is much faster than having a single process read block X, wait for the read to finish, read block Y, wait for the read to finish, read block Z and wait for the read to finish.
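
As a rough illustration of that point, here is a minimal Python sketch. It assumes each block read is a blocking call with a fixed latency; `read_block` and `READ_LATENCY_S` are hypothetical stand-ins using `time.sleep`, not real disk I/O or anything a database actually does internally.

```python
import time
from concurrent.futures import ThreadPoolExecutor

READ_LATENCY_S = 0.05  # assumed latency of a single block read


def read_block(name):
    # Hypothetical stand-in for a blocking disk read: the caller just waits.
    time.sleep(READ_LATENCY_S)
    return f"data-{name}"


def read_sequential(blocks):
    # One worker: each read starts only after the previous one finishes.
    return [read_block(b) for b in blocks]


def read_concurrent(blocks):
    # One worker per block: the waits overlap instead of adding up.
    with ThreadPoolExecutor(max_workers=len(blocks)) as pool:
        return list(pool.map(read_block, blocks))


def timed(fn, blocks):
    start = time.perf_counter()
    result = fn(blocks)
    return result, time.perf_counter() - start


blocks = ["X", "Y", "Z"]
seq_result, seq_time = timed(read_sequential, blocks)
con_result, con_time = timed(read_concurrent, blocks)
print(f"sequential: {seq_time:.3f}s, concurrent: {con_time:.3f}s")
```

Under these assumptions the sequential version waits roughly three read latencies while the concurrent one waits roughly one, since the device (here, the sleep) can service the requests at the same time.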


Same, but for multi-threaded Postgres[0]. 96% of the pg regression tests pass after 1 month and 823K LOC. 8 Codex accounts at $200/mo is what I could use up, with no Mythos.

I've also seen the benefits of Rust for this. And I'm making the bet that my pg experience will help me make good design choices around many of the things people have been having trouble with in pg for a long time[1]. Excited to see AI make it more feasible to improve complex pieces of software than has historically been practical.

[0] https://github.com/malisper/pgrust [1] https://malisper.me/the-four-horsemen-behind-thousands-of-po...


$1600/mo; there is now a token-rich class.


Very cool! If you have extra tokens lying around, ask the agent to try to break things and open GitHub issues. This is what I do for tsz, and beyond the conformance tests I can see it finding very good bugs.


> PostgreSQL, rewritten from scratch in Rust.

You use the test suite and LLMs are trained on Postgres.

Are you at Freshpaint? A company that "helps healthcare marketing teams grow in a world where privacy is the baseline, but performance is the goal."

Nice promises! Surely the marketing teams will respect privacy!


96% tests passing sounds impressive, but I remember that C compiler that had similar (or better) stats yet was still hilariously broken because the test suite didn't cover many "obvious" things that a human wouldn't get wrong even without the tests.


There are a few big differences between the Anthropic C compiler and pgrust. The C compiler was built mostly autonomously and as a clean-room implementation. OTOH, I'm steering Codex and using the Postgres source code as a reference. That's leading to the implementation being based more on how pg does things than anything else. If you want to try it out, I compiled it to wasm and you can run it here[0]. You'll see it's much more faithful to Postgres than a C compiler that doesn't handle type checks.

[0] https://pgrust.com/


wow!

curious about your workflow for running all these accounts. different harnesses in parallel? manually switching in codex? 5.5pro only?

what works for you?


I wrote up a bit about my workflow here[0][1]. I'm using conductor.build to manage multiple codex sessions at once. When I hit the rate limit, I'm using codex-auth[2] to switch codex accounts.

[0] https://malisper.me/pgrust-rebuilding-postgres-in-rust-with-... [1] https://malisper.me/pgrust-update-at-67-postgres-compatibili... [2] https://github.com/loongphy/codex-auth


> what looks like a massive undertaking for vibe coding

fwiw, I suspect it's less of an undertaking than you may think. I've been playing with AI to rewrite Postgres in Rust[0] over the past couple of weeks and I found the AI to be exceptional at doing rewrites. Having an existing codebase you can reference prevents a lot of the problems you have with vibecoding. You have an existing architecture that works well and a test suite that you can test against.

Over the course of a month I've gone from nothing to passing over 95% of the Postgres test suite. Given Jarred built Bun, I bet he'll be able to go much faster.

[0] https://github.com/malisper/pgrust


> I suspect it's less of an undertaking than you may think... having an existing codebase you can reference prevents a lot of the problems you have with vibecoding.

That's because it's not vibe coding - stingraycharles doesn't seem to understand what vibe coding is. Vibe coding was defined here https://x.com/karpathy/status/1886192184808149383

> There's a new kind of coding I call “vibe coding”, where you fully give in to the vibes, embrace exponentials, and forget that the code even exists.

This is very far from Anthropic's migration plans.


Yeah, it's a distinction worth making, and the language for making it kind of sucks. Vibe coding means "AI does the whole thing", or "I use tab autocomplete" depending on who you ask. It's not a very useful term anymore, we need better ones.

My benchmark is basically, "are you letting the AI drive."

In this case, an AI appears to have written the migration guide...


It was and is a perfectly good term, but people started using it without regard for its definition. I don't know why people wouldn't misuse a "better" term the same way.


In this case I think the current zeitgeist (at least among zoomers and younger millennials) really loves the word "vibe". Once they hear of the term "vibe coding", they just want to be able to say it, even if what they're doing isn't really vibe coding.

And then that leaks outside their social and age groups, because other people hear the incorrect usage, get confused, and incorporate that confusion into their own use of the term.


Waiting until they decide to call non-assisted programming ‘unc coding’


As someone who might be described as an "unc", I had to look up what "unc" meant.


i mean AI docs are usually the result of collabs between users and AI using /plan

with superpowers, i see a lot of specs -> impl plan -> execute plan


Yeah. It "might be" that a human actually looked at it. There's just no way to know anymore. So it rightly doesn't inspire confidence.


"Vibe coding" = "let Dario take the wheel" as ThePrimeagen puts it.


You are right, but recently vibe coding has become a demeaning term for AI-assisted code among anti-AI people. It’s interesting seeing how quickly words evolve on the internet as they spread to different demographics.


That is one person's definition of vibe coding, not "the definition" of vibe coding. Words have multiple meanings.


Just going off vibes and not even looking at the code was the original definition. But "different people say the same thing but mean different things" is kind of the problem I was getting at.


It’s the definition given by the person who created the term.


Language and culture don't work like that.

Inventing a term doesn't give you exclusive rights to provide the definition.


Yes but it's been a little over 14 months.


I've been working on a project to build a new Postgres-based database in Rust[0]. I'm four weeks in and have 93% of the Postgres test suite passing. I've found agents work really well for this, as I have an existing codebase with good architecture that I can point my agents at. It's also easy to debug, as I can diff what my agents are doing against what Postgres is doing.
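
To make the "diff against Postgres" idea concrete, here is a minimal sketch of differential testing in Python. It is not the actual pgrust harness: `diff_engines` is a hypothetical helper, and the two dictionaries are made-up stand-ins for sending the same SQL to a reference engine and to the engine under test.

```python
def diff_engines(queries, reference, candidate):
    """Return (query, expected, actual) for every query whose results differ."""
    mismatches = []
    for query in queries:
        expected = reference(query)
        actual = candidate(query)
        if expected != actual:
            mismatches.append((query, expected, actual))
    return mismatches


# Hypothetical canned results; a real setup would execute each query
# against the reference database and against the rewrite.
reference_results = {
    "SELECT 1 + 1": [(2,)],
    "SELECT upper('a')": [("A",)],
}
candidate_results = {
    "SELECT 1 + 1": [(2,)],
    "SELECT upper('a')": [("a",)],  # deliberately wrong to show a mismatch
}

mismatches = diff_engines(
    list(reference_results),
    reference_results.get,
    candidate_results.get,
)
for query, expected, actual in mismatches:
    print(f"{query!r}: expected {expected}, got {actual}")
```

The appeal of this setup is that every mismatch comes with a known-good answer, so the agent has a concrete target to debug toward.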

I've had to get multiple codex accounts, but there was a brief period of time where I tried API usage to see how expensive it would be. In about an hour I spent $650 of credits. I had codex estimate how much I would be spending if I was doing pure API usage and it estimated around $10k/week.
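
As a back-of-the-envelope check on those two numbers, here is a small Python sketch. The only inputs are the $650/hour and $10k/week figures above; the monthly figure assumes 52 weeks spread over 12 months, which is my assumption and not part of codex's estimate.

```python
api_spend_per_hour = 650         # USD, observed during the one-hour API trial
estimated_api_per_week = 10_000  # USD, codex's estimate for pure API usage

# Hours per week at the observed burn rate that would add up to the estimate.
implied_hours_per_week = estimated_api_per_week / api_spend_per_hour

# Assumption: 52 weeks spread evenly over 12 months.
estimated_api_per_month = estimated_api_per_week * 52 / 12

print(f"implied usage: {implied_hours_per_week:.1f} hours/week at the trial rate")
print(f"estimated API cost: ${estimated_api_per_month:,.0f}/month")
```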

For context, Postgres is 1M lines of C code. It's looking like pgrust will come out as fewer lines of code than Postgres, and at peak I was adding over 100k lines of code in a day. I would estimate it would take a team of 5 software engineers at least 3 years to get to where I got in a month with a couple of Codex subscriptions.

[0] https://github.com/malisper/pgrust


Since there's a lot of questions about what this means, let me explain.

Anthropic has two different products that are relevant here: the Claude API and Claude Code. The Claude API has usage based pricing. The more you use, the more you pay. With Claude Code, you can get a monthly subscription which gives you a fixed amount of usage. Comparing equivalent token generation between the Claude API and Claude Code, Claude Code with a subscription is much cheaper.

When it comes to third-party products such as OpenClaw and OpenCode, Anthropic has made it clear those products should be using the Claude API and not the internal Claude Code APIs. OpenClaw and OpenCode have both been using the internal Claude Code APIs because, for a user with a Claude Code subscription, the internal Claude Code API provides tokens at a much cheaper rate than the Claude API. Presumably Anthropic makes Claude Code cheaper than the Claude API because they are willing to give users a discount to use Claude Code rather than a competing product such as OpenCode.

It looks like until recently OpenCode tried to get around Anthropic's requirements by offering "plugins" in OpenCode that would allow users to use their Claude Code subscription in OpenCode. This PR mentions as much at[0][1]:

> There are plugins that allow you to use your Claude Pro/Max models with OpenCode. Anthropic explicitly prohibits this.

> Previous versions of OpenCode came bundled with these plugins but that is no longer the case as of 1.3.0

This PR seems to be in response to Anthropic threatening OpenCode with legal action if they keep using the internal Claude Code APIs.

  [0] https://github.com/anomalyco/opencode/pull/18186/changes#diff-b5d5affc6941bf7bb19805cc8f556cd1b9ae73ffd99e520120700536b166f8c0L310
  [1] https://github.com/anomalyco/opencode/pull/18186/changes#diff-b5d5affc6941bf7bb19805cc8f556cd1b9ae73ffd99e520120700536b166f8c0R321


Yep, well said and great, sharp explanation.

I think we can attribute a bunch of consternation here to drift between assumed and actual licensing terms.

The actual licensing terms for Claude Code expressly prohibit use of the product outside of the Claude Code harness. If you want Opus outside of CC, the API is available for your use anytime.

Some percentage of the community seems to assume their Claude Code subscription licenses allow free usage of CC across any product surface, including competing products like OpenCode. While this is a great way to save on API costs, the assumption is incorrect. In fact, it is *so* incorrect that Anthropic has encoded their licensing terms into their Terms of Service, and as a result can take legal action against any violating parties.

We can have separate discussions about Anthropic’s use of the Common Crawl in pre-training, or whether foundation labs adhere to robots.txt conventions. But those don’t directly impact Anthropic’s right to bring litigation.

——

Outside of that I think angry users have their own stated preferences v revealed preferences here. They claim they want Opus on their terms, and Anthropic’s actions infringe on their user rights.

Angry folks: Opus is right there! You just need an API key! The reality is you want Opus in your devtools of choice at discounted rates. You could at least be honest about your consternation


> including competing products like OpenCode

I think that’s a bit more nuanced. The actual „product” is not the harness, which is free anyway, but the Claude subscription. In any scenario, that’s what the customer continues to pay for. I understand why Anthropic is doing that, but I feel no need to defend it. Just like I understand why Apple limits your app choices to AppStore, but I’m not going to go out of my way to defend their decision.


It's way more nuanced, because the subscription is older than Claude Code, and they only started to have a problem with third parties using it after Claude Code. (And not with the release, just some time after the release.)


That makes perfect sense because that's when it became orders of magnitude more expensive to offer the service.


To me, that argument would only make sense if the subscription wasn't metered... But it is.


Sheriff spits to the ground. One harness. One horse. How we do it' fer now on.


>We can have separate discussions about Anthropic’s use of the Common Crawl in pre-training, or whether foundation labs adhere to robots.txt conventions. But those don’t directly impact Anthropic’s right to bring litigation.

Some of us don't care for Anthropic's "right to bring litigation" any more than we care about some scumbag patent troll company doing things "within their legal rights".

We care for the morality of its conduct, the openness of its products, and the environment it creates.


I think this is disingenuous. People want to be able to use a tool that they pay for to do useful work on their own terms because they paid for it, and they don’t see the differential pricing model offered by Anthropic as legitimate.


Why would it not be legitimate?


[flagged]


It's very amusing to hear this particular argument being made to defend AI companies.

When the people want that, it's inconsequential.

When the corporations did that, it was their God-given right.


I don’t agree, what people want is very consequential, because those people are paying customers of a service, if they aren’t happy with it they have every right to complain.

People should be vocal about what they do and do not think is reasonable behavior by corporations and then act based on those opinions with their wallets. Lord knows we have precious few other ways of influencing corporate behavior.


No, it's important. People are allowed to discourage each other from buying a product that they consider subpar.


>What the people want is inconsequential here. The people also want to abolish copyright and freely share and download media too.

I already approved of the complaints against Anthropic here, you don't have to sell it this hard to me.

(Not to mention the blatant hypocrisy that their whole business is based on open copyright abuse - all that copyrighted training material, illegally obtained books and movies, etc).


They can't take any legal action outside of the US. In most other jurisdictions such bullshit in the ToS would be void anyway.


Claude Code might be subsidized, but there are other risks.

For one, if any agent can use Claude models, it exposes them to distillation risk: data gathered from millions of such agent sessions can easily be used to train a competing model, eroding their model's superiority.

Second, to improve their own coding model, they need predictable input.

If input to their model is all over the place (using different harnesses adds additional entropy to the data), then it's hard to improve the model along one axis.

Caching is a money saver in computing. Their own client might be a lot better at caching than any other agent, so they do not want to lose money and still end up with disgruntled customers complaining that Claude isn't working as well.

And also, if a user can simply switch models in an agent, then what moat does Anthropic have? Claude Code will not include other companies' models, which lets them make Claude Code more "complex" over time, so the workflows become ingrained in users' psyche to the point that using anything else becomes very difficult and the user quickly returns to Claude Code.


They are not entitled to a moat, and their customers do not owe them one. Several companies have narrow or no moats. Dell and HP are two examples when it comes to their PC business.

This idea that companies should be allowed to lock down their products just so they can have moats, is how we ended up with printer ink being more expensive than crude oil or champagne.


Companies are absolutely allowed to lock down their own products. Netflix is a great example, you don't bring your own client for Netflix.

The whining/entitlement in this thread is ridiculous. The API is always there for you to use as you desire.

If you want to use the loss leader on the other hand, you agree to abide by certain terms. But if you don't want to do that, just use the API. It's not that hard.


> champagne

Makes sense, sort of...

> crude oil

I really hope that's cheaper than ink or we're gonna have a problem...


> Caching is a money saver in computing. Their own client might be a lot better at caching than any other agent, so they do not want to lose money and still end up with disgruntled customers complaining that Claude isn't working as well.

I’d bet a reasonable amount that this could be the case. They are very well incentivized to maximize cache use when it’s basically not pay per token.


> Anthropic has two different products that are relevant here: the Claude API and Claude Code.

No, the two relevant products are Claude API vs Claude subscription. There's no "Claude Code subscription". There's just a subscription for all Claude services at once.


The $20/mo Pro subscription only allows regular chat and Claude Code and does not allow you to export your API key without reverse engineering CC. The higher tiers allows console and direct API usage.

Basically, the concept of Claude-Code having its own API tier holds.


Really? I thought API key usage was always billed per token, not via monthly allowances?


There're keys for users to access their public API with whatever they want, and there're tokens for Claude Code to access their private API.


Which one is OpenCode using?


It allows both; Anthropic doesn't want them to use the second.


The private, which they shouldn't be.


Their ToS says differently. You can't argue with what's explicitly in their legal agreement.


> You can't argue with what's explicitly in their legal agreement.

Sure you can, that's what courts are for


A case like this would immediately get thrown out because it makes no sense to argue it. Can be considered frivolous.


Sure you can. TOS docs are full of non legally enforceable wishful thinking bullshit, especially when they're written by an American company providing services to me in Europe. Most of the time they just expect (correctly) that they'll never get challenged in court over it.


Even if it isn't enforceable from a usage perspective, it is from a provider perspective, meaning they can also simply deny their service to anyone they discover breaking said terms. And there's nothing anyone can do about it.


>You can't argue with what's explicitly in their legal agreement

Sure I can. I can even contest it in court (if I had the money).

Some "legal agreements", TOS, etc. are even unenforceable and blatant abuses of the law.

And what's more, I can even consider ALL such legal agreements bogus and demand that the law change to not allow them.


> Some "legal agreements", TOS, etc. are even unenforceable and blatant abuses of the law.

Good luck trying to classify this one as such. There's no valid argument given the fact that users are attempting to gain access to an offer in a way that isn't applicable to them. It's tantamount to deception and stealing, going somewhere you were not invited as though you were and taking something that wasn't given to you.


>> There are plugins that allow you to use your Claude Pro/Max models with OpenCode. Anthropic explicitly prohibits this.

In other words, you pay $$ for a subscription, but Anthropic has the gall to tell you what client to use with it.


The part I never really understood: I thought the subscriptions were to try to boost Opus usage, not Claude Code usage? I'm not sure why they care whether you use the API or Claude, as they limit the number of tokens you can use anyway, and once the request hits the model, I would have thought it takes the same amount of effort to process it regardless of where it comes from?


It’s definitely to encourage Claude code usage. Owning the interface through which your core product is delivered is a hedge against the commoditisation that everyone talks about. Eg, it’s much harder to switch from Claude code to cursor or vice versa than it is to switch between models in cursor (I sometimes don’t even notice model defaulting to composer inside cursor)


This is the clearest reason for us to accustom ourselves to using open-weight models on open-source harnesses. Whatever advantages the frontier closed models offer will turn into ash in the mouth when the enshittification cycle begins. And don't be mistaken, it will begin. There is no precedent which can claim otherwise.

I am sure the models themselves are being RLHF tuned to work very well with the proprietary agent harnesses. This is all turning into a huge trap right in front of our eyes and the target is not just programmers but also companies whose core product involves software production.


Fully agree with you


I can believe it. Maybe they feel they have enough of a lead in usage among programmers with Opus that they want to lock down the tooling side as well.

edit: clarify


Good recent video on this specific subject, it was enlightening to me: https://youtu.be/3FbqaD1MCUA


So they've been advancing in making the AI use the computer through the same API as a person (screen/cursor/keystrokes), and the dream is a Future where AI can use a PC and handle tasks and tools like a human user.

But to use their product, you have to go through the non-human-friendly API route, or else it's against the rules and Anthropic will sic their legal team onto you...

Something about this reasoning seems brittle, especially in a world of agentic tools.

Ok


this is actually the best explanation of this situation, why is it downvoted?


people like directing anger at a "bad guy"


Three consecutive months of decline starts to look more like a trend. Unless you think there's a transient issue causing the decline, something fundamental has changed


Again: compare early 2024. And that’s not the only thing; the second chart shows a possible flattening, but by no means certain yet, especially not when taken with the clear March–April jump; and the first chart shows no dwindling in 1–4, and clear recovery in 250+. The lie is easily put to the claim the article makes:

> Data from the Census Bureau and Ramp shows that AI adoption rates are starting to flatten out across all firm sizes, see charts below.

It’s flat-out nonsense, and anyone with any experience in this kind of statistics can see it.


From the chart, the percentage of companies using AI has been going down over the past couple of months

That's a massive deal because the AI companies today are valued on the assumption that they'll 10x their revenue over the next couple of years. If their revenue growth starts to slow down, their valuations will change to reflect that


This bubble phase will play out just as the previous have in tech: consolidation, most of the value creation will go to a small group of companies. Most will die, some will thrive.

Companies like Anthropic will not survive as an independent. They won't come close to having enough revenue & profit to sustain their operating costs (they're Lyft to Google or OpenAI's Uber, Anthropic will never reach the scale needed to roll over to significant profit generation). Its fair value is 1/10th or less what it's being valued at currently (yes because I say so). Anthropic's valuation will implode to reconcile that, as the market for AI does. Some larger company will scoop them up during the pain phase, once they get desperate enough to sell. When the implosion of the speculative hype is done, the real value creation will begin thereafter. Over the following two or three decades a radical amount of value will be generated by AI collectively, far beyond anything seen during this hype phase. A lot of lesser AI companies will follow the same path as Anthropic.


To be fair to OpenAI, their privacy policy[0] does provide some detail. They don't mention Mixpanel explicitly, but OpenAI does mention they share your information with third-party web analytics services:

> To assist us in meeting business operations needs and to perform certain services and functions, we may disclose Personal Data to vendors and service providers, including providers of ... web analytics services ...

OpenAI likely provides this disclosure to comply with US state privacy laws, but it's inaccurate to say they didn't disclose that they share your information.

[0] https://openai.com/policies/privacy-policy/


Not exactly. Step E in the blog post:

> Gemini exfiltrates the data via the browser subagent: Gemini invokes a browser subagent per the prompt injection, instructing the subagent to open the dangerous URL that contains the user's credentials.

fulfills the requirements for being able to change external state


I disagree. No state "owned" by the LLM changed; it only sent a request to the internet like any other.

EDIT: In other words, the LLM didn't change any state it has access to.

To stretch this further: clicking on search results changes the internal state of Google. Would you consider this ability of an LLM to be state-changing? Where would you draw the line?


[EDIT]

I should have included the full C option:

Change state or communicate externally. The ability to call `cat` and then read results would "activate" the C option in my opinion.


> Also, ARC AGI reported they've been unable to independently replicate OpenAI's claimed breakthrough score from December

Can you elaborate on this? Where did ARC AGI report that? From ARC AGI[0]:

> ARC Prize Foundation was invited by OpenAI to join their “12 Days Of OpenAI.” Here, we shared the results of their first o3 model, o3-preview, on ARC-AGI. It set a new high-water mark for test-time compute, applying near-max resources to the ARC-AGI benchmark.

> We announced that o3-preview (low compute) scored 76% on ARC-AGI-1 Semi Private Eval set and was eligible for our public leaderboard. When we lifted the compute limits, o3-preview (high compute) scored 88%. This was a clear demonstration of what the model could do with unrestricted test-time resources. Both scores were verified to be state of the art.

That makes it sound like ARC AGI were the ones running the original test with o3.

What they say they haven't been able to reproduce is o3-preview's performance with the production versions of o3. They attribute this to the production versions being given less compute than the versions they ran in the test.

[0] https://arcprize.org/blog/analyzing-o3-with-arc-agi

