Ask HN: Is it just me or is Claude Code getting worse?
29 points by e-nouri 2 days ago | 20 comments
Is it just me, or is Claude Code getting worse and worse? Since they introduced the 1 million token context on 4.6, things have started to go really bad. Am I the only one? PS: I am still paying for the 200 euro monthly Max plan.



Claude the model is still insanely great IF (and perhaps ONLY IF) you are willing to fork over the money for the API and use a harness like OpenCode.

Claude Code itself is complete trash. They had a massive head start and are now routinely lapped by open source harnesses, and then they STILL double down on not allowing e.g. OpenCode usage with the Max plan. Meanwhile, OpenAI lets you use whatever harness you want, and it's a beast. I recently did some testing, and OpenAI's Pro plan on an OpenCode harness (GPT 5.5 XHigh) with parallel agent delegation absolutely smokes Claude Code 4.7 Max. These days Claude Code can barely even remember its CLAUDE.MD instructions. I'd say Opus 4.7 Max API is slightly better than GPT 5.5 XHigh, but not nearly enough that the API token price is at all justified.

Claude, I think, is still better for business things like document generation, design, etc., especially via the claude.ai interface (GDrive integrations and things like that are very useful). But for code generation and dev workflows, Claude Code is dropping the ball so hard it's starting to look like a generational fumble.


Why do you feel the OpenCode harness is better than the Claude Code one? Just curious what you feel it is doing better?

The OpenCode harness is so good I'm surprised one of the big players hasn't bought them outright. Essentially, their harness:

* Removes all the system prompt cruft and bullshit that CC pumps into the prompt and that pollutes context, including shit like "adaptive thinking"

* Is extremely good at keeping the model aligned with AGENTS.MD and opencode.json, and at using all the features available there (parallel agents, sub-sub agents, etc.)

For example, I'm working on a repo with 5 distinct components, and I have a specialized agent for each component. CLAUDE.MD is just a markdown file where I say "Hey Claude, always use X agent for X component. X agent has this prompt blah blah" and then pray Claude remembers to use it. opencode.json is a structured file used by the harness, and it has ALWAYS coerced the model into using it, including letting agents delegate to subagents in parallel, etc.
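
To make that concrete, here's a minimal sketch of per-component agent entries in opencode.json (the component and agent names are made up for illustration; see OpenCode's config docs for the real schema):

  {
    "$schema": "https://opencode.ai/config.json",
    "agent": {
      "componentA_agent": {
        "description": "Specialist for component A",
        "prompt": "You are the component A specialist. Only touch the component A code."
      },
      "componentB_agent": {
        "description": "Specialist for component B",
        "prompt": "You are the component B specialist. Only touch the component B code."
      }
    }
  }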

This makes a massive difference. So if I have a feature that touches multiple components, OpenCode rips through it with the specialized subagents, while Claude sits there spinning its wheels, occasionally remembering there's a specialized agent, and then maybe once in a blue moon it will do it in parallel.

With CC I feel like I need to do all these invocations and coercions. OpenCode, once you've got your opencode.json and agents defined, just works.


Is there a guide you can link for OpenCode usage like this? I just use Codex and find it's generally really good. What you are describing sounds like a bit of an unlock.

For sure. Here's a quickstart guide for you.

First things first, get OpenCode installed (https://opencode.ai/) and connected to your provider (works with almost everything except Claude Max).

Then the workflow is like this for each repo (both greenfield and existing):

1. Create an opencode.json in the repo root. opencode.json is the harness config. It tells the system what provider/model endpoint to use, which instruction files to load, what the default agent is, what specialist agents exist, and which slash commands route to which agent/workflow. For now, it can be very simple and barebones.

2. If you have any existing CLAUDE.MD or AGENTS.MD files you like to use, you can point to them in opencode.json via the "instructions" key. Here's a sample config (or look at https://opencode.ai/docs/config/):

  {
    "$schema": "https://opencode.ai/config.json",
    "instructions": ["AGENTS.md"],
    "default_agent": "build",
    "agent": {
      "coordinator": {
        "description": "Lead coordinator",
        "prompt": "This will be the prompt for your coordinator agent",
        "agents": [
          "componentA_build",
          "componentA_meta"
        ]
      }
    }
  }

3 [the most crucial step]. Create a plan of how the repo is structured to feed back into the harness so it can generate its own tailored config.

Of course, this is where your own critical thinking comes in. You're going to be prompting the default agent to start extending this opencode.json to fit your project. I found that most models are relatively poor at sussing out the "why" of an existing codebase; they are much, much more focused on the how and the what.

So, much like how companies ship their org chart, with gen AI code you ship your architecture. If you don't have a good mental model of how the system should work (not that it HAS to work that way now, but how it should), then you won't get anywhere. If you can provide the "why", that is the connective tissue that really makes the difference.

For instance, let's say you are working on a railroad simulation game with a modular architecture: there's one central "GameController" which references submodules like "CityController", "TrainController" and so on. You'd then want to create agents that specialize in things like cities, trains, railroads, and buildings, with a high-level "coordinator" agent that has access to every subagent (as defined in the "agents" key). And then you get to the fun part of making higher-order agents: an agent specializing in game balance that has CityAgent, TrainAgent, GameAgent, etc. as subagents; a "RouteAgent" specializing in efficient routing calculations that has RailroadAgent, TrainAgent, etc. as subagents; and so on.
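
To sketch what that hierarchy could look like in config form (the agent names and descriptions here are hypothetical, reusing the "agent"/"agents" keys from the sample above):

  {
    "$schema": "https://opencode.ai/config.json",
    "agent": {
      "coordinator": {
        "description": "Lead coordinator with access to every subagent",
        "agents": ["city", "train", "railroad", "building", "balance", "route"]
      },
      "balance": {
        "description": "Higher-order agent for game balance",
        "agents": ["city", "train", "game"]
      },
      "route": {
        "description": "Higher-order agent for efficient routing calculations",
        "agents": ["railroad", "train"]
      },
      "city": { "description": "Specialist for CityController" },
      "train": { "description": "Specialist for TrainController" },
      "game": { "description": "Specialist for GameController" },
      "railroad": { "description": "Specialist for railroad code" },
      "building": { "description": "Specialist for building systems" }
    }
  }

The point being that the higher-order agents ("balance", "route") are themselves composed from the component specialists.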

I found that in this step it helps to braindump into a scratch pad: write out, at a high level, as much as you can about how it all operates, the architecture, and the agents you want. Having a defined OUTPUT is the most important part of the agentic process and prompting. Key things I have in this braindump:

a. Architecture overview, both "actual" and "idealized" versions (what do we want the end game to look like, e.g. a full-fledged train sim with a stock market and city building, even if currently it's just an isometric map with some buttons on it).

b. Core components of the system

c. Key file locations, and any workflows like processing images, audio, etc. for the game

d. Any design decisions and whatnot you made that aren't captured in code or documented (e.g. "I'm basing the economy off of XY Game")

One thing that's VERY helpful at this phase is having the default agents ("Build" and "Plan" with OpenCode) go through and document all the code, capturing as much as they can into a docs folder, if you don't have one already.

4. Take that braindump and feed it back into OpenCode, and tell it to modify its opencode.json to have agents handle all the components and architecture. Tell it to parallelize and delegate as much as possible.

5. It'll output a new opencode.json. Restart opencode and go wild.

As you work with the harness, you'll pick up the nuances of how the agents interact and what needs tweaking, but the key here is to always keep the feedback loop going. Tell the agents to always update the docs after committing code, to always read the docs before doing anything, etc. This is key to making sure the agents don't go off the rails.

Eventually you'll see the meta-patterns you like and create scripts to e.g. autogenerate this kind of harness for any repo you encounter. I don't have one "source of truth" opencode.json but rather a base template and a Python script that does all of the above automagically for my workflows.
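
For what it's worth, the base-template idea can be as simple as an opencode.json with placeholders; the {{...}} tokens below are purely my own convention (not anything OpenCode defines), and the script just substitutes per-repo values before writing the final config:

  {
    "$schema": "https://opencode.ai/config.json",
    "instructions": ["AGENTS.md"],
    "agent": {
      "coordinator": {
        "description": "Lead coordinator",
        "prompt": "{{COORDINATOR_PROMPT}}",
        "agents": ["{{COMPONENT_AGENTS}}"]
      }
    }
  }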

The key insight here, I think, is like learning Lisp. The harness IS code, agents ARE code: they can be modified and adapted dynamically as needed. They are first-class citizens and can be composed like functions or chained together like graphs. The map is the territory. Once I realized that I could prompt the harness to modify itself, and to do to itself everything it does to the codebase, things really took off for me.

Anecdata disclaimer: This workflow might not fit everyone's mental coding model. Adapt as needed. I use literally dozens of these kinds of harness configs every day at work, including meta-harnesses with code review of harnesses and EvalOps, and have been doing so for about six months now. I'll also note we are very serious about performance and feature degradation, but with this approach we've had fewer rollbacks and regressions than in the previous five years.


tysm for taking the time to write this in detail

Thank you, I appreciate the explanation.

In what way is it getting worse — the model's reasoning, or your own setup drifting underneath you?

My observation: same model, same task, but different CLAUDE.md / hooks / skill state produces dramatically different outputs.

The hard part as a solo founder is finding the right balance between building the meta tools and making progress on the actual projects, without the meta layer drifting and quietly becoming worse over time.


I second that.

With one small caveat: reporails can help a lot with drift detection.


My issue was with different models. Same CLAUDE.md, hooks, skills, and the rest. Same task. Almost two months ago I used CC to bootstrap a headless MacBook Air with nix-darwin for management. I configured it to be an iMessage bridge as well as a secondary DNS for my network. All in, it took about an hour.

Last night, same task with the same context, but a newer Opus model. This time it was a freshly installed MBP that I wanted nix-darwin set up on, to keep my tools and console config in sync across systems. To start with, it tried to install some proprietary Nix version, and it couldn't fix a broken SSH terminal issue at all, despite having a working example literally sitting right next to it: my working MBA config.

Latest version of CC feels lobotomized.


Fair point. I don't see it consistently myself, but there are definitely moments where it just stops parsing context on simple tasks and stops thinking. Not sure if it's model regression, context rot, or prompts getting routed to a different/wrong expert in the MoE for a stretch.

Due to the recent Copilot nerfing I've switched to Codex, and GPT 5.4 (and now especially 5.5) has been doing pretty great.

But even Codex has these super weird time limits. It's really starting to show that these companies must have been losing a ton of money, given all the recent limits and degradation.

I'm still in the camp that most of these unicorns will be F'ed by open and local models in the next few years, at least in these coding/chatbot niches, and then they'll just be perpetually (re)searching for AGI :shrug:


I've switched to OpenCode and I think it's really great.

The first advantage is that it can use many different models, not just Claude. The second is that I've noticed OpenCode streams out the entire reasoning process, while Claude Code doesn't stream its thinking.


Can't say it became worse, but at some point it stopped being as useful as it was before. It's like the magic disappeared.

You are not crazy, Anthropic makes Claude dumber every day. Although, in the last week or so, token consumption and model intelligence have improved for me.

It's not just you. It's getting much worse. There is a lot of talk on X about it, along with hypotheses and evidence-based testing.

Can you provide any links to discussion?

I can't say which is better, but 4.6 was the most intense.

Yeah buddy, Claude Code is honestly getting worse lately. It's been giving me buggy or incorrect code for my projects as well.

I keep telling people, but no one will listen: these things are not sustainable, and no productivity can be gotten from these tools as they are. You must compose your context and account for every bit you consume. These agents operate like a casino throwing tokens, sometimes getting it right, but you will not make any meaningful progress unless you learn to control context and snowball the conversation. It's a more complex, iterative process for which there is no established practice yet. This is a far more advanced level of programming, one that allows us to make bigger systems less complex. There are unusual ways of using this, yes.

We're looking at pricing well above $10,000/year. Call me crazy all you want; you'll suddenly stop when you realize I was right. There's only one way: total context control and a simple interface. I had to create a simple interface because the tool I was using released an update and I couldn't wait. So you'll all end up using something similar to what I made, with ChatGPT, to use the API directly. Combined with VS Code, it makes for a very easy, natural way of consuming tokens and generally working with this. You can just assume the file is prompt.md, that you have such a file in every directory where you intend to execute the command, and that it's available at that path.

When you were ALL paying for subscriptions, I was paying for the API, which cost me much less than a subscription, and I was less stressed, knowing I didn't have to worry about the fog of context.

I see it now: it's not sustainable. They've signed contracts they can't get out of, and we're gonna have to pay, with blood, gold, or, in this case, quality.

You will pay, you (we) will all pay for their debts.



