Can't we simply parse out and remove any style="display: none;", aria-hidden="true", and tabindex="1" attributes before the text is processed and get around this trick? What am I missing?
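Stripping the inline tricks is easy enough; a rough sketch with BeautifulSoup (the honeypot markup below is made up, and anything hidden via an external CSS class would slip right through):

    from bs4 import BeautifulSoup

    html = """
    <p>Visible text.</p>
    <a href="/trap" style="display: none;">honeypot link</a>
    <span aria-hidden="true">decoy</span>
    """

    soup = BeautifulSoup(html, "html.parser")

    # Drop elements hidden via inline style or aria-hidden.
    for el in soup.find_all(style=lambda s: s and "display:none" in s.replace(" ", "")):
        el.decompose()
    for el in soup.find_all(attrs={"aria-hidden": "true"}):
        el.decompose()

    print(soup.get_text(" ", strip=True))  # -> "Visible text."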
If you do that and don't follow robots.txt, you are blocked. If you do that and follow robots.txt, fine. That's all we wanted you to do anyway. Just follow the instructions that well-behaved scrapers are meant to follow.
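For reference, honoring robots.txt is a few lines with Python's standard library (the URL and user-agent below are placeholders):

    from urllib import robotparser

    rp = robotparser.RobotFileParser()
    rp.set_url("https://example.com/robots.txt")
    rp.read()

    # Only fetch pages the site's robots.txt allows for our user-agent.
    if rp.can_fetch("MyCrawler/1.0", "https://example.com/some/page"):
        print("allowed, go ahead")
    else:
        print("disallowed, skip it")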
Just have the link visible, but CSS it so that it's either small as hell or just off screen. Google / bots will follow it; real people will never see it.
That solution can be recreated by a skilled, AI-boosted senior platform engineer in a few days, and parity achieved in a few weeks. Nothing of value was lost.
Is "how programmers work" a useful and provable metric? No? Then it belongs in philosophy discussions. How you work and how I work is different. Your work may have ended up in the LLM training and my work did not. Or vice versa.
Can you objectively analyze how VSCode adapts to your way of working without our interference?
Did you test your theory with the actual frontier LLMs (which Kimi K2.5 is not, BTW)?
I enjoy the journey too. The journey is building systems, not coding. Coding was always the most tedious and least interesting part of it. Thinking about the system, thinking about its implementation details, iterating and making it better and better. Nothing has changed with AI. My ambition grew with the technology. Now I don't waste time on simple systems. I can get to work doing what I've always thought would be impossible, or take years. I can fail faster than ever and pivot sooner.
It's the best thing to happen to systems engineering.
My experience was exactly the opposite: I came from the other side entirely. I had absolutely no programming knowledge, and until three weeks ago I didn’t even know what a Parquet file was.
While reviewing a deep research project I had started, I stumbled upon an inefficiency: the USDA’s phytochemical database is publicly accessible, but it’s spread across 16 CSV files with unclear links between them. I had the idea to create a single flat table, enriched with data from PubMed, ChEMBL, and patents. Normally a project like this would have been completely impossible for someone like me; the programming hurdle is far too high.
With Claude Opus 4.6, I was actually able to focus entirely on the problem architecture: which data, from where, in what form, for which target audience. Every decision about the system was mine. Claude Opus took care of the implementation.
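The core of what Claude produced boils down to something like this (a minimal sketch; the file names, columns, and join keys here are illustrative, not the real USDA schema):

    import pandas as pd

    # Illustrative only: the real dump is 16 CSVs with their own keys.
    food = pd.read_csv("food.csv")                    # food_id, description
    nutrient = pd.read_csv("nutrient.csv")            # nutrient_id, name, unit
    food_nutrient = pd.read_csv("food_nutrient.csv")  # food_id, nutrient_id, amount

    # Join the link table against both dimension tables.
    flat = (
        food_nutrient
        .merge(food, on="food_id", how="left")
        .merge(nutrient, on="nutrient_id", how="left")
    )

    # One flat, typed, compressed table instead of 16 loose CSVs.
    flat.to_parquet("phytochemicals.parquet", index=False)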
I’m probably the person your debate about “journey vs. destination” wasn’t meant for. For me, the destination was previously unattainable. My journey became possible because the AI took over the part that I could never have implemented anyway.
I hear everyone say "the LLM lets me focus on the broader context and architecture", but in my experience the architecture is made of the small decisions in the individual components. If I'm writing a complex system, part of getting the primitives and interfaces right is experiencing the friction of using them. If code is "free", I can write a bad system because I never experience using it; the LLM abstracts away the rough edges.
I'm working with a team that was an early adopter of LLMs, and their architecture is full of unknown-unknowns that they would have thought through if they had actually written the code themselves. There are impedance mismatches everywhere, but they can just produce more code to wrap the old code. It makes the system brittle and hard to maintain.
It's not a new problem, I've worked at places where people made these mistakes before. But as time goes on it seems like _most_ systems will accumulate multiple layers of slop because it's increasingly cheap to just add more mud to the ball of mud.
This matches my experience when building my first real project with Claude. The architectural decisions were entirely up to me: I researched which data sources, which schema, and which enrichment logic to use. But I had no way of verifying whether these decisions were actually good (no programming knowledge) until Claude Opus had implemented them.
The feedback loop is different when you don’t write the code yourself. You describe a system to the AI, after a few lines of code the result appears, and then you find out whether your own mental model was actually sound. In my first attempts, it definitely wasn’t. This friction proved to be useful; it just wasn’t the friction I had expected at the beginning.
Maybe it's just my experience, because I'm not a systems programmer but someone learning it. I find that the concepts in systems programming are not really very hard to understand (e.g. the log-structured file system I'm reading about today), but the implementation, the actual coding, the actual weaving of the system, is most of the fun/grit. Maybe it's just me, but I find that for systems programming I have to implement every part of it before claiming that I understand it.
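To make that concrete: the concept of a log fits in a toy sketch like the one below (Python, made-up record format, nothing like a real log-structured file system). The grit is everything the sketch leaves out: checksums, segment cleaning, crash recovery, the in-memory index.

    import os
    import struct

    class AppendLog:
        """Toy append-only log of length-prefixed records."""

        def __init__(self, path):
            self.f = open(path, "ab")

        def append(self, record: bytes) -> int:
            offset = self.f.tell()                    # records are addressed by offset
            self.f.write(struct.pack("<I", len(record)))
            self.f.write(record)
            self.f.flush()
            os.fsync(self.f.fileno())                 # durable before we acknowledge
            return offset

        def replay(self):
            with open(self.f.name, "rb") as f:
                while header := f.read(4):
                    (size,) = struct.unpack("<I", header)
                    yield f.read(size)

    log = AppendLog("/tmp/toy.log")
    log.append(b"set x=1")
    log.append(b"set y=2")
    print(list(log.replay()))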
So much agreed. I'm constraining my AI, which always wants to add more dependencies, create unnecessary code, and broaden tests to the point they become useless. I have in mind what I want it to build, and now I have workflows to make sure it does so effectively.
I also ask it a lot of questions about my assumptions, and so "we" (me and the AI) find better solutions than either of us could come up with on our own.
It's going to be faster no matter what. My M3 Max prints tokens faster than I can read for the new MoE models. It's the prompt processing that kills it when the context grows beyond a threshold, which is easy to do in modern agentic loops.
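Back-of-the-envelope, with made-up speeds (not benchmarks of any particular machine or model), the prefill term is what blows up:

    # Time to first token scales with prompt length; decode speed doesn't.
    prefill_tps = 200.0   # assumed prompt-processing speed, tokens/s
    decode_tps = 40.0     # assumed generation speed, tokens/s
    output_tokens = 500

    for prompt_tokens in (2_000, 20_000, 100_000):
        ttft = prompt_tokens / prefill_tps     # seconds before the first token
        decode = output_tokens / decode_tps    # seconds to generate the reply
        print(f"{prompt_tokens:>7} prompt tokens: {ttft:6.1f}s prefill + {decode:.1f}s decode")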
Of course the US is going to do this, and of course it's in Anthropic's best interest to comply. Right now China is flooding HuggingFace with models that will inevitably have this capability. Right now there are hundreds of hosted models that have been deliberately processed to remove refusals and their safety training. Everyone who keeps up with this knows about it. HF knows about it. And it is pretty obvious that those open weight models will be deployed in intelligence and defense. It is certain that not just China, but many nations around the world with the capital to host a few powerful servers running the top open weight models, are going to use them for that capability.
The narrative on social media, this site included, is to portray the closed western labs as the bad guys and the less capable labs releasing their distilled open weight models to the world as the good guys.
Right now a kid can go download an Abliterated version of a capable open weight model and they can go wild with it.
But let's worry about what the US DoD is doing or what the western AI companies absolutely dominating the market are doing because that's what drives engagement and clicks.
China is certainly lax, but the US doesn't allow autonomous attack systems. For attack systems it is always required that a human makes the judgement call on when to attack.
Or at least it didn't until the current regime.
The US does have autonomous defensive systems.
I could be wrong though; can you post your evidence? The closest I could find is loitering munitions.
Even so, a company shouldn't be forced to go against its ethics if those ethics help humans.
Drone pilots don't get any info about their target, certainly not enough to make a judgement call. If they object (or burn out) someone else is put in the chair.
People are conscripted, they put on the uniform and become legitimate targets? It might as well be a robot doing the shooting. Same difference.
The pilot becomes responsible for those outcomes. Indiscriminately killing civilians, for example, is a war crime. It's easier to get an AI to commit war crimes than a human.
Perhaps, but I don't know if the difference is significant. Everything changes when we try to stretch rhetoric from stabbing someone with a sword to launching hypersonic missiles. We might hold the pilot responsible if they erase a building, but I'm far less comfortable blaming them. We know the targets are actually picked by computers using metadata. The difference gets increasingly vague.
> Right now a kid can go download an Abliterated version of a capable open weight model and they can go wild with it.
Is the reason to ban or block free open weight models that you're worried what kids will do with them?
I'd imagine the economic case to be made is that the Western AI companies will ultimately not be able to compete with free open weight models. Additionally, open weight models will help to spread the economic gains by not letting a few monopolies capture them behind regulatory red tape.
Finally, I'd say the geopolitics angle on why open weight models are better is that if the West controls the open source software the world ends up running on, it will be able to reap the benefits that soft power brings with it.
I'm going to claim that the majority of those users are optimizing for cost and not correctness and therefore the quality of data collected from those sessions is questionable. If you're working on something of consequence, you're not using those platforms. If you're a tinkerer pinching pennies, sure.
This is a weird dichotomy and I don't agree with it. You don't need to have bags of money to burn to work on serious things. You also can value correctness if you're poor.
ChatGPT, Gemini and Claude are banned in China. Chinese model providers are getting absolutely massive amounts of very valuable user feedback from users in China.
Their models are specifically trained for their tools, for example the `apply_patch` tool. You would think it's just another file-editing tool, but its unique diff format is trained into their models, and it works better than the generic file-editing tools implemented in other clients. I can also confirm their compaction is best in class. I've implemented my own client using their API, and gpt-5.2 can work for hours and process millions of input tokens very effectively.
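For anyone curious, exposing such a tool from a custom client is just ordinary function calling. A minimal sketch; the schema is my own simplification, the actual patch format the model emits is OpenAI's, and applying the patch client-side is elided:

    from openai import OpenAI

    client = OpenAI()

    # Simplified stand-in for the real apply_patch schema.
    tools = [{
        "type": "function",
        "function": {
            "name": "apply_patch",
            "description": "Apply a patch to files in the workspace.",
            "parameters": {
                "type": "object",
                "properties": {
                    "patch": {
                        "type": "string",
                        "description": "Patch text in the model's native diff format.",
                    },
                },
                "required": ["patch"],
            },
        },
    }]

    resp = client.chat.completions.create(
        model="gpt-5.2",  # model name as used above
        messages=[{"role": "user", "content": "Rename foo() to bar() in util.py"}],
        tools=tools,
    )

    for call in resp.choices[0].message.tool_calls or []:
        print(call.function.name, call.function.arguments)  # apply it, then loop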