Hacker News | Planktonne's comments

I think it would be slightly different if you built and advertised a viewing gallery for that purpose.

> I thought this would be simple

Rationalising global location data across several hundred years based on extracting real-world references from complex and metaphor-laden text.

Every single part of that should trigger a 'definitely complicated' warning bell.


> The prompt is not what you meant to say. It is part of the private act of thinking

The private act of thinking does not involve a round trip to a corporation's servers; that would make the whole term meaningless.


> The private act of thinking does not involve a round trip to a corporation's servers; that would make the whole term meaningless.

Firstly, this is just wrong. People can and do search stuff online as they're writing something up.

Second, your distinction is the one that's meaningless. The LLM could be running locally on a private machine...


1. Yes, but the research isn't part of the 'private act of thinking'; it's an input to it.

2. It could; it basically never is though, so let's deal with the general case.


You’re missing the point. OK, it’s not ‘thinking’ in a pure sense; the point is it’s private notes. If someone says something to you, they are allowed to privately prepare it first (thinking, jotted notes, LLM rewriting, running it past lawyers, whatever) and you don’t get to see all that. That’s how it has always been.

Just because someone says/writes something to you doesn’t mean you have any moral right to see all the things they considered and then decided not to say.


> The LLM generated writing obviously felt significantly better than my own writing.

A general pattern for LLMs is that they look really good at things you are bad at. What that means is that if you find yourself thinking of its output as significantly better than yours in a particular domain, there's a high chance that you are not equipped to judge that quality effectively.


> A general pattern for LLMs is that they look really good at things you are bad at.

This is true for coding, too, which I think, to a large degree, might explain the polarized differences in opinions on HN about the quality of LLM-produced code. You have the 1. "AI produces code better than I could possibly write, one shots things it would take me days to do, and has made me 10X more productive!" camp, and you have the 2. "AI constantly produces poor code needing rework, makes mistakes, has to be babysat, and ultimately costs me time!" camp, with a spectrum in between those. How could the output of the same product be seen so differently? Well, I have bad news for camp 1...


I've caught Claude Code generating some pretty egregious security vulnerabilities. I'm using it to build an AI RPG site and the goal is to use web assembly as a bridge between author submitted code and LLMs in order to help shore up state management at the game level.

The language that I picked for the game runtime is Python. Claude really thought that the best way to validate user submitted Python was to bypass the WASM sandbox and execute it within the application container using shell exec - essentially opening up an RCE vulnerability.
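For illustration, here is a minimal sketch of checking user-submitted Python without ever executing it, using the standard library's `ast` module. The function name `validate_source` and the `FORBIDDEN_CALLS` list are hypothetical and not taken from the project described above. A static check like this is not a sandbox: it only rejects obviously unwanted code early, and anything that passes would still need to run inside the WASM sandbox.

```python
import ast

# Hypothetical denylist for illustration only; a real project would tune this.
FORBIDDEN_CALLS = {"eval", "exec", "compile", "__import__", "open"}


def validate_source(source: str) -> list[str]:
    """Return a list of problems found in the code, without running it."""
    try:
        tree = ast.parse(source)  # parses only; nothing is executed
    except SyntaxError as err:
        return [f"syntax error on line {err.lineno}: {err.msg}"]

    problems = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            problems.append(f"line {node.lineno}: imports are not allowed")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in FORBIDDEN_CALLS:
                problems.append(
                    f"line {node.lineno}: call to {node.func.id}() is not allowed"
                )
    return problems
```

A denylist like this is easy to bypass (attribute access, aliasing, and so on), which is why it can only complement real isolation and never replace it.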

I also find that the quality of Claude Code degrades substantially. Claude really wants to implement every feature in as bespoke a way as possible. This is fine when you first generate the project but over time you'll find that every web modal is implemented differently. Every button is different. Business logic is disconnected. It's why agentically produced codebases are MUCH larger than they should be; every feature is developed in a vacuum.

Then I'm trying to shove stuff in my AGENTS.md or CLAUDE.md files like "ALWAYS look for existing patterns within the codebase to keep it consistent." But the harness doesn't always work and it'll generate useless, verbose code anyways.

In some cases it's useful - like if I am shaky on the DSA knowledge needed for a specific operation or optimization then Claude can replace Stackoverflow. But, man, I'm so frustrated with it.


> Business logic is disconnected. It's why agentically produced codebases are MUCH larger than they should be; every feature is developed in a vacuum.

I just had Opus 4.7 build a feature twice, because it didn't close a ticket the first time. (I'm trying to solo-build a fairly large greenfield project, and am at the point where I let it go ham over my codebase because of the scale of things.)

I then spent a couple of hours asking it to compare the features. It argued that they were completely different features for a while, then eventually acquiesced and said that they were redundant.

That's a couple of thousand tokens and time I'm never going to get back.


It acquiesced for now; soon this knowledge will slip out of its context window.

> I'm using it to build an AI RPG site

I am trying something similar: basically a LLM managed MUD for people with too little time for roleplaying.

If you want to chat, hit me up at mail-from-ohsohumble@f12n.de


I think there are some factors beyond just skill too - the kinds of tasks you're giving the AI, and how involved you are in ensuring the output is good (via either extensive planning guidance, extensive review/testing, or a combination).

I came up with an analogy the other day -

The 'camp 1' people in the pre-LLM days were probably the ones that often just copy+pasted code from SO they didn't really understand, but since the code seemed to work when they ran it, they thought it was all fine and continued on.

Whereas the 'camp 2' people when trying to find an answer to something, discarded 99% of SO and other similar answers, having the knowledge to see how they had broken edge cases, were limited in some way, didn't actually solve the underlying problem, etc.

Nowadays, 'camp 1' people just use the LLM output and it "seems to work" and consider it all fine. Whereas the 'camp 2' people still continuously see all the faults with it.


Being faster than humans at mundane and verifiable tasks is a useful thing. Great for format conversions, API mappings, etc. If you don't understand the algorithm you are asking it to implement, you had better at least understand how to generate a large set of correct input and output pairs yourself, because it will absolutely make stuff up and adjust the test cases to pass.
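One way to generate input and output pairs yourself is to write a slow but obviously correct reference and use it as the oracle for random inputs. The sketch below is an assumption about what that could look like: the maximum-subarray problem and all function names are picked purely for illustration, and `fast_max_subarray` stands in for whatever code the LLM produced.

```python
import random


def reference_max_subarray(nums):
    """Slow but easy to verify by eye: try every contiguous slice."""
    return max(
        sum(nums[i:j])
        for i in range(len(nums))
        for j in range(i + 1, len(nums) + 1)
    )


def fast_max_subarray(nums):
    """Stand-in for the generated code under test (Kadane's algorithm)."""
    best = current = nums[0]
    for x in nums[1:]:
        current = max(x, current + x)
        best = max(best, current)
    return best


def make_cases(count, seed=0):
    """Build (input, expected) pairs from the reference, not from the code under test."""
    rng = random.Random(seed)
    cases = []
    for _ in range(count):
        nums = [rng.randint(-10, 10) for _ in range(rng.randint(1, 8))]
        cases.append((nums, reference_max_subarray(nums)))
    return cases


def check(implementation, cases):
    """Return every (input, expected, actual) triple where the implementation is wrong."""
    failures = []
    for nums, expected in cases:
        actual = implementation(nums)
        if actual != expected:
            failures.append((nums, expected, actual))
    return failures
```

Because the expected values come from the reference rather than from the agent, the agent cannot quietly adjust them to make its own code pass.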

I'm in camp 3, where sometimes I don't really care how good or bad the code is. For internal tools for example, you can let the LLM crunch out code really fast, you can validate output but don't even have to look at the code. These kinds of "weekend projects" can get finished in an hour or two, and so are really 10x.

For bigger production ready code, you indeed have to guard the architecture. But for the code, in some corners you can get away with sloppy code, as long as it kind of works.

What I'm saying is, code doesn't always have to be great. You will just have to judge the places where it needs to be high quality, and other places where you can get away with sloppy code.


That judgment is an essential skill of an experienced programmer, and it is required at every level of the big picture, from high level architecture decisions to the development of particular features: what should I polish and what needs to be developed fast? How exactly should I cut corners in the safest way?

So there are still only two camps


LLMs can generate code, but the quality of the code at scale is just not there currently by all important metrics such as security, maintainability, separation of concerns, etc.

Today, it's a kind of chaos magic wherein you summon the beast and try your best to contain him, knowing that someone will probably die in the process. Sometimes literally. It's still a force multiplier in the right hands and domain, and agentic coding is a paradigm that won't retract, at least until something better supplants it.

The problem is that few engineers actually have the discipline available to constrain these models appropriately and instead rely on a hodgepodge network of "skills" aka prompt fragments which are passed around and glued together.

I consider myself as having such discipline, being strongly architecturally-minded, user-first, etc. in both design and implementation. And I still struggle to contain the beast many days. I just got through screaming at Claude for intentionally taking a shortcut that I'd forbidden, leading to a ton of wasted time and tokens.

Sometimes I feel like I saved weeks of R&D with a single ten-minute task handed off to an agent, other times I feel like I'd get better returns playing slots in Vegas at the alarming rate Claude burns through money.


I'm tired as fuck of anti-ai zealots pretending like every human is a fucking paragon at programming. I've literally never seen Claude Code produce as bad of code as generated by humans. Literally never. Yet the anti-ai zealots pretend like humans never introduce a bug into a system. Only LLMs produce slop or take shortcuts or ignore tests or do incredibly dumb fucking shit. It's fucking ridiculous. As if The Daily WTF didn't exist before LLMs. The reality is the "average" programmer is far below the skill floor of Claude Code or other frontier models. Those models will write tests and explore more edge cases than the "average" developer ever will. But all these zealots pretend like they have only ever worked with the top 1% of the top 1% who never make mistakes or introduce bugs. Ultimately they are full of shit. You're lucky as fuck if your developers can even tell you what common design patterns are. The bar is that low and the HN crowd likes to pretend every developer is Linus Torvalds and not a clueless moron desperately coordinating API layers.

Is this comment directed at me? Where did you get an anti-AI sentiment from my comment?

I’d probably word it differently but I agree with much of the sentiment here. I’m also reminded of the stat where 93% of drivers rated themselves as above average.

I used an LLM to teach me how to code and get through obstacles that would have me spending a lot of time doing ???. Typically, I just write code that I know is, a lot of the time, absolutely wrong, but the LLM helpfully points out mistakes.

I am slowly doing more of my own code and cutting the LLM out of the loop in the unfamiliar territory I am working in.

My main concern is not so much productivity but understanding the code I have written and feeling agency over it.

The LLM is a very good teacher.


> Well, I have bad news for camp 1...

It's bad if they work in a part of the industry where code quality or efficiency matters. That's maybe 10% of the total though.


I think it matters everywhere -- just because some fields get away with making trash doesn't mean that they're not vulnerable to people taking their lunch by making something distinctly-not-trash. People put up with a lot when there's lock-in, but there's a breaking point. (I say this while using a Linux desktop about 90% of the time now, because Windows has become such a fucking disaster.)

Being vulnerable makes money unfortunately. And making money now has always been seen as more important than being sustainable in the long-term. Even if an exploit later takes away every cent of earnings.

Companies that don't care about code quality always care about the side effects of poor code quality. They just can't connect the dots.

I see this sentiment occasionally brought up, and at the same time see what’s happening to GitHub where the majority of their distributions is not security or efficiency related (not saying it’s because of LLMs, we don’t know). The point is, these things matter beyond beautiful code. You lose trust and you lose customers and money.

Are you seriously implying that technical debt is something that doesn't exist or something that managers don't care about??

Yup, pretty much.

The hard part too is it's not like you can just learn the basics and be able to tell good code apart from bad -- the more you learn to code, the more intricate your understanding of good code is. It's like becoming a good writer; just knowing grammar and spelling doesn't make your writing interesting. Not to mention that there's just a lot of bad advice out there that you can't recognize as bad advice if you're not a regular practitioner. Like, "Clean Code" is IMO a terrible book, but a ton of people follow it because it has the sheen of respectability... until, hopefully, they learn some new patterns and realize those old ones aren't very good. But you pick these things up with experience and doing the work! Otherwise if you're just reading other people's opinions, you'll see a bunch of people say "Clean Code is great" and a bunch of other people say it's rubbish, and you'll have no way to know who you should listen to. (If you disagree with me on Clean Code the book that's fine -- I'm just using it to make a point -- sub in a different book/ideology if it suits you)

I think looking at LLM code and thinking you're now a coder is like watching someone play guitar and thinking you can just pick up a guitar and play a song. The truth is, if you want to be good, you have to do the work.

One of the things I hate about AI is that we're going to have a generation of "programmers" that are absolutely shit at programming, create problems for everyone else, and will have absolutely no idea how bad they are. And they'll probably never get better, because you can't get better by just asking claude to do shit for you. And then the LLMs themselves will probably start to degrade because they'll be trained on the slop since it'll heavily outnumber handwritten code..


> I think looking at LLM code and thinking you're now a coder is like watching someone play guitar and thinking you can just pick up a guitar and play a song. The truth is, if you want to be good, you have to do the work.

So many posts here on HN claiming they created another useful tool with AI.

No, you didn't create it. AI did. You only had a supporting role. You're Ringo Starr and the AI is John Lennon.


> No, you didn't create it. AI did.

No, AI assembled bits of code written by hundreds of programmers before you.


> No, you didn't create it. AI did. You only had a supporting role. You're Ringo Starr and the AI is John Lennon.

It's even worse than that: you're the Ringo Starr and the AI is a "John Lennon" who sucks and is boring and uncreative.


I disagree this is the source of the polarization. Maybe it's part of it.

I have been coding since about 1983 or so. I shipped high quality products that have been used by millions of people. From embedded software to desktop applications to distributed systems.

I don't think I'm in the "don't understand what code should look like camp" (I mean you never know but the evidence seems to show that I do know what I'm doing). I use AI as a tool and it helps me be more productive. I don't "one shot things that would take me days to do". I use it to help me automate things that I could do manually where it is faster and more effective. I review every step and if I don't like something I adjust. There are some specific situations where it basically does as good a job as I would do in running some experiments, doing some analysis or writing some small amount of code. I still know what the changes need to look like broadly, where to make them, and what patterns to follow. It just automates the work and sometimes does have some additional insight that can complement my views. Unlike me it is all knowing about everything in terms of access to "knowledge". It knows all the details of how a certain runtime manages memory, Linux internals and various open source software. I could go look it up myself (which I'd do before AI) but I don't hold it all in my head like AI basically does. It is also "all knowing" in the code base I work in (more so than me, it's a huge code base, I have an outline and a high level picture in my head but not every single code line) where again I can dive into the code but it helps me extract the relevant information faster.

I think the polarization is more about how you use the tool, what situations you use the tool for, and which domain you are operating in (languages, applications etc.). You can also one-shot simple tools and helpers that are not the production software, which is another way to accelerate your workflow.


I’ve seen people coding for 4 decades, thinking the same as you about themselves, and they were bad coders. Unfortunately, nobody can tell you whether you’re good or bad without seeing your code. Your claims mean nothing on the internet.

What about the business side of things that does not care for shiny code, but for shipping things to make money?

That simple arcade game (without in game transactions) needs to be fun, that website that needs to attract visitors (but not sell them anything or handle sensitive data)?

They don't care about abstract code quality, they care if it works and is useful.

So a good coder here means he or she could get to working results according to what the client wants fast. And those things likely make up the vast majority of written code. So no wonder AI gets adopted as it is a powerful tool here to be even faster.

Not all code runs in airplanes, handles financial transactions or sensitive user data - for this you need the best code possible and nothing vibe coded or quick and dirty hacks.

And oh wonder, it is possible to combine both. Because yes, websites often include financial transactions nowadays, but that part can and should be handled with care. People who move slow and check things. And then those who are quick to build things on top of it.

But I strongly object to dividing programmers absolutely in good and bad programmers, when the field is so big and the requirements not the same.

Some optimize in speed, some in quality. And yes, some are just bad in both. And some can do both - but they are very rare, in my experience.


> And some can do both - but they are very rare, in my experience.

As far as I know basically all of the successful software companies had these quite early. Of course, you need other kinds of people too. And not everybody needs to be like that. But you absolutely need those kinds of people.

But if you give me a few examples where this was not the case, and not recently, or during the dot com boom, where hype overwrote everything, then I’ll change my mind.


I'm just trying to provide useful context. I can claim anything and you don't know me either way. Last I checked other than your peers (which you can imagine I've had) there is no subjective "stamp" for good vs. bad. Many people who have shipped nothing and have no experience think they're great coders as well.

That's why they mentioned that their software was successful, it wasn't intended as idle bragging.

That can still easily mean that they didn’t give any value or minuscule amount of value to the project which caused their success.

The most successful projects which I’ve seen closely, all of them had only a few people who mattered, everybody else could be replaced at any given time basically, without a real impact. All of the failing ones were those in which those people didn’t exist, or were too few of them. This is exponentially more important in early phases of projects.


fwiw I wrote key features/pieces of pretty much everything I worked on. I was more or less the technical lead (and later engineering manager) on all the software I shipped.

It seems people really want to not believe AI can be useful for strong developers. That's fine. I don't really have a bone in this game, I'm anonymous here, and people can think whatever they choose to ;)


Anonymity is a two-way thing. Maybe I don't need to use "more or less". Maybe I'm an ultra heavy user of AI. It's even possible that I'm a magnitudes better developer in every single aspect.

Maybe not.

I'm just waiting for somebody to send me code which was generated by AI, not overwritten almost completely, and isn't shit. The funny thing is that some people here were so convinced that AI is great that they recorded how they work with AI. And two things:

- They were slower than manual copy pasting

- They still somehow introduced bugs, and very suboptimal solutions...

Also, it would be good if anybody could show me anything that shows how people suddenly became not terrible at code review. Because before AI, it was common knowledge that almost everybody was bad at it, and people rarely did it properly, because it was considered annoying, and not because they were useless. There were jokes about rewriting things for exactly the same reasons, and they were heavily based on reality. And suddenly, we pretend that this changed.

And somehow you would really need to convince me that the 100s of thousands of lines of code which I generated with AI in the past years are somehow better than they are. But I'm sure that it's easier to assume that I didn't try something than to show even once what "good" means in this case. Unfortunately, there is nothing here similar to, for example, "Groovy is a great programming language", which is a dead giveaway that whoever said it is not just a bad developer, but somebody whom I would fire immediately from every single project I'm related to. Especially if they are a tech lead. But there are such people, and some of them would claim the same thing as you. // Obviously interns and juniors can think whatever they want. They are labeled as such because they cannot know yet.


What I do at work is (obviously) not something I can share.

Groovy ;) You must love Jenkins. At least I can rest at ease you won't fire me for that infraction. I used C and C++ most of my career and more recently Go (which I really loved when it was created but my take is a bit more nuanced these days after seeing a really large code base evolve over years).

I'm confused though. You generated 100s of thousands of lines of code using AI and you think it's crap? The code AI generates for me is not some pinnacle of software engineering. It is repeating existing patterns or fairly simple concepts. I treat AI like a quick intern that scales infinitely (or a junior developer). And yes, juniors and interns don't write the best code but in many organizations there is still a fair amount of code written by them.

The thing is that in a large team/project (and the one I'm on has hundreds of developers of various skill levels) there's an endless backlog of things that can be improved including relatively easy features or refactoring. The constraints are either organizational or time. AI enables these things to get done with very little overhead so that's a net positive. It moves the needle for how much time/effort does it take to address "not that hard" issues and with proper prompting and examples it does a decent job. The bar isn't code that John Carmack would write in a week, the bar is improving a certain crappy area of the code to be more reliable or more performant or a little bit cleaner. This is life for most software projects. Yes, in a perfect world every software project is perfection. And maybe some organizations are able to approximate that.


> I really loved when it was created but my take is a bit more nuanced these days after seeing a really large code base evolve over years

At least, I can be sure that we are not near the same level. But at least, you hopefully will recognize the same thing with new languages… without seeing them failing first.

> You generated 100s of thousands of lines of code using AI and you think it's crap?

This is a funny question. First of all, there are people whose job is to test LLMs. However, I’m not one of them. I simply tried them, generated, and still generate a ton of code with them, then I rewrite basically every single line of it. Because they use, for example, outdated patterns, which cause the same problems as you’ve seen with Go.

> It is repeating existing patterns or fairly simple concepts.

Yes, and most of the most popular ones are mediocre at best. The average code from which LLMs are learning is made by beginners, not experienced developers, because of their sheer number. So LLMs will use those.

> I treat AI like a quick intern

This is always the funniest sentence regarding this. Before AI, it was quite well known that you don’t ever allow interns near important parts of the code. Now, people who are supposed to know this, and the reasons for it, have somehow forgotten this aspect too, just like the review thing.

> AI enables these things to get done with very little overhead so that's a net positive.

No, it allows you to tick off a ticket in Jira. And if you handle this in any other way, then you will fail miserably, as for example Microsoft quite openly did with this.

> a little bit cleaner

Ah yes, the infamous “cleaner”, about which the exact opposite is quite well known, and it’s quite obviously not true of every single vibe coded project, without exception. If that’s cleaner in any environment, then I have bad news: you’ve never worked with even medior developers, ever. Seriously, that code quality, especially architecturally, is junior level shit.

My previous boss picked these low hanging fruits; he at least would never say anything more than “it’s better than nothing”. And only regarding non-important code, which can fail without real consequences. And can be shit, obviously. The whole point was that even shit is better than nothing. Not that it’s acceptable quality in any way.

At least you made it obvious that your “success” is magnitudes different from mine. And not regarding code quality, but regarding when a project/product is successful. I completely forgot that I’ve met people who sold the fact that their teams completed the most tickets at a company in a given time frame as success. Probably we are closer than this, but still very far away.


Do you have an example of a large scale open source project where you would consider that entire code base to be high quality to your standards?

I think you're saying "my bar is so high you can't even understand where it is". I've worked with hundreds if not thousands of software developers, in many companies from startups to well established ones, including producing products that are what I'd call critical infrastructure that work reliably and do what they're supposed to do. I think I have a pretty decent idea of what an "average" software developer looks like and the overall shape of that curve, and similarly the architecture/design curve of various real world projects. I've built software on my own as a team of one and I've worked with teams of more than 100 people. Anyways, if your assertion is what I wrote above then clearly LLMs can't replace the mythical programmer that you are. But that's not what they're aiming to replace. As to "vibe coded projects" I already said that's not how I use LLMs and I agree that can easily end up like a pile of garbage (but still has its place in the new ecosystem).

The only real test of software is whether it does what it's supposed to do: reliably, is maintainable, can be extended and evolve without losing these attributes. If you've shipped systems that are used by many, work well, can evolve to support new features etc. - kudos to you.


It really depends. If you're cranking out prototypes or testing ideas, it's genuinely great. But if you're familiar with the code it's very easy to spot its (many) mistakes. It's Gell-Mann amnesia.

Then again, I just caught Claude writing setTransparent(!opaque == false), opaque being a bool, on a purely vibecoded project. Which was pretty impressive. ("• You're right, that's nonsense.")
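To spell out why that expression is nonsense: assuming C-family precedence, where `!` binds tighter than `==` (the comment above does not name the language), `!opaque == false` means `(!opaque) == false`, which is just `opaque`. A quick truth-table check, sketched in Python with a hypothetical helper name:

```python
def original_argument(opaque: bool) -> bool:
    # Mirrors `!opaque == false` as a C-family parser would read it:
    # (!opaque) == false
    return (not opaque) == False  # noqa: E712 (kept verbose on purpose)


# Exhaustive check over both possible inputs.
for opaque in (False, True):
    assert original_argument(opaque) == opaque
```

So the whole argument collapses to the variable itself, with a negation and a comparison that cancel each other out.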


Is this really a split that exists?

In my case I see Claude produce code much worse than I would, but it's certainly much quicker and, even after reworking, it makes me finish tasks in less time.


It isn't though.

The industry largely has selected for camp 1 long ago.

If you don't get immediate negative feedback camp 1 can go quite a ways before problems surface.


Camp 1 is winning because we did it. We built an artificial brain. Frontier models can think, reason, and produce code better than the average human programmer. (You have not met many actual programmers slaving away on enterprise code bases if you do not understand that this is the case; the self-selecting HN crowd does not represent the profession as a whole by a long shot.) It's just a matter of, how committed are you (is your organization) to really learning and leveraging the tools?

I think this is a straw man.

> the polarized differences in opinions on HN about the quality of LLM-produced code

Are there strong differences of opinion about the quality? I've seen very few people claim that LLMs write better code than they do.

> one shots things it would take me days to do, and has made me 10X more productive

This is an entirely different claim from the former, and you're conflating them.

The boost from LLM-assisted code isn't _expertise_, it's the power of having an always-on team of reasonable junior developers from every discipline you can possibly imagine willing to do your whim.

Take for example Jesse Vincent / obra[0], who is an exceptional developer, with great taste, and a stack of well-received open-source software to his name. He posts a lot on how he's being made more productive by AI-assisted development. Do you have bad news for him about the quality of his work...?

0: https://en.wikipedia.org/wiki/Jesse_Vincent


There's a third camp between these extremes who is like "goddamn it just type this shit out for me so I don't have to do it myself".

Definitely the space I'm in, where I know exactly what I need, exactly how to implement it, just saves time typing 100s of lines.

If you’re typing 100s of lines, you’re already doing it wrong. My most used operation is completion, just before copy-paste. The reason I like vim is that it makes such operations so fast. And the reason I like emacs is that it has superpowered versions of those operations.

Yes, the third camp and probably the most effective is to do a decent amount of writing yourself and use the LLMs as codegen machines, but where the DSL is natural language. Deepseek v4 flash is an incredible model for this, you can actually get into flow state as you write code and then delegate boring code to the magic autocompletion machine to autocomplete.

The better workflow, and I think the one adopted by people in the second group, is to take a step back from coding, do a bit of thinking about the domain, design a better abstraction for the problem (architecture, data structure, algorithms), and then write the small amount of code you probably need.

Code should grow according to need, not for its own sake. Start small, use it in the real world, and then improve it.


I agree with that, but there still is some code that eventually needs to be written and there is a subset of that code which can be generated. I think it depends on the domain as well - for example, UI code is trivially generatable by LLMs.

RAD tools like Delphi, Qt Creator, Glade, Android Studio, Xcode’s Interface Builder, … have always made it trivial to generate UI code. And there are libraries for other concerns.

The majority of a project’s code is written at the beginning or when a major feature is introduced. The daily work is mostly tweaking. And you can’t tweak without a good understanding of the module.


The difference between prose, art and code is that we can define "good code" with deterministic tools. Not perfectly, but to a large degree.
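As one small example of what a deterministic check can look like, the sketch below counts branching constructs per function using Python's `ast` module. This is a rough stand-in for the complexity metrics that real linters compute, not a definition of good code; the function name `branch_counts` and the choice of node types are assumptions made for illustration.

```python
import ast

# Node types treated as "branches" here; the selection is illustrative.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp, ast.IfExp)


def branch_counts(source: str) -> dict[str, int]:
    """Map each function name to the number of branching nodes inside it."""
    tree = ast.parse(source)
    counts = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # ast.walk includes nested functions, so their branches
            # also count toward the enclosing function.
            counts[node.name] = sum(
                isinstance(child, BRANCH_NODES) for child in ast.walk(node)
            )
    return counts
```

The same input always yields the same numbers, so a threshold on them can gate a build, though picking the threshold is still a judgment call.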

I, for one, welcome our new AI overlords. They provide me with only the finest Gell-Mann amnesia, straight from the tap.

Eric S. Raymond has basically stopped writing code by hand altogether. He consistently delivers high quality code without intervening to fix the LLM's output himself, much faster than he would have been able to alone. This is very bad news for camp 2 because it means one of three things:

1) he is extraordinarily lucky

2) he is extraordinarily brilliant at manipulating LLMs

3) you really are "holding it wrong" and you are hobbling yourself with your failure to properly learn the tools

The first two seem rather unlikely.


I'm very confused by this logic. Why should I care about his output compared to what I observe from the larger group if he's not an outlier, and if he is an outlier, why would the second one be unlikely? The only way I can make sense of this is if you're claiming that he's both an exceptional coder and that skill in coding by hand is completely uncorrelated to skill in using LLMs to code, and it's not clear to me why that would be more likely than either or both of those being false.

"Better and faster than ESR" is not a particularly high bar to clear.

I'd need to see some transcripts of his conversations with coding agents, to believe this.

4) ESR is in the first group (most likely option)

If ESR is consistently delivering high quality code now, it would be a first.

Being nerd-famous does not mean one is a good coder.

I partly write words for a living. Claude is really really bad at writing prose that doesn’t make me want to vomit.

I rarely write code, and only once for a living. But I feel like I’m a superhuman and one step away from being a zillionaire when Claude gives me a bunch of code it has written in seconds. I WILL CHANGE THE WORLD!!

And then I remember that Claude can’t write words that don’t make me want to break things and I’m good at writing words but bad at writing code.

So then I delete the code and go back to doing more profitable things than being the next zuckerfuck.


I’ll take “what is Dunning–Kruger” for $200 Alex.

I don’t understand this comment. It might be a pop-culture reference that has gone over my head.

But I think (I think?!) you might have misinterpreted my comment, because I’m saying “yes, I Dunning-Kruger myself regularly with AI! BUT then I wake up from the psychosis and realise what an idiot I am and go back to my normal life”.

AI makes everyone think they are brilliant. The skill is recognising when you’re just another idiot.


I don't disagree about the probability, but the current frontier models are not completely useless for writing even in areas where I have significant knowledge. I would not have said that a year ago. You have to watch them like a hawk -- they are good at spitting out plausible sounding nonsense that is hard even for an expert to discern. But the dice roll going on behind the scenes is continually more biased towards being correct/useful than not.

On factual things, potentially. But if I want to read your writing, wouldn't I be trying to pick your brain? Otherwise why don't I read wikipedia or usage documentation?

Honestly, I can't fathom thinking that LLM writing is even remotely passable. People that think this should honestly read more. One book a month is hardly an aspirational goal. You don't even have to read Melville or Hemingway or Chaucer or Shakespeare, just pick up any popular NYT best seller, and it'll be significantly better than anything an LLM can generate.

I haven't used these things for writing recreationally for a while (since the Claude 3.X days), so my opinions might be outdated - but they definitely weren't bad - after all they had a huge library of witticisms to pull from, and like Stable Diffusion that pulls from master artists, so do LLMs from skilled writers. Pro writers did come up with an absolute dearth of interesting ideas, and there are mountains of skillfully written prose out there - and it's all in the training data, and AI is quite good at pulling from it.

The advantage of writing vs images is that it takes longer to absorb the whole with text, so it's less apparent that the whole thing doesn't quite come together.

My problem with Claude's prose and ideas was that it kept recycling the same tropes and phrases after a while. It has been observed that these models have very strong statistical biases: when asked for a random number, for example, LLMs are far more predictable than even humans, and this shows up in unguided writing exercises.

But as for actually crafting text that is both terse and to the point - such as oneliner explanations, or writing summaries - these models are quite bad. The best I have seen is they could turn a given length of prose into an even longer version - with generally some loss in the tonal accuracy or the points made in there.

As such they are a terrible tool for professional communication, but unfortunately, lots of people have started using them for exactly that.


Um. Perhaps 'pro writers did come up with an absolute crapload of interesting ideas' would be better writing than 'dearth', which means scarcity and famine?

I get that it sounds clever but that's the damn problem!


Depends on the type of writing. Blogs and the like? LLM generates prose that, to me anyway, is unbearable.

However, in fiction I’ve found it a useful collaborator. There have been more than a few occasions when, given some notes of how I want a character’s arc to develop in a particular scene, that the LLM gives some excellent pointers and ‘new’ ideas I hadn’t considered.

As far as editing my prose, I use it as a ‘thesaurus of phrases.’ When lazy, I can give it a rough sketch of the paragraph, giving it the gist of what I want, and have it generate a dozen or so versions. I usually can find nuggets of good phrases therein that are useable… much as I would refer to Roget’s to find a more precise word.

That said, one has to resist the temptation of using a chunk of generated text verbatim; no matter how good it sounds in isolation, the repetitive grammatical structure and other LLM-smells add up quick and become nauseatingly obvious if used frequently.

In any case, I think LLMs get a bad rap for writing… when used correctly, they are incredibly useful. And, it’s tiresome to hear pretentious snobs assume that an author who uses an LLM simply lacks the taste to appreciate how bad the prose sounds. Not true in all cases.


> I can't fathom thinking that LLM writing is even remotely passable. People that think this should honestly read more.

This makes me think you're only exposing yourself to high quality writing online and from an intelligent circle of friends and coworkers. The average person's reading and writing abilities are _atrocious_ and only getting worse. We're almost at the point where kids are communicating through abbreviations and emojis exclusively. LLM prose is significantly better than what the average person can produce.


Someone way more eloquent than me should write a column titled "Why do we read?"

Way back in the past (around 30 years ago) I remember reading an article on "how to read a book" or a similar subject. They argued that you should not skip the acknowledgments, preface and other "personal" related sections of a book, because it was there where you got a glimpse of the person that was writing the book. The idea being that you should have in mind that the person writing was explaining something to you.

Carl Sagan even has a video where he argues Books/Writing is some sort of communication through time.

Now, this has been the case historically: A person writes some text (even in botched language like my writing, as English is not my first language) with the thought that someone else in the future will read the ideas and reason about them.

But what about text written by an LLM? Does it have inherent intention? When reading LLM text, it feels like looking at those "this is not a person" photos. Yeah, they are words, yeah they form sentences and paragraphs but... they lack "soul".


It's not "Why do we read?" but something related that is coming up a lot in my thinking lately is Walter J. Ong's "Writing is a Technology that Restructures Human Thought".

Isn’t “Writing is a Technology that Restructures Human Thought” another way of saying that “feedback has an effect”?

If so, this seems to be a trivial (still worthy) assertion.

For example, I intend to, say, construct a shed. I make mistakes that I only see because I actually constructed. I revise future endeavours involving sheds.

I admit to not having read this piece, and am merely reacting to the title.

—-

Okay, I got through the first paragraph of Walter’s writings. While I nod to the bitterness (I assent to the existence of it), I do not bow.


Do you think the first paragraph is enough of a basis to form an opinion from?

Not normally, no. Can you point to a divergence of the bitterness in the subsequent text?

What I find to be the normal pattern (by intuition) is that the condensed leading text belies the expansive following text. This is likely lazy (a shortcut) and I am open to correction at your effort. If a call to your effort (I apologize) is unpalatable then I concede.


> Way back in the past (around 30 years ago) I remember reading an article on "how to read a book" or a similar subject. They argued that you should not skip the acknowledgments, preface and other "personal" related sections of a book, because it was there where you got a glimpse of the person that was writing the book. The idea being that you should have in mind that the person writing was explaining something to you

Maybe? That is one reason to read, but there are a lot of other reasons, too. It doesn't mean you are doing it wrong if you want to read something and don't care at all about the person who wrote it.


Yeah, but when we talk about food, there are different tastes, and there is stuff like "you can also use it as a doorstop". Fine, but that doesn't make a doorstop food.

Are we also saying it's acceptable to feed people junk because it's better than what they would cook?

At some point you're just making bad excuses for false scarcity.


I think it's both true that most LLM writing ("writing") sucks and that it's better than what a lot of people can produce unassisted. Which to me doesn't mean that we should roll over and accept LLM output as a lesser evil... it just means that the bar is so low it might as well be in hell, and rapidly getting lower :')

It's nowhere close to good writing, but it's better than the dreck many self-published writers produce and sell - successfully.

But that's the real problem with LLMs. Culture is aspirational. It has a consistent goal - find the best, highlight it and distribute it so others can build on it.

LLMs are the opposite - produce as much of everything as possible at the lowest possible barely-acceptable-if-you're-lucky quality.

This was already a problem before LLMs. Mass market content farms - Kindle Unlimited, Wattpad, Spotify, social media in general - give everyone an equal voice, with mass popularity and "likes" as the only metric.

Now LLMs are automating the creation process, so everyone gets more of everything.

Except inspiration. Not so much of the "That's astounding, I wonder if I can learn from that and reach for something in that league."


It’s acceptable for someone to buy a ready meal or takeout if it’s better than what they can cook. Why wouldn’t it be? Is that the greatest choice for their personal development? Probably not, but life is complex and folk have limited capability and bandwidth for acquiring skills.

They weren't saying it is acceptable, or making excuses, just stating how things are. Average reading and writing abilities seem to be dropping noticeably in many circles. Probably as a consequence of falling attention spans rather than an issue in its own right.

Tell me your thoughts on the quality of LLM-generated code. I've never understood this attitude where people are absolutely disgusted by the slightest whiff of AI prose but will happily slurp up AI-generated code by the bucketful and proudly proclaim that it's OK because it's better than the average developer can produce.

The key difference is that code is not the end product, but writing is itself the product. (No one's doing "vibe-product-management" for example.) Tbh, I still think code can have a beauty and elegance to it (like a logical proof can, or like a mathematical theorem can), but there's a difference between the two and I'm way less forgiving of AI writing than I am of AI code, especially considering most code (by line count) is just boilerplate anyway.

> The key difference is that code is not the end product

I think this is open to debate. To me, the code has always been the goal, and the fact that writing it sometimes serves to produce a product is important to others (and what brings the paychecks in), but ultimately not something I've ever been excited about or interested in throughout my career. So I judge a developer based on the beauty and quality of the code he produces, just as I judge an LLM by the same sorts of things.

The fact that AI can one-shot a working CRUD app is not really that interesting to me. If it could make the code beautiful, concise, maintainable, extensible, minimal, performant, readable, and bug-free: a work of art and love that a craftsman would be proud of... that would impress me.


Imo, this is like saying "I judge a carpenter based on how straight they can cut a piece of plywood." Or like saying "I judge an artist on how accurately they can draw a circle by hand."

I mean that's certainly one way of looking at it, and both can be impressive technical feats. But most people judge carpenters and artists on their end products, their overall vision, their motifs, their philosophy, and so on. On the other hand, as a trained logician, I definitely see proofs (which, by the Curry–Howard isomorphism, are computer programs) have some degree of beauty-within-themselves, but that's quite hard to achieve. Not everyone is a Gödel, after all.

I also think programming languages, despite being Turing complete (which is frankly not saying much), are far too limiting to truly construct magnificent things with.


No, it's more like saying "I judge an artist on my terms regardless of how well they sell on the market".

> artists on their end products, their overall vision, their motifs, their philosophy, and so on

The main output of a programmer's work is their understanding of the system they work with, the rest comes from that. Behind the code there's its author's intention, vision, their tastes, philosophy and experience that makes them tackle problems in specific ways. Code review is, aside from quality assurance, mostly about communication between people, convincing them of your ways of doing things (or getting convinced by others) and communicating needs. It's what keeps projects running and what makes people improve their skills.

You don't need to see magnificence in code to realize that there's more to it than just the syntax tree to compile.


> No, it's more like saying "I judge an artist on my terms regardless of how well they sell on the market".

I feel like I need to push back here, because some of the best programmers around: Carmack, Torvalds, Jonathan Blow, even folks that make programming languages like K&R, Rob Pike, etc. are judged on their respective end products, not on minutiae found in code reviews. For example, if I asked you "why do you think Stroustrup is a good programmer?", you wouldn't cite some obscure optimization he came up with, but would rather talk about his overall vision for C++, his ideas of evolving C, his staunch anti-GC takes over the years (and their justification), etc.


You're contradicting yourself. First you say that they're judged on the end product, then you mention things that are very clearly not end products but thoughts and visions behind them that only lead to end products.

Frankly, I have no real idea of how good Carmack, Torvalds or Blow are as programmers, I have never worked with them so I don't really have a way to tell (even though I do contribute to Linux and I've seen some of their code). They're likely past a certain above-average threshold, but they haven't got famous for their programming skills.

That said, if you think Torvalds isn't being judged on "minutiae found in code reviews", I'm not sure your take is very serious in the first place - that's the main thing he was being judged on for decades now :)


> You're contradicting yourself

How?

> you mention things that are very clearly not end products but thoughts and visions behind them that only lead to end products

Thoughts and visions are much more closely intertwined with end products (in fact, likely supersede them) than some random code review is, so I'm not seeing where the contradiction lies.

> that's the main thing he was being judged on for decades now

Linus hasn't written any code[1] in at least half a decade+. To argue that he's being judged on his code misunderstands why Linux became so popular to begin with.

[1] https://linux.slashdot.org/story/20/07/03/2133201/linus-torv...


Either I'm bad at communicating today or you're bad at reading, because you're now using my points, so I'm not sure what to make out of it. Let me repeat myself then:

> Code review is (...) mostly about communication between people, convincing them of your ways of doing things (or getting convinced by others) and communicating needs. It's what keeps projects running and what makes people improve their skills.

The way he does that is exactly what most news stories about Torvalds have been focusing on for many years now. In practice, unless you run a project alone, code review is where thoughts and visions surface up the most. Or, well, should be - not everyone is good at it.

(that said, even though my point is that he's obviously not being judged on his code, you can easily find code that he wrote as late as this month, so your statement is clearly wrong even if that doesn't really influence the discussion here - code review is still the vast majority of his job, just like he stated there under your link)


> Either I'm bad at communicating today or you're bad at reading

Could be both :)

The way I look at it is like this, and you could call this my thesis: I do not categorically think that code in itself is primarily relevant to us looking at a "software engineer" and saying "wow, she's good." The product (the Linux kernel, in Torvalds' case) is, on the other hand, what actually matters. I think we're getting caught up on the idea of a code review; a code review can serve many purposes, as a code review is basically just people talking about the code, the product, their feelings, and so on. Sure, sometimes it's like "this `i` should be a `j`", but other times it's "this should serve feature X, not feature Y."

Overall, I don't think Torvalds is judged by his code quality. And the snippet I cited is the man himself saying "I don't write code anymore" so I took that at face value, even though my conviction stands whether or not he actually does still write code. I don't think anyone actually cared that much about his code quality (maybe with the caveat that the kernel didn't crash).

PS: I could be totally wrong, and this is an interesting & stimulating conversation, regardless.


What I'm trying to get at is more like: I judge a carpenter based on how beautiful, minimal, and functional he makes a chest of drawers, not based on how quickly he can go to market with particle board and glue.

I'm not sure if your question is serious, but I've been a developer for over a decade now.

I write code for a living mostly by hand. In the odd case where I need help I still use google like I always have. I spend more of my time in meetings or staring at the ceiling than writing code. This was also true a decade ago before LLMs. It was also true several decades ago when someone else's ass was in my seat.


...or read.

At least in the USA: 21% of adults in the US are illiterate in 2024. 54% of adults have a literacy below a 6th-grade level [1].

1: https://www.thenationalliteracyinstitute.com/2024-2025-liter...


In international comparisons, the USA comes out somewhere in the middle of developed countries - as it always did. And the differences between countries are not that large.

That 54% of adults have a literacy below a 6th-grade level is simply not some kind of catastrophe, and it does not mean those people can't read. There is this idea that 6th-grade level is almost like not knowing how to read, but that is simply not the case at all.


Hasn't this always been an intelligence thing? I think across all societies and eras we find that generally a rather alarmingly large section of the population is unable to grasp basic written instructions - and that section usually increases to the majority of people, when we start getting into things like an employment contract, or mortgage document.

This is just the last domain people can desperately cling to because style is totally subjective.

Really hard to take your comment seriously when the only post on dvt.name is a hello world page, because at least OP is trying to publish and you are lacking the moral high ground to judge him for thinking LLM writing is good.

Oh if I had a nickel for every web domain I bought and put a hello-world.html into s3 and never checked again ...

FWIW, I'm with GP. It's quite easy to get just mind-numbingly tired reading beyond the first two sentences of a typical LLM output, let alone on something I'm familiar with.


It's on dvt's about page on HN, so hardly something hidden. People are different, and in the blog post itself the author writes that in time he became tired of the way LLM writes.

I'm trying to playfully divert away from the captious and unhelpful comment, but if you want to double down, that's ok too. Cheers, my dude, have a good Thursday.

Sure whatever, why even bother commenting if you didn't want to engage then. I don't owe you anything just because you were trying to cheerfully diverge.

Same to you though, have a nice day


Lol my blog was hacked recently and I've been lazy about moving my backed-up mySQL DB to the new WP installation. Not sure where moral high ground enters the picture. If I really wanted to be an asshole, I'd cite a book I co-wrote and another I edited.

> Honestly, I can't fathom thinking that LLM writing is even remotely passable. People that think this should honestly read more.

How do you think the author of the page would read this? That sounded pretty asshole-like to me. If it's not for you I'm really sorry for you, you must have to endure really screwed up people.


Maybe you're right and I was a bit too snarky, apologies to the author if he/she was offended. Writing anything implies some vulnerability, and criticism should be constructive.

I know, and we've all been there. It's comfortable to criticize; it was only when I had a very divisive publication hit the front page that I saw how hurtful dismissive or sarcastic comments can be (https://news.ycombinator.com/item?id=45277346)

And sorry about your blog :/ didn't know it was hacked. Looking at the comment section of the hello world though it gets pretty obvious LOL. You should consider removing it from your HN about though.


I dabble in drawing and I find LLM images (and maybe some non-LLM ones) abhorrent. As for why, the reasons I can think of are: no consistency (perspective, small details, and color theory) and too much detail, making it visual noise. In most paintings, the artist will have a subject that is most detailed (to draw the eye), and from there the loss of detail will follow some kind of logic. This is how you pinpoint what the artist is most interested in. An LLM image looks like a filter applied to a montage of pictures.

It's like a gross looking slice of pizza, it's mindbending because at first it looks good, after all it's pizza, but something in it makes it really disgusting

Maybe we're looking at different pizza slices, but all I see is bread, tomato sauce and cheese, and it all looks delicious.


> A general pattern for LLMs is that they look really good at things you are bad at.

Naah I disagree with this. I think LLMs are good at gas-lighting you into thinking that good writing only comes in one flavor. And LLMs prefer a very "textbook/technical-manual" coded flavor of writing because maybe that way they are more useful to us humans. But human writing is not just about crafting the most elegant sentences. Sometimes great writing is just this doggo-drawing meme:

https://knowyourmeme.com/photos/2160304-the-winner-of-this-c...


> What that means is that if you find yourself thinking of its output as significantly better than yours in a particular domain, there's a high chance that you are not equipped to judge that quality effectively.

This is why code generation is a disaster waiting to happen. Hundreds of thousands of "programmers" with no idea of what they are pushing to production.


You can triangulate. Ask it the same thing in different ways and with different LLMs. Operate in domains where the output is verifiable, like in the sciences but in terms of numerical computing. Study the output, graph it, learn it, reason with it, rinse, and repeat until your mental model makes practical sense.

That's because LLM output is "average"; so if you're below, it will obviously look better than what you can do, and vice-versa. It will be interesting to see what happens when current LLM output becomes the bottom, as everyone worse has pulled themselves up to that level.

The other day there was a Hacker News comment about AI-generated music, and this poster claimed that a friend generated AI music and got as much enjoyment from it as from actual pieces composed by musicians. I suppose this falls under the same camp.

So what does this mean in practice, though?

Let's say you are correct.

You ask an LLM to write something for you, and to you it looks really, really good. So based on your conjecture, that means I am not a very good writer.

Ok, but how does that change what I should do? If I am not a very good writer, that means an LLM IS actually better than me, even if it might not be objectively good to an expert writer.

My two choices are to keep producing my own crappy writing, or use an LLM to create better (but not great) writing.

Wouldn't it make sense to use an LLM?

It seems to me your premise leads you to the same conclusion you would reach even if your premise was false; if me thinking an LLM is good at a task means I am very bad at that task, I am probably better off having an LLM do it. On the other hand, if you are wrong, and I think an LLM is good at something because it actually IS good at that thing, then I should also use the LLM to do the task.

Either way, the LLM is better than me at the thing.


Well, no. The LLM is just better than you in a narrow band that appears wider to someone below than above.

From above, or from below with adequate exposure, it feels facile and hollow. It is good at weaving grammatical structures. It is not good at thinking in words in a way that invites a fellow human along for the journey. Because it doesn't think.


Well, practice to get better at writing (and, therefore, judging writing) yourself. It seems obvious. Your skills are not frozen in time and set in stone.

> My two choices are to keep producing my own crappy writing, or use an LLM to create better (but not great) writing

Are you also an LLM or do you have the capacity to learn and grow?


Well, the third choice is to develop as a person and become better at writing. Which you do by doing some crappy writing and learning from it.

>If I am not a very good writer, that means an LLM IS actually better than me

If you're not a very good writer, I'll at least skim your work to see if it contains any good ideas. If it's slop, I'll just close the tab. You already told me it's not worth caring about, so I'll agree with your decision.


I mean, you have the ability to learn to do stuff better to a certain extent, so it's not like your only choices are "suffer through the writing I'm capable of producing today for the rest of my life" or "give up on ever writing anything myself". Writing stuff yourself is pretty much a requirement of getting better at it, and arguably even if you do intend to use LLMs to supplement it, having a better baseline will be valuable for additional iteration with the LLM.

This is true, but what is also true is that with each new generation of models (and not just for code generation) it becomes less and less true.

IMO LLM writing hasn't significantly improved since maybe GPT4. It still does the exact same "It's not x, it's actually y" tropes and many of the other common LLM smells. Most LLM generated text is immediately discernible as such.

You can avoid the smells with a prompt. I have a benchmark involving short story writing within specific styles and the level of sophistication achievable is increasing over time, in my opinion.

Mnemonic: geLL-Mann amnesia effect

Cuts very close to the Dunning-Kruger effect.

It's basically just another instance of Gell-Mann Amnesia. Ask an LLM to discuss a topic you are an expert on, and you will realise it is full of errors, but ask it to discuss a topic you know nothing about and you will, mysteriously, assume it is very intelligent and correct.

> Because Trump is a new, and (hopefully!) a one time phenomenon.

Trump is already, on his own, a two-time phenomenon. Leaving aside broader cultural issues and patterns, "one-time only" has been clearly incorrect for a while.


> There will be new value created by these models which people are happy to pay for which simply did not exist at all before

What sort of new value, and why will people pay for it from someone else rather than prompting for it themselves?


1. When and how would gcc go down?

2. How often do you think that happens, compared to Claude?


You can use a local model, which will go down exactly as often as gcc will. We may still have hopeful notions of being able to understand the codebase, but the reality seems to be that the codebases we don't understand will be the ones that will win out in the market, because they'll be cheaper while still only having about as many bugs as they had when people wrote them.

We're explicitly not talking about local models here; we're talking about Claude.

Because you're better able to take over the codebase a local model wrote than one Claude wrote? The original question was about taking over an LLM-written codebase, it doesn't sound to me like the argument was about a codebase that Claude, specifically, wrote.

The original question is:

> What happens when you have a codebase made with claude using this setup and claude is down for let's say 8 hours?

So: - A codebase made with Claude - Using this [Claude] setup - Claude is down


What does it matter what the codebase is made with? If Claude is down, use Codex, or Gemini, or Deepseek. That version of the argument is just way too easy to counter.

Brother, look at the first comment in the chain you replied to. It very specifically was about Claude.

Well, in that case, it's also very specifically about this guy's codebase, so none of us can really say anything on this.

The author's voice in that comment doesn't sound like their writing in the article(s).

Maybe they're being sincere, and they just edited it so heavily with AI that it comes out sounding generated, but that's a distinction without a difference.


Humans do it sometimes, for effect. Not all the time, giving every phrase the same impact.

Grimm's Fairy Tales also have had an important impact on your culture.

No one is asking you to believe in anything, but it's self-limiting to refuse to engage with works of historical/cultural importance.


You’re projecting or misinterpreting my comments. I didn’t say anything regarding the content I engage with.

However I reject the idea that engaging with religious texts is insightful and something to promote

