
There is always a chance that the LLM will hallucinate something wrong. It's all probabilities, quite possibly the closest thing to quantum mechanics in action that we have at the macro level. The act of receiving information from an LLM collapses its state, which was heretofore unknown.

However, your actions can certainly influence those probabilities.

> If asked properly, LLMs can be used to poke holes in an existing reasoning or come up with new ideas or things to explore.

At the most basic level, LLMs are prediction engines, and one of the things they really, really want (OK, they don't "want", but one of the things they are primed to do) is to respond with what they have predicted you want to see.

Embedding assertions in your prompt is either the worst thing you can do, or the best thing you can do, depending on the assertions. The engine will typically work really hard to generate a response that makes your assertion true.

This is one reason why lawyers keep getting dinged by judges for citations made up from whole cloth. "Find citations that show X" is a command with an embedded assertion. Not knowing any better, the LLM believes (to the extent such a thing is possible) that the assertion you made is true, and attempts to comply, making up shit as it goes if necessary.
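To make that concrete, here is a rough sketch of the two framings (ask_llm is a made-up stand-in for whatever client you actually use, not a real API; X is whatever claim you are trying to support):

    def ask_llm(prompt: str) -> str:
        ...  # call whatever model you actually use here

    # Embedded assertion: presumes citations supporting X exist, so the
    # engine is primed to produce some, real or invented.
    leading = "Find citations that show X."

    # Neutral framing: leaves room for the answer "there aren't any."
    neutral = ("Are there cases addressing X? "
               "If none support that position, say so.")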


I find this extremely surprising.

When you think of everything it takes for an AI to use what the article calls a "vision agent", it seems as if using a purpose-made API ought to be MANY orders of magnitude faster.


That's fine, but you know you have to concatenate them and sell them as one unit, right?

So, the craftsperson looks at actual artifacts, but the engineer reads specs and drawings.

> but those are often just spells the software magician invokes after reading the ... documentation.

Oh, but the craftsperson does read shit after all.

Do you even read what you, yourself write?


I like the idea of being able to merge a PR that is a partial solution, while keeping the issue open to reflect that it is only partially done. It kinda makes sense to do this in a single action.

Also:

> If [a person is not suitable to make the decision of whether the PR should be approved] then the person should remove themselves from the list of reviewers.

This doesn't reflect what sometimes happens in real life. Someone could have sufficient specialized knowledge to veto a PR without having sufficient broader knowledge to approve it. That person should definitely stay on the reviewer list, with the ability to veto and the obligation to state whether or not he vetoes, but without the ability to definitively approve.

The specialist should have to record "I have finished examining this PR, and there is nothing within my expertise that would cause me to veto it" before the PR is advanced.

Unfortunately, in a binary system, that often equates to him having to say "I approve" even though this does not truly capture the intent. Then you wind up with hacky work-arounds, like requiring a minimum number of approvals.
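Roughly the kind of model I have in mind (names are made up for illustration, not any real review tool's API): a specialist can veto or record "no objection from my side", but advancing the PR still needs an approval from someone with the broader view.

    from dataclasses import dataclass
    from enum import Enum, auto

    class Verdict(Enum):
        PENDING = auto()
        VETO = auto()
        NO_OBJECTION = auto()  # "nothing within my expertise blocks this"
        APPROVE = auto()

    @dataclass
    class Reviewer:
        name: str
        can_approve: bool  # has the broad knowledge needed to approve
        verdict: Verdict = Verdict.PENDING

    def pr_can_advance(reviewers: list[Reviewer]) -> bool:
        if any(r.verdict is Verdict.VETO for r in reviewers):
            return False  # any veto blocks outright
        if any(r.verdict is Verdict.PENDING for r in reviewers):
            return False  # silence is not sign-off
        # at least one reviewer with broad approval rights must approve
        return any(r.can_approve and r.verdict is Verdict.APPROVE
                   for r in reviewers)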


> I've never seen a legitimate business not give refunds for technical errors of their own fault.

Granted, the reply was very much weasel words.

Nonetheless, I read it as they were issuing a refund ("Let me look up your account information to help process your refund request."), but couldn't offer compensation for pain, suffering, loss of use, tracking down the bug, etc.

I could be wrong, of course, precisely because it was (probably AI-generated) weasel words.


> averaging their announced results.

Obligatory XKCD: https://xkcd.com/937/


> In any well run organization you have multiple layers of controls.

Everything depends on size.

A business with 8 employees might need 3 of them to be (literal) keyholders, and might be situated such that any of the keyholders has it in their power to destroy the business.

This is not ideal, obviously, but it is how the world has worked for a very long time, and it is difficult to understand how to make it better in some cases. Modern technology, such as cameras, might help, or might simply help to allocate blame after destruction has occurred.

In any case, this is the background of how people are used to working. We all deal with people who can absolutely destroy us, starting with the cop on the corner.

And we have mechanisms, both before-the-fact, like social coercion, and after-the-fact, like the legal system, to help ensure that this usually works.

LLMs exist in a world where most people are used to extending trust, but it isn't possible for LLMs to conform to the historical expectations that underpin that trust.


I thought it was a cross between a camera and a bomber.


This stupidity might go a long way towards explaining the relentless push towards apps.
