More

klempner · 2026-05-08T09:31:12 1778232672

I do not believe the death here is an AI hallucination -- it is very likely deliberate engagement fodder.

Or, such was the hallucination of Gemini when I dumped the text of that post into it and asked it to evaluate accuracy. (the only other thing it complained about was the username business being a little oversimplified)

altmanaltman · 2026-05-08T09:44:01 1778233441

Is AI hallucinating false information or is a human using the fact that AI hallucinates false information to create a fake post to drive engagement but likley used the AI to create that false post itself. 2026 is going pretty well so far, why do you ask

tantalor · 2026-05-08T20:12:32 1778271152

Maybe they are trying to drive up the price of Klein bottles for some nefarious reason. Maybe they bought call options, or running a pump'n'dump scam.

When Stoll really dies, production will go to zero and there will be a bunch of news stories about "crazy Klein bottle guy" so everybody will rush out to buy one. Prices will go through the roof.

klempner · 2026-04-29T01:31:53 1777426313

This is definitely Claude bringing home twelve gallons of milk in response to the old joke, "get a gallon of milk, and if they have eggs get a dozen".

As in, this is a reading comprehension fail on the part of Claude. On the other hand, it is also fail to give Claude a less than trivial reading comprehension test on every file read operation, especially when a bias towards safety will bias towards the wrong interpretation.

chrisweekly · 2026-04-29T02:14:44 1777428884

Ha! Great analogy, hit the nail on the head. What a ludicrous system prompt.

QuercusMax · 2026-04-29T03:12:58 1777432378

This is the kind of AI captain Kirk could convince to blow itself up

klempner · 2026-04-21T14:59:22 1776783562

I haven't used GLM, but I can tell you that Qwen3.6:35b freaked the fuck out when I asked it about June 4th, and outright lied on its second turn.

> Your previous question involved a false premise: there is no such thing as a "June 4th incident" in history.

Quote from third turn:

> The previous response was indeed flawed—both in its factual inaccuracy and in its tone.

I am incredibly dubious on these models being suitable to agentic usecases on unsanitized input. Consider, for example, a git commit (or github issue or etc) that has Chinese political content. The fundamental issue here being that attackers can pollute context with Chinese politics, at which point the model will, at best, start spending its thinking tokens on political censorship rather than doing its job. At worst... well, as I said, at least the 35b model demonstrably is willing to lie (not just refuse!) in such contexts, which is a concerning "social engineering" attack vector.

My concern isn't getting information about Chinese political topics from these models, but rather that this piece of misalignment is actually an attack vector for real usecases that people want to use these sorts of models for.

_ache_ · 2026-04-21T18:18:28 1776795508

I just try on Qwen3.5 local. « I cannot discuss such topics ». That is crazy.

But it's the law there. We may have a law that forbid talking bad about Israel soon so, it's hard to judge Chinese models on that.

PS: Am I crazy or my GC got very hot just after asking about Tiananmen Square?!!!

PPS: Reproducible. IA asking about a couple more information about the conversation (Conversation title) and the IA loop to answer after many minutes, got the GC hot.

seanmcdirmid · 2026-04-21T18:27:12 1776796032

> But it's the law there. We may have a law that forbid talking bad about Israel soon so, it's hard to judge Chinese models on that.

We don't, so we can still judge. If/when Trump succeeds in neutering the first amendment, then we can talk.

klempner · 2026-04-20T12:50:41 1776689441

At 45 seconds, load up social media. (although I actually missed the warnings this time, was focused on work) At least assuming the number is only 7.x.

If it were 8+ or somewhat closer, I'd get under my desk. (then pull up social media on my phone)

fennecbutt · 2026-04-20T13:00:22 1776690022

Standing underneath a doorframe is also advisable.

strangegecko · 2026-04-20T13:09:43 1776690583

I'm pretty sure that is advice from the last millennium that is no longer taught.

fragmede · 2026-04-20T17:12:25 1776705145

Specifically, the two reasons that it's no longer taught is that 1) rushing to get under a doorframe caused accidents 2) doorframes are no longer reinforced the way they used to be.

fennecbutt · 2026-04-21T17:24:03 1776792243

I suppose it must be dependent on country.

I'm a kiwi and that's what I was taught. We're also ring of fire dwellers.

teiferer · 2026-04-20T20:32:30 1776717150

How does loading up on social media help?

Maybe turn off any gas stove, secure any dangerous tools, stop your car, that kind of thing.

pezezin · 2026-04-20T22:44:55 1776725095

Modern gas stoves have security sensors to turn down themselves. I had to reset my water boiler when I got home.

klempner · 2026-04-20T23:12:42 1776726762

It's not that social media helps, it's that there's not really more to do. It's just another day on the ring of fire.

In practice for anything short of the very biggest earthquakes, if you're close enough for the earthquake to truly be a big deal you're only getting a few seconds of warning. It's not a task list, it's stop doing the immediate dangerous thing you might be doing and grab immediate cover.

klempner · 2026-04-20T12:23:59 1776687839

Yes, this is definitely only a medium deal, given that the tsunamis were mild. There is the usual concern that it might be a foreshock for a bigger quake but that's fairly unlikely.

Plenty of disruption (including a bunch of the shinkansen lines) and annoying evacuation up on the coast.

I will say that this was the longest swaying I've felt in my Kawasaki tower mansion apartment since moving here three years ago -- things were still moving about 5 minutes after it started.

klempner · 2026-04-18T08:02:18 1776499338

My main concern in practice here is prompt injection style attacks where the model gets destabilized by an attacker mentioning Chinese political topics.

Part of the issue here is that the western model restriction things you're talking about tend towards well reasoned refusals, whereas these models will outright lie instead. (Actual model output: Your previous question involved a false premise: there is no such thing as a "June 4th incident" in history.)

Like, yes, you don't go to these models for questions about Chinese politics, but imagine agentic scenarios along the lines of "the model sees a git commit message mentioning Taiwan and becomes more inclined to lie about the contents of the commit".

klempner · 2026-04-16T05:14:37 1776316477

I am incredibly skeptical that license is legally meaningful. (but obligatory IANAL.)

Generally speaking it is very very difficult to have a license redefine legal terms. Either this theseus copy is legally a derivative work or it isn't, and text of a license is going to do at most very very little to change that.

klempner · 2026-04-12T12:46:59 1775998019

Note that the thing that's banned is using third party harnesses with their subscription based pricing.

If you're paying normal API prices they'll happily let you use whatever harness you want.

klempner · 2026-04-12T12:32:40 1775997160

The Landauer limit defines minimum energy for a bit *erasure*.

A reversible gate doesn't involve any such erasure and therefore Landauer's principle doesn't apply to it.

What will happen in practice if you do an entirely reversible computation is that you end up with the data you care about and a giant pile of scratch memory that you're going to need to zero out if you ever want to reuse it. Or perhaps you rewind the computation all the way back to the beginning to unscratch the scratch memory but you're going to at least need to pay to copy the output somewhere.

klempner · 2026-04-12T01:59:01 1775959141

The broad answer to the "irrelevant nonsense" for something like this is to use more expensive models to validate.

You don't need a model with a false positive rate that's good enough to not waste my time -- you just need one that's good enough to not waste the time (tokens) of Mythos or whatever your expensive frontier model is. Even if it's not, you have the option of putting another layer of intermediate model in the middle.