wongarsu · 2026-03-13T14:12:43 1773411163

The California law only stipulates that there's an "accessible interface at account setup" to set the birthday or age at account setup, and an interface to query the age bracket. Plus the crap for "application stores"

I don't think it's a very well thought-out law. But realistically this will end up as setting some env variable for your docker containers to assure them that you are 99 years old. And yes, maybe transmitting a header to docker hub that you are 99 years old. Probably configured via an env variable for the docker cli to use. It's stupid, but nothing a couple env variables wouldn't comply with

The real issue is when the law inevitably gets expanded to get some real teeth, and all the easy workarounds stop being legal

whywhywhywhy · 2026-03-13T14:20:00 1773411600

So once my application is running I can just keep querying an age bracket until it flips and then I've successfully determined a date of birth.

bee_rider · 2026-03-13T14:28:57 1773412137

This is a neat attack (in that it is obvious and a big flaw but also it makes sense that the lawmakers wouldn’t have thought of it), but it would only affect users who have an age-bucket transition while your application is running, right?

Edit: as folks have pointed out, the attacking application doesn’t actually have to be running while the age-transition takes place. The attacker just has to have logs from before and after the age transition, and then they can narrow the birth-date down.

bigfishrunning · 2026-03-13T14:46:02 1773413162

Not necessarily. Depending on how the application logs it, the resolution to which you know a birth date is limited only by how often the application is run. If I check my email every morning at 8am, and my email app logs my "age bucket", then it can know to a resolution of one day. If I only check my email on Monday mornings, it knows to a resolution of one week, etc...
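The logging attack described here can be sketched in a few lines. This is only an illustration: the bracket labels, the `BRACKET_MIN_AGE` table, and the log format are made up for the example, and it assumes the OS flips the bracket on the user's birthday. None of it comes from the law's text or from a real OS API.

```python
from datetime import date

# Hypothetical bracket labels mapped to the age at which a user enters them.
# Only the 16-17 and 18+ brackets are mentioned in this thread; the rest is assumed.
BRACKET_MIN_AGE = {"under-13": 0, "13-15": 13, "16-17": 16, "18+": 18}


def birthday_window(log):
    """Return (after, on_or_before, age_reached) for the first bracket change.

    `log` is a list of (observation_date, bracket) tuples sorted by date.
    The birthday fell after the last observation of the old bracket and on
    or before the first observation of the new one. Returns None if the
    bracket never changed during the logged period.
    """
    for (prev_day, prev_bracket), (day, bracket) in zip(log, log[1:]):
        if bracket != prev_bracket:
            return prev_day, day, BRACKET_MIN_AGE[bracket]
    return None


# An app that is only opened on Mondays narrows the birthday to one week.
weekly_log = [
    (date(2026, 3, 2), "16-17"),
    (date(2026, 3, 9), "16-17"),
    (date(2026, 3, 16), "18+"),
]
window = birthday_window(weekly_log)
```

With daily observations the same function would return a one-day window, which together with the age reached pins down the full date of birth.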

wongarsu · 2026-03-13T15:06:05 1773414365

The size of the age bracket also puts practical limitations on it. There is only one mandated bracket for everyone who's at least 18, preventing that attack on anyone who starts using your software after their 18th birthday. And if a 13 year old signs up it takes three years for you to observe the switch to the >=16,<18 bucket

bee_rider · 2026-03-13T15:49:12 1773416952

> And if a 13 year old signs up it takes three years for you to observe the switch to the >=16,<18 bucket

I think this is the big vulnerability in the scheme. This information is easy to track and log, so it is basically equivalent to giving away the DOB of everybody who is currently under 18 (at least, everybody who uses the system as intended). In the long run that’s everybody.

We could have a discussion about whether or not it would be fine for services to know every user’s DOB, but it is clearly giving away more information than the law intended.

> There is only one mandated bracket for everyone who's at least 18, preventing that attack on anyone who starts using your software after their 18th birthday.

I don’t think that fully recognizes the size of the problem, “using your software” is fuzzy. Companies get bought, identities get correlated, ad services collect and log more information than needed. I think it is better to assume the attacker will have logs of these queries from the start date of a person’s first account.

AlotOfReading · 2026-03-13T14:44:55 1773413095

Then you store the user age every time it's run and check for changes on start. Maybe that only gives you a 7 day range for birthdays, but you can narrow that over time and it's still good enough for targeting.

bee_rider · 2026-03-13T14:58:49 1773413929

I agree, sorry, I think my original comment was a little imprecise. My point was that the app can get the “exact” age only for users who undergo an age-bucket transition in an era that the app has logs for.

I mean, the app can query on a weekly basis, and then if you go from “under 18” to “over 18” it knows the week that you were born in. But, if the user was already an adult when the logging started, there isn’t a transition to go off.

gzread · 2026-03-13T15:05:24 1773414324

The UI can be implemented using the user's date of birth, but it can also be implemented by selecting an age bracket and then all it tells you is that the user changed the age bracket setting.

LtWorf · 2026-03-13T15:23:48 1773415428

Age brackets cannot update themselves.

pas · 2026-03-13T14:45:20 1773413120

is there any mention of granularity? so if the user sets their age bracket, then there's no DoB stored. if the user is old enough to fall into some other age bracket they can set that if they want. (and then somehow making this a bit more data driven - ie "verifying" - is a different matter altogether.)

bee_rider · 2026-03-13T15:03:40 1773414220

IIRC the age buckets were defined in the California law. They were something along the lines of age ranges that would intuitively map to adults, teenagers, and kids, I forget the exact borders.

I think the intent was for the OS to know the user age, but only provide an age range, so it could automatically upgrade people as they aged (but I could be wrong about that).

charlieo88 · 2026-03-13T14:58:13 1773413893

assuming it flips, and you aren't locked into that age bracket for the duration of your OS

browningstreet · 2026-03-13T14:28:31 1773412111

Gavin said he's open to amending the law. I hope someone's taking him up on that..

wongarsu · 2026-03-13T15:47:42 1773416862

Yeah, I think the idea of the law is fine. If you imagine "Operating System" to mean "things like Windows and iOS, or Desktop install of Fedora", "Application Store" to mean "Microsoft Store or AppStore or the like" and "Application" to mean "Word and Doom and stuff like that" then it's fine. Especially if you keep in mind that there isn't any actual verification of the age, it's simply set by whoever sets up the account

Most of the issues only arise because in the bill "operating system", "covered application store" and "application"/"developer" have very loose definitions that match lots of things where the law doesn't make sense.

parineum · 2026-03-13T14:37:21 1773412641

Gavin doesn't make laws and should have vetoed this one.

gzread · 2026-03-13T15:06:02 1773414362

Why, what's wrong with it?

vscode-rest · 2026-03-13T14:39:02 1773412742

Laws get made by whomever takes Gavin to the most dinners at the French Laundry. Don’t like this law? Good luck - reservations are booked out 6 months in advance.

slopinthebag · 2026-03-13T14:46:16 1773413176

> The real issue is when the law inevitably gets expanded to get some real teeth, and all the easy workarounds stop being legal

Which will happen. The road to hell is built one brick at a time.

kevin_thibedeau · 2026-03-13T14:30:13 1773412213

We'll call the query tool jackboot.

wongarsu · 2026-03-13T08:33:44 1773390824

AI facial recognition is smarter than what they are capable of. That's not the issue. It is much faster than a human, and state-of-the-art models make fewer errors than a human (though the types of errors are not the same).

The issue is that facial recognition is just not very reliable. Not for humans and not for machines. If you look at millions of people, some of them just look incredibly similar. Yet police apparently thought that was all the evidence they will ever need. A case so watertight there's no point in even talking to the suspect

wartywhoa23 · 2026-03-13T09:48:03 1773395283

So the sane solution here is just leaving unreliable stuff to humans and reliable stuff to machines. Especially so when human wellbeing and freedom are at stake.

To define the line between the two, calculate the percentage of cases when mainstream CPUs return anything but integer 4 after addition of integer 2 and integer 2, and use that as the threshold to define "reliable".

wongarsu · 2026-03-12T22:37:23 1773355043

It's meant as a "yes"/"instead, do ..." question. When it presents you with the multiple choice UI at that point it should be the version where you either confirm (with/without auto edit, with/without context clear) or you give feedback on the plan. Just telling it no doesn't give the model anything actionable to do

keerthiko · 2026-03-12T22:39:45 1773355185

It can terminate the current plan where it's at until given a new prompt, or move to the next item on its todo list /shrug

wongarsu · 2026-03-12T13:55:00 1773323700

A lot of asteroids are much less solid than we used to think. Some of them are big rocks, but many of them are just piles of sand- and gravel-sized material loosely held together by gravity. Clamps work great on the solid rock type, but many of the alternative methods - including smashing into it - work on asteroids of any composition

That's valuable not only for versatility, but also because it would really suck to send a spacecraft on a redirect mission only to find out that our assumptions about the asteroid's composition were wrong

wongarsu · 2026-03-12T13:47:58 1773323278

You know what they say: the best planetary offense is a good asteroid redirect program

It's also the best planetary terrorism, going by the plot of The Expanse

m4rtink · 2026-03-12T23:36:30 1773358590

It only worked in The Expanse because they expertly chose a special trajectory that made the rocks hard to detect, plus some questionable (but plot-necessary) "stealth coating".

By this point the UN and MCR have been in a cold war for 100+ years, staring each other down with region-killer nuke arsenals and an absurd amount of interceptors always ready. See that one time Mars actually fired a barrage - only like two warheads got through, and only due to a shitload of decoys and overall numbers.

Without the plot armor, a dumb rock would totally get vaporized at a safe distance.

Ajakks · 2026-03-13T04:42:19 1773376939

Ok. The Expanse is a show/book that doesn't actually portray any of this very well, but it's very important to note - there are no satellites in orbit with nukes or any kind of missiles - if you want to pretend there are, they are most definitely pointed at the earth.

I'd love to buy into that plot armor but there is too much to take seriously by S6. The reality is, the first time a colony decided it was independent enough to use an asteroid, they would pick one, or many, so as to render earth uninhabitable. There is no doing what they did in the show - that's how you lose a war AFTER having already used a weapon of last resort.

Once Inoki or w/e his name decided to use an asteroid, and one hits, the ONLY choice open to Earth is an immediate unconditional surrender. The only correct choice for asteroid #2 is one that will end all life on the planet without any doubt.

What's her name? The President would have killed us all and attained nothing doing so.

m4rtink · 2026-03-13T11:31:50 1773401510

I think given the technology that has been shown - a massive space and planetary infrastructure base, torpedoes with torch drives armed with nukes that would make Teller blush - I don't think you can actually use Dinosaur killer asteroid unnoticed.

That would be far too big to not be spotted by the many UN aligned sensor platforms all around Sol, well before it is actually on a collision course as changing the trajectory of something this massive could take a long time, not to mention for it to actually travel all the way to Earth on that trajectory.

I am sure that Belter Cheguevara was not the first one to get these ideas, so any major power not tracking most asteroid orbits in almost real time at this point would be stupid. The technology they demonstrated to have should easily allow that.

And by that point one of the many Ships UN has all around the system would just go there and shoot anyone working on the big rock to pieces. Possibly deploying tugs to change the trajectory to a safe one afterwards.

So I think they had to use rock small enough not to be easily tracked, that could be quickly accelerated + that special stealth coating from the Martians. Enough to kill a city and devastate a region but not much else.

redman25 · 2026-03-12T14:05:23 1773324323

First things first, we have to colonize the rest of the solar system before we can terrorize Earth.

dylan604 · 2026-03-12T14:30:39 1773325839

That totally depends on the type of super villain organization we're discussing. Some are willing to watch the Earth burn making the colonization step unnecessary. Others think humans are the problem and again would be willing to skip that step.

MisterTea · 2026-03-12T22:09:52 1773353392

> before we can terrorize Earth.

Before?! We're already doing a great job at it!

vova_hn2 · 2026-03-12T15:50:57 1773330657

It depends on the size of the asteroid and precision with which it can be aimed...

Ajakks · 2026-03-12T19:06:52 1773342412

Why exactly? I think the US ought to spend a few trillion on an actual space battleship - one that never comes down to the surface, just sits in orbit. There was a project regarding dropping telephone pole sized pieces of metal from space as an offensive weapon - put something like that on the space battleship and...

That is simply "Assured Destruction" with absolutely no mutual drawbacks or lingering consequences like radioactive wasteland. Just craters.

This is also something where the 1st country to achieve the "Space Battleship" could effectively prevent any other from also doing so...

In theory, Bezos or Musk could do it.

I don't understand why any country would bother with ground based military assets at this point.

wongarsu · 2026-03-12T22:31:00 1773354660

> That is simply "Assured Destruction" with absolutely no mutual drawbacks

Nuclear countries would simply declare that they will launch nukes if any rod comes down on their territory. Even if you had thousands of projectiles in orbit (at considerable cost per projectile) this would not be significantly different from 60s-style MAD: put nukes in bunkers, in the air and in the sea to ensure they can't all be taken out. We might see the return of strategic bombers that stay in the air for weeks at a time.

Alternatively they can just shoot down your battleship with anti-satellite weapons. The risk of retaliation might be worth preventing the disadvantaged position in the long term

Ajakks · 2026-03-13T02:23:13 1773368593

That reaction is not the same tho - a rod isn't even a conventional weapon, I am not certain off hand that such an incredibly destructive weapon would even be banned under current treaties. That matters bc you're talking about the end of the world. Only Russia would ever shoot at the US - so, don't drop rods on Russia.

Plus - if countries don't do space wars - this will still happen 100%. It will just be a non-state actor - who do you nuke if Austin Powers is the bad guy from space?

Also, there seems to be a prevailing sense of "we'll just shoot it down" and that is actually extraordinarily unlikely - bc of all the space, in space. I wouldn't sit in orbit with my Space Battleship - maybe a lunar orbit.

Let's say I park halfway to the moon - ALL of my missiles will still hit earth, I don't think current defense systems would have any better odds - what's the difference between an ICBM that enters the atmosphere from space - shot from a silo or a spaceship?? Not much, functionally identical to the Space Battleship... missiles from earth tho, will be like in slow motion, the space battleship ought to be able to literally shoot them down with bullets - none will be able to surprise the space battleship, how do you even do a missile defense overwhelm tactic in such a situation - I can move the spaceship you know.

I may sound like I'm being unserious, but in reality, this is absolutely the future of warfare 100% - I can't be more serious, the humor is bc this topic makes me legitimately nervous.

ryandrake · 2026-03-13T02:35:55 1773369355

Heinlein's The Moon Is A Harsh Mistress kind of mapped out what to expect if/when your adversary manages to position themselves significantly above you in the Earth's gravity well.

Ajakks · 2026-03-13T03:39:15 1773373155

What an incredibly foresightful work - I have not read that, I will tho. Thanks!

And yeah, it is perhaps the most extremely imbalanced strategic advantage that can be attained.

bdamm · 2026-03-12T21:27:13 1773350833

You've described a space station, which three countries have already done independently (Mir, Skylab, Tiangong).

But dropping rods from an orbiting platform makes no sense. There's a reason that "Rods from God" didn't pan out, and it has to do with orbital dynamics. Neither Bezos nor Musk can do it, because it actually doesn't work.

Ajakks · 2026-03-12T21:48:44 1773352124

I doubt it was seriously considered at the time it was discussed. Space Stations are in orbit - the space battleship doesn't have to be, that is very significant.

Earth is spinning in a giant circle around the sun. Thats facts. "aiming an asteroid" is less of making a rock a missile - and a lot more of tug-boating it into the exact right spot, in the way of earth, so that earth hits the asteroid - not anything complicated like the asteroid hitting earth.

There are a lot of little things like that...

m4rtink · 2026-03-12T23:26:20 1773357980

Any realistic space warship design will need propellant - sure you can avoid ground based interceptors and kill sats but it will eat into your propellant reserves over time.

You will need to replenish from somewhere & that somewhere might as well get nuked instead of the ship, rendering it useless.

aw1621107 · 2026-03-12T22:11:56 1773353516

> Space Stations are in orbit - the space battleship doesn't have to be

I mean, you did say:

> space battleship - one that never comes down to the surface, just sits in orbit.

So I think it's understandable for people to take that at face value.

Furthermore, if it isn't in orbit, then where would it be?

> and a lot more of tug-boating it into the exact right spot, in the way of earth, so that earth hits the asteroid - not anything complicated like the asteroid hitting earth.

From an orbital mechanics standpoint I don't think there's actually a difference. You're changing an orbit either way.

Ajakks · 2026-03-13T02:34:14 1773369254

If I were holding earth hostage with my Space Battleship - I would sit in a lunar orbit. Also, I am not kidding about tug-boating - if I fly up, match an asteroid's velocity, why can't I just throw a tow strap on that, accelerate, and park it in an area that only has to be accurate enough for a planet to hit it - I don't need to stop it, or have it flying at the earth, it only needs to be in the way, moving a little slower than the earth.

What if I make that the space battleship's job? What if a drone can do that?

Im not really worried about resupplying the space battleship holding earth hostage -> someone will "volunteer" to do that, bc they want to live life.

aw1621107 · 2026-03-13T04:54:32 1773377672

> I would sit in a lunar orbit

Ah, so by "orbit" you were talking about orbit around Earth specifically?

> why can't I just throw a tow strap on that, accelerate, and park it in an area that only has to be accurate enough for a planet to hit it - I don't need to stop it, or have it flying at the earth, it only needs to be in the way, moving a little slower than the earth.

Again, from a high-level orbital mechanics perspective there is little difference between the two. You start with two non-intersecting orbits and you're changing one orbit to intersect the other at the same time and place. How you go about doing so is just a question of how much time/fuel you're willing to expend, for various values of "just".

That being said, assuming I'm interpreting you correctly what you propose is probably technically possible (e.g., change an asteroid's orbit to a slightly-larger-than-Earth-sized one), but it's also very fuel-intensive compared to skipping the "parking"/"in the way" part.

If you haven't tried it already I can't recommend Kerbal Space Program enough for experimenting with this kind of thing, especially if you are alright with playing with mods. Real Solar System (changes the in-game solar system to match our real-life one) and Principia (replaces the simplified patched conics system KSP uses for orbits with n-body gravity) would be particularly relevant here.

Ajakks · 2026-03-13T06:23:48 1773383028

I absolutely will check out Kerbal - I have done nothing more than thought experiments - which I'm sure is obvious, it's obvious to me. I'm sure I am saying things exactly wrong - the idea is to save fuel and remove all of the difficulties that may arise with timing or aiming. Using more fuel is exactly the opposite of the intent.

I may be confused but I dont mean a "larger orbit than the earth" -> I mean the exact identical orbit, the exact path that earth takes around the sun -> ahead (or behind, it does not matter) of where we are and instead of 365 days to circle the sun, the asteroid is moving at a rate that will take MORE days -> so the earth will smash into the asteroid, bc it can't do anything else. I dont mean "park" in the sense that I stop its movement, nor would I select an asteroid that has such an orbit that it couldn't be manipulated into position with little difficulty.

Like, imagine the solar system was a record on record player (I've never used one either) and the earth is on a line/groove - a choice asteroid is moving in the same direction on an immediately adjacent line/groove - the asteroid only needs to move onto the earth's groove (anywhere on that specific groove the earth occupies on the record works) and then the asteroid is then sped up or slowed down (not much tho) on that exact orbit -> either will result in a collision with earth.

The only real way to stop such activities is with spaceships. That is my entire argument - you are saying that is less feasible than making a missile out of an asteroid? I appreciate the explanation fr

Tbh, it wasn't until the game Terra Invicta that I really considered the solar system, as it actually is. That game has no other relevance to this particular conversation - good game, very different kind of 4x that I recommend but unrelated.

aw1621107 · 2026-03-13T06:54:38 1773384878

> I mean the exact identical orbit, the exact path that earth takes around the sun -> ahead (or behind, it does not matter) of where we are and instead of 365 days to circle the sun, the asteroid is moving at a rate that will take MORE days

Unfortunately that's not really possible. To a first approximation, Earth's orbit is a circle with the Sun at its center, and the size of that circle is determined entirely by Earth's orbital speed around the Sun. Assuming you're also in a circular orbit, if you move at Earth's speed, the size of your orbit will be the same as that of Earth's. If you move faster or slower, your orbit will be smaller or larger, respectively, unless you wish to continuously burn fuel to maintain your distance from the Sun. That's why I said the asteroid's orbit must be slightly larger than that of Earth's for an Earth-catches-up-to-asteroid-in-similar-orbit scenario.

Obviously things get more complicated once you consider non-circular orbits, but the end result is similar - you can't continuously hang out in Earth's path while moving slower than the Earth around the Sun without burning a stupendous amount of fuel.

> you are saying that is less feasible than making a missile out of an asteroid? I appreciate the explanation fr

I think it's more that I think that "making a missile" is likely to require less fuel since you only need to adjust the asteroid's orbit ~once (only need to get it on a collision course) instead of ~twice (get the asteroid on a near-collision course, then adjust it again for the "right" kind of collision).

Ajakks · 2026-03-13T07:34:41 1773387281

I can't reply to your other comment - that is what I assumed you were saying but it does not make sense to me outside the process that naturally occurs - I'm assuming the sun's gravity simply can't move objects of such different mass at the same rate, and thereby the orbit and position change accordingly?

The speed doesn't have to be much different - 366 days and earth will eventually hit asteroid - 364 days and it will eventually hit the earth.

Ahh, I'm still having a hard time figuring out why that would take more energy - I'm going to be researching this all morning tomorrow.

Thanks for the help!

aw1621107 · 2026-03-13T08:34:04 1773390844

> I'm assuming the sun's gravity simply can't move objects of such different mass at the same rate, and thereby the orbit and position change accordingly?

Kind of? An object moving in a circular motion at a constant speed must have an acceleration towards the center of the circle of (velocity^2)/(radius). This means that two objects in the same circular orbit moving at different speeds must be experiencing different accelerations towards the center of the circle.

In the simplified case of orbits around the Sun, that acceleration towards the center of the orbit is due to the Sun's gravity. However, gravity accelerates everything at a given distance at the same rate. As a result, you can't have two objects solely influenced by the Sun's gravity that orbit around the Sun with the same orbital shape but moving at different speeds. You'd need something in addition to the Sun's gravity to pull that off.
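The argument above can be checked numerically. The sketch below uses standard values for the Sun's gravitational parameter and the mean Earth-Sun distance; the function names are illustrative, and it assumes a perfectly circular orbit with only the Sun's gravity acting.

```python
import math

GM_SUN = 1.32712440018e20  # m^3/s^2, gravitational parameter of the Sun
AU_M = 1.495978707e11      # m, mean Earth-Sun distance


def gravity_accel(radius_m):
    """Acceleration the Sun's gravity supplies at this distance (mass-independent)."""
    return GM_SUN / radius_m**2


def required_accel(speed_m_s, radius_m):
    """Inward acceleration needed to stay on a circle: v^2 / r."""
    return speed_m_s**2 / radius_m


def circular_speed(radius_m):
    """The one speed at which v^2 / r equals GM / r^2."""
    return math.sqrt(GM_SUN / radius_m)


v_earth = circular_speed(AU_M)  # roughly 29.8 km/s

# Moving 1% slower on the same circle needs less inward acceleration than
# gravity supplies, so the object is pulled inward and leaves the circle.
surplus = gravity_accel(AU_M) - required_accel(0.99 * v_earth, AU_M)
```

A positive `surplus` means the slower object cannot hold Earth's circle without some extra outward force, which is the "something in addition to the Sun's gravity" mentioned above.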

> The speed doesn't have to be much different - 366 days and earth will eventually hit asteroid - 364 days and it will eventually hit the earth.

Sure. When I said slightly-larger-than-Earth-sized orbit, I really meant it. Kepler's third law of planetary motion states (approximately) that (orbital period)^2 is proportional to (radius)^3. Assuming I did my math correctly, if your orbital period goes from 365 to 366 days your orbital radius gets ~0.18% larger, which is roughly 274000 km increase over the radius of Earth's orbit. That would fit inside the Moon's orbit (~385000 km from the Earth)!
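The Kepler's-third-law arithmetic can be reproduced with a short sketch. It assumes circular orbits, a 365-day baseline as in the comment, and 1 AU for Earth's orbital radius; the function name is just for illustration.

```python
AU_KM = 149_597_870.7  # km, mean Earth-Sun distance


def radius_increase_km(period_days, base_period_days=365.0, base_radius_km=AU_KM):
    """Extra orbital radius needed for a longer period, using T^2 proportional to r^3."""
    radius_ratio = (period_days / base_period_days) ** (2.0 / 3.0)
    return (radius_ratio - 1.0) * base_radius_km


extra_km = radius_increase_km(366.0)
percent_larger = 100.0 * extra_km / AU_KM
```

Under these assumptions `extra_km` comes out at about 273,000 km (about 0.18%), in line with the rough figure above and still inside the Moon's orbital distance.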

> Ahh, I'm still having a hard time figuring out why that would take more energy

At least the way I was thinking, the short answer is that one alteration to an orbit is likely to be cheaper than two, especially if you aren't particularly concerned in what manner the asteroid eventually collides with Earth.

ryan_j_naughton · 2026-03-12T22:26:14 1773354374

> There's a reason that "Rods from God" didn't pan out, and it has to do with orbital dynamics. Neither Bezos nor Musk can do it, because it actually doesn't work.

Can you say more on this? Thanks!

MisterTea · 2026-03-12T22:15:05 1773353705

> . There was a project regarding dropping telephone pole sized pieces of metal from space as an offensive weapon

I remember it was nicknamed "Rods From God". Kinetic energy weapon using 9 ton tungsten rods dropped from an orbiting platform. https://en.wikipedia.org/wiki/Kinetic_bombardment
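As a rough sanity check on the 9-ton figure, the mass of a solid tungsten cylinder is easy to estimate. The dimensions below (about 20 ft long, 1 ft in diameter) are an assumption often quoted for this concept, not something stated in this thread, and the density is the usual handbook value for tungsten.

```python
import math

TUNGSTEN_DENSITY_KG_M3 = 19_250  # approximate density of tungsten
FOOT_M = 0.3048
SHORT_TON_KG = 907.18474


def rod_mass_kg(length_m, diameter_m, density_kg_m3=TUNGSTEN_DENSITY_KG_M3):
    """Mass of a solid cylinder: density * pi * r^2 * length."""
    radius_m = diameter_m / 2.0
    return density_kg_m3 * math.pi * radius_m**2 * length_m


# Assumed dimensions: 20 ft long, 1 ft in diameter.
mass_kg = rod_mass_kg(20 * FOOT_M, 1 * FOOT_M)
mass_short_tons = mass_kg / SHORT_TON_KG
```

With those assumed dimensions the result is a bit over 8,500 kg, which is consistent with the "9 ton" description.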

bhhaskin · 2026-03-12T20:54:08 1773348848

The technology doesn't exist and it would be a huge waste of money.

How heavy would a telephone pole sized tungsten rod be?

What happens when China, Russia, India or Pakistan find out you are building this (cause you can't hide it if it's in near earth orbit)? They would either knock it out of the sky or hit you with everything they have. We would do the exact same if anyone else was developing such a weapon.

Ajakks · 2026-03-12T21:14:14 1773350054

I personally would get whatever metal in space, so weight is not the issue - solving this problem would also create almost immediately chunks of rocks that could also be dropped. In all reality, anything can be "setup" to be a weapon - many ways have been identified here.

All required innovations - of which, most are not out of reach in the slightest, all of that tech would be immensely valuable, literally everything we do to secure space superiority will be actual gains - not smaller microchips equivalent innovations - entirely new machines, entirely new economies of scale - there is no equivalent military tech that we can develop on earth.

Not only is there really no conceivable way to ignore the strategic advantage once considered, the long-term economic payoff is actually reason enough alone to pursue the radical idea of a "space battleship" - I can think of about 20 ways to cause significant global issues with one measly space battleship.

As a hypothetical alone, it has reason enough to warrant a substantial amount of the 1.5 trillion defense budget the Pentagon plays with.

anigbrowl · 2026-03-12T21:40:27 1773351627

If this is satire, it's not that funny. If you're serious, it's a good example of 'the ugly American.'

Ajakks · 2026-03-13T02:58:14 1773370694

I wish it were satire - we do actually need to have space defenses, asteroids exist 1st off - it's just bc we ignore all the craters that we sleep at night.

We feel safe here on earth but it's really a giant graveyard trap - that so effectively exerts control over life on it, that it made all living mammals out of mice - we may actually be safer on almost any other planet.

All life on earth has eventually died out so far, we are the 1st species that could stop the most likely extinction level event - but this DART is the closest we ever got to actually taking up that responsibility - the preservation of our species and whatnot, thats just 1 minor reason.

The most important tho, given how much we have example of people "getting theirs" at all other peoples expense - this is much worse if a non-state actor gets there 1st.

Lastly, I do have to clarify the American position - we run the world, or there will not be one to run. Nobody alive today made that decision - it changes nothing, once that choice was made, we are locked into it. Did you think we are only an economic power? That is the front. We can always pivot to actual power - the kind that can destroy all cities above a certain size - we have never hid this fact, the whole world knows of MAD. That is what power is.

What is American power if someone can destroy the US and we can't destroy them?? That doesn't work for the US - nothing at all changes if the US gets that spaceship first.

You can call this ugly - there were more modern wars before we started running things, from the looks of things - the whole world will go to war the moment we are out of the picture.

m4rtink · 2026-03-12T23:22:06 1773357726

The story of Footfall is basically about that - an alien space invasion force with a torch-drive-powered space battleship in orbit.

There are ways to battle that - ballistic missile submarines for one, and then "Project Michael", which would be a massive spoiler to elaborate on. ;-)

wongarsu · 2026-03-12T12:57:05 1773320225

I don't find this very compelling. If you look at the actual graph they are referencing but never showing [1] there is a clear improvement from Sonnet 3.7 -> Opus 4.0 -> Sonnet 4.5. This is just hidden in their graph because they are only looking at the number of PRs that are mergable with no human feedback whatsoever (a high standard even for humans).

And even if we were to agree that that's a reasonable standard, GPT 5 shouldn't be included. There is only one data point for all OpenAI models. That data point is more indicative of the performance of OpenAI models (and the harness used) than of any progression. Once you exclude it, the data matches what you would expect from a logistic model. Improvements have slowed down, but not stopped

1: https://metr.org/assets/images/many-swe-bench-passing-prs-wo...

yorwba · 2026-03-12T13:14:42 1773321282

Yes, I think this is basically an instance of the "emergent abilities mirage." https://arxiv.org/abs/2304.15004

If you measure completion rate on a task where a single mistake can cause a failure, you won't see noticeable improvements on that metric until all potential sources of error are close to being eliminated, and then if they do get eliminated it causes a sudden large jump in performance.

That's fine if you just want to know whether the current state is good enough on your task of choice, but if you also want to predict future performance, you need to break it down into smaller components and track each of them individually.
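To make that concrete, here is a minimal sketch, assuming a task is a chain of independent steps that must all succeed. The step count of 50 and the listed per-step accuracies are made-up illustration values, not numbers from the METR data.

```python
def task_success_rate(step_accuracy: float, steps: int) -> float:
    """Chance of finishing a task when every one of `steps` steps must succeed.

    Assumes the steps fail independently, which is a simplification.
    """
    return step_accuracy ** steps


STEPS = 50  # hypothetical task length

# Per-step accuracy improves steadily; whole-task success does not.
for p in [0.80, 0.85, 0.90, 0.95, 0.99]:
    print(f"per-step {p:.2f} -> whole-task {task_success_rate(p, STEPS):.4f}")
```

Under these assumptions the whole-task number stays close to zero across most of the range and only climbs steeply once per-step accuracy is nearly perfect, even though the per-step numbers improve smoothly the whole way.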

thesz · 2026-03-13T00:04:21 1773360261

  > until all potential sources of error are close to being eliminated

This is what PSP/TSP did - one has to (continually) review one's own work to identify the most frequent sources of (user facing) defects.

  >  if you also want to predict future performance, you need to break it down into smaller components and track each of them individually.

This is also one of the tenets of PSP/TSP. If you have a task with an estimate longer than a day (8 hours), break it down.

This is fascinating. The LLM community is discovering PSP/TSP rules that were laid out more than twenty years ago.

What the LLM community misses is that in PSP/TSP it is the individual software developer who is responsible for figuring out what they need to look after.

What I see is that it is LLM users who try to harness LLMs against what they perceive as errors. It's not that LLMs are learning, it is that users of LLMs are trying to strong-arm these LLMs with prompts.

aspenmartin · 2026-03-13T13:21:36 1773408096

I don’t know that it’s fair to characterize the LLM community as being ignorant and rediscovering PSP/TSP. I in fact see that as programmers rediscovering survival analysis, and most LLM folks I know have learned these perspectives through that lens. Could be wrong about PSP, maybe things are more nuanced? But what is there that isn’t already covered by foundational statistics?

maest · 2026-03-13T04:56:59 1773377819

What is PSP/TSP?

kqr · 2026-03-13T05:08:42 1773378522

One of many ways people have branded the idea of process improvement for software engineering.

Bombthecat · 2026-03-12T21:36:01 1773351361

That's how the public perceives it though.

It's useless and never gets better until it suddenly, unexpectedly gets good enough.

ForHackernews · 2026-03-12T22:43:18 1773355398

My robo-chauffeur kept crashing into different things until one day he didn't.

Mielin · 2026-03-13T07:52:38 1773388358

A robot vacuum is allowed to crash into things and is still quite useful. You add bumpers, maybe some sort of proximity sensors to make the crash less damaging. It is safe by construction - it can't harm humans because it is too small.

Things have improved a bit? Now robot shelves become a possibility. Map everything, use more sensors, restrict humans to a particular area only. Still quite useful. It is safe by design of areas, where humans rarely walk among robots.

Improved further? Now we can do a food delivery service robot. Slow down a bit, use many more sensors, think extra hard about how to make it safer. Add a flag on a flagpole. Rounded body. Collisions are probably going to happen. Make the robot lighter than humans so that the robot takes more damage than the human in a collision. Humans are vulnerable to falling over - make the robot's height just right to grab onto to regain balance, somewhere near waist height.

Something like that... Now I wish this were an actual progress requirement for a robo taxi company to meet before they start releasing robo taxis onto our streets. But at least we do it as mankind: algorithm improvements and safety solutions still benefit the whole chain. And the benefit to humanity grows despite it being not quite good enough for one particular task.

roxolotl · 2026-03-12T13:05:53 1773320753

I don't know, that graph to me shows Sonnet 4.5 as worse than 3.7. Maybe the automated grader is finding code breakages in 3.7 and not breaking that out? I'd much prefer to add code that is a different style to my codebase than code that breaks other code. But even ignoring that, the pass rate is almost identical between the two models.

wongarsu · 2026-03-12T11:31:20 1773315080

The 1B number would contain multiple records per person.

For example if I (as a German in Germany, ymmv) open a bank account online that involves a call with one of these companies where they take pictures and information from my passport and check that that's me. Then I choose payment in installments on some online shop, same game. Apply for a small loan? Same game. Set up an account for trading (stock exchange or crypto)? You guessed it, another call. Another payment in installments, backed by the same bank? Apparently verifying my identity again is easier than checking their database. Each of those is another record. Potentially with a new identity document, address or even name (maybe you got married) but mostly just the same data confirmed again with another timestamp

Not all of them use the same identity verification service, but there aren't that many. And I wouldn't be surprised to learn that many are the same company under different brands

wongarsu · 2026-03-12T11:00:08 1773313208

It's pretty fascinating to look at the impacts this has had in the last 2000 years, or even just the last 200.

Take construction work. Incredible improvements through power tools, gasoline-powered mobile cranes, etc. The productivity per worker has exploded. A lot of this has been captured by induced demand: we build bigger, taller, grander. But the improvements aren't distributed equally. Which means that crafts that haven't seen much improvement are now more expensive in comparison to everything else. Which has contributed to our buildings having less elaborate facades and becoming more "bland"

The same in clothing. Clothing has become dirt cheap. Even the poorest people can afford new clothing multiple times a year. But in the same transition we have gone from everything being custom tailored to most things only kind of fitting, being made for variations of the most common body shapes. Not necessarily because tailored clothing has become much more expensive (though higher labor costs from higher average productivity haven't helped), but because every other step has become cheaper and tailoring hasn't.

I wonder what we will say about the trajectory of software in a couple decades

sally_glance · 2026-03-12T18:03:35 1773338615

That's a great angle - will handcrafted software of the future become the equivalent of a tailored suit today? One might argue it already is, most companies and individuals do just fine using cloud/SaaS offerings and COTS apps. So on first glance it seems like automating software engineering would mainly benefit exactly those providers. The other side of the coin is that it also allows for cheaper/faster in-house DIY solutions and competition.

wongarsu · 2026-03-12T23:02:08 1773356528

Yeah, I could see a world where it swings exactly the opposite way for software. Writing software for yourself is becoming cheap, but gathering requirements, getting alignment between stakeholders or marketing your software isn't getting much cheaper. Maybe everyone will end up with their own in-house solution? Or maybe we end up with configurable SAP-like behemoths, but instead of an army of expensive consultants configuring the software for your use case you have AI agents taking that part

I'm sure whatever path this takes will seem obvious in hindsight

wongarsu · 2026-03-11T15:00:56 1773241256

Not in terms of people printing lego bricks. But at least as an adult, designing things in Fusion and printing them scratches a similar itch as building lego. And 3d printing is now pretty accessible to the 14+ age group. I doubt this will completely replace legos, or that it's even their biggest threat, but I'd be surprised if it had no impact

antonyh · 2026-03-11T15:28:13 1773242893

Framed that way yes, but wouldn't it be cool to 3D print interlocking parts that can be reassembled in different ways?

wongarsu · 2026-03-11T14:56:06 1773240966

> I heard your same rant in the 1980s

The two options would be that either the perception is unsubstantiated but persists, or there has been a continuous decline for the last 40 years. I'm strongly leaning towards the latter. I also had the same issues in the 00s looking at old sets from the 80s, and looking back now the 00s look much better than what we have today. Obviously not in every way, and not all recent sets were bad. But overall I have the feeling that there's been a steady trend that the bricks got better but the sets got worse

bluGill · 2026-03-11T15:17:55 1773242275

Lego was always very expensive. They have long made weird custom pieces and those sets have sold well - despite not having the long staying ability that the more basic sets have.

iso1631 · 2026-03-11T15:00:47 1773241247

Nostalgia ain't what it used to be

wongarsu · 2026-03-11T15:03:39 1773241419

Maybe my perception of 00s models is colored by nostalgia, hard to know. But I haven't been alive in the 80s, so my perception of them during the 00s should be pretty uncolored

bombcar · 2026-03-11T16:55:45 1773248145

My recall was that the 90s was pretty awesome, and the 00s fell into BURPS and large pieces and tie-in sets.

But I think most people either agree there was a dark ages where they went almost bankrupt and did some really questionable themes, or the best time was when they were a kid.

The 90s catalogs rocked in a way that no website ever can, though.

iso1631 · 2026-03-12T10:34:04 1773311644

People have complained about things getting worse literally for millennia

Either 2000 years ago life was great, or people saying "things were better" tend to be wrong

https://historyhustle.com/2500-years-of-people-complaining-a...

Things are different, people want different things today than they did in the 80s. The top lego sets of the 80s were bought by rich parents for their kids. Adults didn't buy lego for themselves.

Today they're bought by rich adults for themselves, who want different things

Meanwhile kids aren't as bored. In the early 90s as a 10 year old I used to watch soap operas with my parents, because there wasn't much more to do at 8pm on a wet December evening. That was evidenced by Coronation Street getting 17 million viewers, well over 1 in 4 people in the country.

Today the same program gets 2 million viewers, nearer to 1 in 40.

That's not because it's materially worse, but because there's more things to do.

Lego has the same issue. In the 80s as a kid there was little to do at home, so things like lego, meccano, spirograph took up the time. Today there's a lot more to do.

There's a reason 80s lego sets aren't as popular as they were in the 80s - the actual sets - and it's clearly not the quality (as a lego 80s set is the same today as it was 40 years ago).