Well no, the idea is a tradeoff between interfaces and telemetry.
OK, so agents don't click the same way humans do. Once you learn that, what about mouse-hover telemetry, time spent, etc.? And one of the most extreme options is to force biometrics: a lot of telemetry, breaks the interface a lot, but hey, you have assurance.
And none of these tradeoffs require understanding the deep processes of the human mind. Just, the map is not the territory: how do you game the map harder and harder, and how do the mapmakers respond to that?
LLMs can solve original math problems at the IMO level and beyond, and you might be talking to one now. I don't think they are going to have problems with any CAPTCHA short of separate device attestation.
Whatever mechanism the paper proposes, rest assured it can be trained on.
5.2 still had a Codex variant, which this doesn't describe using. It also notably is not using the Codex harness -- it does everything with open source harnesses (which obviously are worse). And while it uses two harnesses with its cheap models, it only uses the worse-performing one of those with GPT 5.2 for cost reasons. (They also don't specify effort/thinking level used for GPT 5.2, but given that it performs worse in their baseline testing than obviously non-SOTA models, I'm guessing it wasn't set to anything high.)
What would unsupervised mean? Would it be something like AlphaGo playing against itself trillions of times?
Whereas self-supervised learning allows learning without explicit annotation of data; but does that matter if the models are already trained on the entire Internet, and it's not like a game where the model can come up with effectively new training data for itself?
Unsupervised is basically clustering. AlphaGo is RL: winning or losing a game is a form of supervision.
Unsupervised is something where there is no intrinsic reward signal. In pretraining, predicting the next token and seeing whether it matches is a reward signal, hence it is self-supervised.
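To make the "self-supervised" part concrete, here's a minimal sketch of where the targets come from in next-token prediction. The function name and the toy sentence are mine, purely for illustration; real pretraining uses tokenizers and much longer contexts. The point is only that nobody annotates anything: the label is the same text shifted by one position.

```python
def next_token_pairs(tokens, context_size=2):
    """Build (context, target) pairs from raw text.

    The targets are not annotations; they are the text itself, shifted by one.
    """
    pairs = []
    for i in range(context_size, len(tokens)):
        context = tuple(tokens[i - context_size:i])
        target = tokens[i]
        pairs.append((context, target))
    return pairs


# Toy "corpus": whitespace tokenization is an assumption made for brevity.
tokens = "the map is not the territory".split()
pairs = next_token_pairs(tokens)

for context, target in pairs:
    print(context, "->", target)
```

Clustering, by contrast, would take the same tokens and never get a "right answer" to compare against.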
Everything is an accident, an anecdote, only trust the state with your authoritative quantitative data! There are surely no philosophical issues with that! There are no issues with definitional authority!
And this is one of the many issues with invoking the logical positivists here...
I'm not even sure why they were invoked. Even disregarding the big technical debunks such as "Two Dogmas", sociologically and even by talking to real mathematicians (see Lakatos, historically, but this is true anecdotally too), it's (ironically) a complete non-question to wonder about mathematics in a logical positivist way.
Sorry, I thought this would've been a method for finding invariants, not one just for expressing them? I guess I should think about TLA+ as ultimately some kind of solver - give it a configuration, it tells me if it's defined well or not, the point is to make sure I'm not making mistakes, but not necessarily automated innovation?
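For what it's worth, here's how I picture the "checker, not inventor" idea, as a rough Python sketch. This is not how TLA+ or its tooling is actually implemented; the function names and the toy counter system are made up for illustration. I write down the invariant, and the checker only searches reachable states for a counterexample.

```python
from collections import deque


def check_invariant(initial, next_states, invariant):
    """Breadth-first search over reachable states.

    Returns the trace to the first violating state found, or None if the
    invariant holds in every reachable state.
    """
    queue = deque([(initial, (initial,))])
    seen = {initial}
    while queue:
        state, trace = queue.popleft()
        if not invariant(state):
            return trace
        for nxt in next_states(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, trace + (nxt,)))
    return None


# Hypothetical toy system: a counter that steps by 1 or 2, wrapping modulo 8.
def next_states(n):
    return [(n + 1) % 8, (n + 2) % 8]


# An invariant I guessed correctly: the checker finds no violation.
holds = check_invariant(0, next_states, lambda n: n < 8)

# An invariant that is false: the checker hands back a trace that breaks it.
violation = check_invariant(0, next_states, lambda n: n != 5)

print(holds)
print(violation)
```

Either way the invariant itself came from me; the tool only tells me whether I got it wrong.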
I've been working through the American pragmatist tradition - James, Peirce, Sellars, Rorty, Brandom.
Something about that background, all the discussions about definitions and representation, the original article talking about dualisms.... It's certainly an experience.
Sorry, wasn't there a post literally like a week ago about this being a long-term experimental branch and how we needed to not kick the hatchling while it's an egg?
I've noticed a certain type of comment on Hacker News in the past year: a deep suspicion that first calls out a surface behavior, then psychoanalyzes strangers with whatever the flavor-of-the-month "deep observation" is.
You can't be a dick on this platform without fancy prose I guess.