> We really have no idea how the ability to have a conversation emerged from predicting the next token.
Maybe you don't. To be clear, this is benefiting massively from hindsight, just as, if I didn't know how combustion engines worked, I probably wouldn't have dreamed up how to make one, but the emergent conversational capabilities of LLMs are pretty obvious. In a massive dataset of human writing, the answer to a question is by far the most common thing to follow a question. A normal conversational reply is the most common thing to follow a conversation opener. While impressive, these things aren't magic.
>In a massive dataset of human writing, the answer to a question is by far the most common thing to follow a question.
No it isn't. Type a question into a base model, one that hasn't been finetuned into being a chatbot, and the predicted continuation will be all sorts of crap, but very often another question, or a framing that positions the original question as rhetorical in order to make a point. Untuned raw language models have an incredible flair for suddenly and unexpectedly shifting context - one might output an answer to your question, then suddenly decide that the entire thing is part of some internet flamewar and generate a completely contradictory answer, complete with insults to the first poster. It's less like talking with an AI and more like opening random pages in Borges's infinite library.
To get a base language model to behave reliably like a chatbot, you have to explicitly feed it "a transcript of a dialogue between a human and an AI chatbot", and allow the language model to imagine what a helpful chatbot would say (and take control during the human parts). The fact that this works - that a mere statistical predictive language model bootstraps into a whole persona merely because you declared that it should, in natural English - well, I still see that as a pretty "magic" trick.
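To make that concrete, here's a minimal sketch of the transcript framing, assuming the Hugging Face transformers library and GPT-2 as a stand-in base model; the framing text is just an illustration, not any lab's actual prompt:

    # A base model only continues text; the "transcript" framing is what makes
    # the continuation look like a chatbot's turn. (GPT-2 here is a stand-in
    # for any untuned base model.)
    from transformers import pipeline

    generate = pipeline("text-generation", model="gpt2")

    prompt = (
        "The following is a transcript of a dialogue between a human and a "
        "helpful AI chatbot.\n"
        "Human: What causes the seasons on Earth?\n"
        "AI:"
    )

    out = generate(prompt, max_new_tokens=60, do_sample=False)
    print(out[0]["generated_text"])

The model never stops being a next-token predictor; you just have to cut generation off before it "takes over" the next Human: turn.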
>No it isn't. Type a question into a base model, one that hasn't been finetuned into being a chatbot, and the predicted continuation will be all sorts of crap, but very often another question, or a framing that positions the original question as rhetorical in order to make a point.....
To be fair, only if you pose this question singularly with no preceding context. If you want the raw LLM to answer your question(s) reliably, you can prepend other question-answer pairs to the context and it works fine. A raw LLM is already capable of being a chatbot or anything else with the right preceding context.
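For instance, a minimal sketch with GPT-2 (via the Hugging Face transformers library) standing in for a raw model; the Q/A pairs are made up:

    # Few-shot prompting: prepended question-answer pairs make an answer the
    # statistically likely continuation of the final question.
    from transformers import pipeline

    generate = pipeline("text-generation", model="gpt2")

    prompt = (
        "Q: What is the capital of France?\n"
        "A: Paris.\n"
        "Q: How many legs does a spider have?\n"
        "A: Eight.\n"
        "Q: What gas do plants absorb from the air?\n"
        "A:"
    )

    out = generate(prompt, max_new_tokens=10, do_sample=False)
    print(out[0]["generated_text"])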
Right, but that was my point - statistically, answers do not follow questions without some establishing context, and as such, while LLMs are "simply" next word predictors, the chatbots aren't - they are Hofstadterian strange loops that we will into being. The simpler you think language models are, the more that should seem "magic".
They're not simple though. You can understand, in a reductionist sense, the basic principles of how transformers perform function approximation; but that does not grant an intuitive sense of the nature of the specific function they have been trained to approximate, or how they have achieved this approximation. We have little insight into what abstract concepts each of the many billions of parameters map on to. Progress on introspecting these networks has been a lot slower than trial-and-error improvements. So there is a very real sense in which we have no idea how LLMs work, and they are literally "magic black boxes".
No matter how you slice it - if "magic" is a word which can ever be applied to software, LLM chatbots are sure as shit magic.
If such a simplistic explanation were true, LLMs would only be able to answer things that had been asked before, and where at least a 'fuzzy' textual question/answer match was available. This is clearly not the case. In practice you can prompt the LLM with such a large number of constraints, so many that the combinatorial explosion ensures no one has asked that before, and you will still get a relevant answer combining all of them. Think of combinations of features in a software request - including making some module that fits into your existing system (for which you have provided source) along with a list of requested features. Or questions you form based on a number of life experiences and interests that combined are unique to you. You can switch programming language, human language, writing style, or level of abstraction as you wish, and discuss it in super esoteric languages or Morse code.

So are we to believe these answers appear just because there happened to be similar questions in the training data where a suitable answer followed? Even if for the sake of argument we accept this explanation by "proximity of question/answer", it is immediately clear that it would have to rely on extreme levels of abstraction and mixing and matching going on inside the LLM. And it is then this process that we need to explain, whereas the textual proximity you invoke relies on it rather than explaining it.
I think you're confusing OP with the people who claim that there is zero functional difference between an LLM and a search engine that just parrots stuff already in it. But they never made such a claim. Here, let me try: the simplest explanation for how next-token estimation leads to a model that often produces true answers is that for most inputs, the most likely next token is true. Given their size and the way they're trained, LLMs obviously don't just ingest training data like a big archive; they contain something like an abstract representation of tokens and concepts. While not exactly like human knowledge, the network is large and deep enough that LLMs are capable of predicting true statements based on preceding text. This also enables them to answer questions not in their training dataset, although accuracy obviously suffers the further you deviate from known topics. For most questions, the most likely next token is the true answer, so they essentially ended up being trained to estimate truth.
I'm not saying this is bad or underwhelming, by the way. It's incredible how far people were able to push machine learning with just the knowledge we have now, and how they're still making progress. I'm just saying it's not magic. It's not something like an unsolved problem in mathematics.
No one ever made the claim it was magic, not even remotely. Regarding the rest of your commentary: a) The original claim was that LLMs were not understood and are a black box. b) Then someone claims that this is not true, and that they know well how LLMs work - it is simply due to questions and answers being in close textual proximity in training data. c) I then claim this is a shallow explanation, because you then need to additionally invoke a huge abstraction network - which is itself a black box. d) You seem to agree with this while at the same time saying I misrepresented "b" - which I don't think I did. They really claimed they understood it and only offered this textual proximity thing.
In general, every attempt at explaining LLMs that appeals to "[just] predicting the next token" is thought-terminating and automatically invalid as an explanation. Why? Because it confuses the objective function with the result. It adds exactly zero over saying "I know how a chess engine works, it just predicts the next move and has been trained to predict the next move" or "A talking human just predicts the next word, as it was trained to do". It says zero about how this is done internally in the model. You could have a physical black box predicting the next token, and inside you could have simple frequentist tables, or you could have a human brain, or you could have an LLM. In all cases you could say the box is predicting the next token, and if any training was involved, you could say it was trained to predict the next token.
My best friend, who has literally written a doctorate on artificial intelligence, doesn't. If you do, please write a paper on it and email it to me. My friend would be thrilled to read it.
I don't know much about this space other than being a user of Claude and having an Electrical Engineering background...
However, reading some Stanford study summaries (not the whole papers), and just generally following where AI research is now, it's clear that researchers can't deterministically say exactly how the black box works.
So yet again, HN armchair scientists are no better here than on any other topic. I love reading comments here, but so many people have opinions on things that aren't well founded.
>In a massive dataset of human writing, the answer to a question is by far the most common thing to follow a question. A normal conversational reply is the most common thing to follow a conversation opener. While impressive, these things aren't magic.
Obviously, that's the objective, but who's to say you'll reach a goal just because you set it? And more importantly, who's to say you have any idea how the goal has actually been achieved?
You don't need to think LLMs are magic to understand we have very little idea of what is going on inside the box.
We know exactly what is going on inside the box. The problem isn't knowing what is going on inside the box; the problem is that it's all binary arithmetic, & no human being evolved to make sense of binary arithmetic, so it seems like magic to you when in reality it's nothing more than a circuit w/ billions of logic gates.
We do not know or understand even a tiny fraction of the algorithms and processes a Large Language Model employs to answer any given question. We simply don't. Ironically, only the people who understand things the least think we do.
Your comment about 'binary arithmetic' and 'billions of logic gates' is just nonsense.
I think the fallacy at hand is more along the lines of "no true Scotsman".
You can define understanding to require such detail that nobody can claim it; you can define understanding to be so trivial that everyone can claim it.
"Why does the sun rise?" Is it enough to understand that the Earth revolves around the sun, or do you need to understand quantum gravity?
Good point. OP was saying "no one knows" when in fact plenty of people do know, but people also often conflate knowing & understanding w/o realizing that's what they're doing. People who have studied programming, electrical engineering, ultraviolet lithography, quantum mechanics, & so on know what is going on inside the computer. But that's different from saying they understand billions of transistors, b/c no one really understands billions of transistors, even though a single transistor is understood well enough to be manufactured in large enough quantities that almost anyone who wants one can have the equivalent of a supercomputer in their pocket for less than $1k: https://www.youtube.com/watch?v=MiUHjLxm3V0.
Somewhere along the way from one transistor to a few billion, human understanding stops, but we still know how it was all assembled to perform Boolean arithmetic operations.
With LLMs, the "knowing" you're describing is trivial and doesn't really constitute knowing at all. It's just the physics of the substrate. When people say LLMs are a black box, they aren't talking about the hardware or the fact that it's "math all the way down." They are talking about interpretability.
If I hand you a 175-billion parameter tensor, your 'knowledge' of logic gates doesn't help you explain why a specific circuit within that model represents "the concept of justice" or how it decided to pivot a sentence in a specific direction.
On the other hand, the very professions you cited rely on interpretability. A civil engineer doesn't look at a bridge, dismiss it as "a collection of atoms", and go no further. They can point to a specific truss and explain exactly how it manages tension and compression, and tell you why it could collapse in certain conditions. A software engineer can step through a debugger and tell you why a specific if statement triggered.
We don't even have that much for LLMs, so why would you say we have an idea of what's going on?
It sounds like you're looking for something more than the simple reality that the math is what's going on. It's a complex system that can't simply be debugged through[1], but that doesn't mean it isn't "understood".
This reminds me of Searle's insipid Chinese Room; the rebuttal (which he never had an answer for) is that "the room understands Chinese". It's just not satisfying to someone steeped in cultural traditions that see people as "souls". But the room understands Chinese; the LLM understands language. It is what it is.
[1] Since it's deterministic, it certainly can be debugged through, but you probably don't have the patience to step through trillions of operations. That's not the technology's fault.
>It sounds like you're looking for something more than the simple reality that the math is what's going on.
Train a tiny transformer on addition pairs (e.g. '38393 + 79628 = 118021') and it will learn an algorithm for addition to minimize next-token error. This is not immediately obvious. You won't be able to just look at the matrix multiplications and see what addition implementation it subscribes to, but we know this from tedious interpretability research on the features of the model. See, this addition transformer is an example of a model we do understand.
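For a sense of what that training setup looks like, here's a minimal sketch of the data side (the character-level treatment and the model itself are my assumptions; only the string format comes from the example above):

    # Synthetic addition strings for next-token prediction. Only the string
    # format is given above; splitting it character-by-character is an assumption.
    import random

    def make_example():
        a, b = random.randint(0, 99999), random.randint(0, 99999)
        return f"{a} + {b} = {a + b}"

    # Next-token training pairs: given the characters so far, predict the next
    # one - including the digits of the sum after "=". A tiny transformer trained
    # on millions of these ends up implementing some addition algorithm.
    text = make_example()                      # e.g. "38393 + 79628 = 118021"
    pairs = [(text[:i], text[i]) for i in range(1, len(text))]
    for context, target in pairs[-6:]:
        print(repr(context), "->", repr(target))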
So those inscrutable matrix multiplications do have underlying meaning, and multiple interpretability papers have shown as much, even if we don't understand it 99% of the time.
I'm very fine with simply saying 'LLMs understand Language' and calling it a day. I don't care for Searle's Chinese Room either. What I'm not going to tell you is that we understand how LLMs understand language.
Your ultra-reductionism does not constitute understanding. "Math happens and that somehow leads to a conversational AI" is true, but it is not useful. You cannot use it to answer questions like "how should I prompt the model to achieve <x>". There are many layers of abstraction within the network - important, predictive abstractions - which you have no concept of. It is as useful as asking a particle physicist why your girlfriend left you, because she is made of atoms.
Incidentally, your description of LLMs also describes all software, ever. It's just math, man! That doesn't make you an expert kernel hacker.
It sounds like you're looking for the field of psychology. And like the field of psychology, any predictive abstraction around systems this complicated will be tenuous, statistical, and full of bad science.
You may never get a scientific answer to "how should I prompt the model to achieve <x>", just like you may never get a capital-S scientific answer to "how should I convince people to do X". What would it even mean to "understand people" like this?
No one relies on "interpretability" in quantum mechanics. It is famously uninterpretable. In any case, I don't think any further engagement is going to be productive for anyone here so I'm dropping out of this thread. Good luck.
Quantum mechanics has competing interpretations (Copenhagen, Many-Worlds, etc.) about what the math means philosophically, but we still have precise mathematical models that let us predict outcomes and engineer devices.
Again, we lack even this much with LLMs, so why say we know how they work?
Unless I'm missing what you mean by a mile, this isn't true at all. We have infinitely precise models for the outcomes of LLMs because they're digital. We are also able to engineer them pretty effectively.
The ML Research world (so this isn't simply a matter of being ignorant/uninformed) was surprised by the performance of GPT-2 and utterly shocked by GPT-3. Why? Isn't that strange? Did the transformer architecture fundamentally change between these releases? No, it did not at all.
So why? Because even in 2026, never mind 2018 and 2019, the only way to really know exactly how a neural network will perform when trained with x data at y scale is to train it and see. No elaborate "laws", no neat equations. Modern Artificial Intelligence is an extremely empirical, trial-and-error field, with researchers often giving post-hoc rationalizations for architectural decisions. So no, we do not have any precise models that tell us how an LLM will respond to any query. If we did, we wouldn't need to spend months and millions of dollars training them.
We don't have a model for how an LLM that doesn't exist will respond to a specific query. That's different from lacking insight at all. For an LLM that exists it's still hard to interpret but it's very clear what is actually happening. That's better than you often get with quantum physics when there's a bunch of particles and you can't even get a good answer for the math.
And even for potential LLMs, there are some pretty good extrapolations for overall answer quality based on the amount of data and the amount of training.
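For instance, the Chinchilla-style fit from Hoffmann et al. (2022) estimates training loss from parameter count N and token count D; the constants below are that paper's approximate fitted values, quoted from memory, so treat this as an illustration rather than a precise prediction:

    # Chinchilla-style scaling fit: L(N, D) = E + A / N**alpha + B / D**beta.
    # Constants are approximate fitted values from Hoffmann et al. (2022); this
    # predicts average training loss, not any specific answer to any query.
    def predicted_loss(n_params: float, n_tokens: float) -> float:
        E, A, B = 1.69, 406.4, 410.7
        alpha, beta = 0.34, 0.28
        return E + A / n_params**alpha + B / n_tokens**beta

    print(predicted_loss(70e9, 1.4e12))   # roughly Chinchilla-scale
    print(predicted_loss(1e9, 1.4e12))    # smaller model, same data budget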
>We don't have a model for how an LLM that doesn't exist will respond to a specific query.
We don't have a model for how an LLM that does exist will respond to a specific query either.
>For an LLM that exists it's still hard to interpret but it's very clear what is actually happening.
No, it's not and I'm getting tired of explaining this. If you think it is, write your paper and get very rich.
>That's better than you often get with quantum physics when there's a bunch of particles and you can't even get a good answer for the math.
You clearly don't understand any of this.
>And even for potential LLMs, there are some pretty good extrapolations for overall answer quality based on the amount of data and the amount of training.
> We don't have a model for how an LLM that does exist will respond to a specific query either.
Yes we do... It's math, you can calculate it.
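To be concrete about "calculate": with greedy decoding, the next token is just a deterministic function of the weights and the input. A sketch, assuming GPT-2 via Hugging Face transformers as a stand-in and an arbitrary prompt:

    # The next token is a fixed computation over the weights and the input,
    # whether or not anyone can interpret the intermediate activations.
    import torch
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    tok = GPT2TokenizerFast.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    ids = tok("The capital of France is", return_tensors="pt")
    with torch.no_grad():
        logits = model(**ids).logits           # [1, seq_len, vocab_size]

    next_id = int(logits[0, -1].argmax())      # greedy next token
    print(tok.decode(next_id))                 # fully determined by weights + input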
> No, it's not and I'm getting tired of explaining this. If you think it is, write your paper and get very rich.
Why would I get rich for explaining how to do math?
> You clearly don't understand any of this.
Could you be more specific?
Quantum physics is stupidly hard to calculate when you approach realistic situations.
A real LLM takes a GPU a fraction of a second.
They're both hard to interpret, please realize I'm agreeing that LLMs are hard to interpret. But they're easier than QM on some other fronts.
And mentioning Copenhagen or many-worlds doesn't show that quantum mechanics is easy to interpret; that's about as useful as saying an LLM works like neuron activation.