How are people feeling about GPT-4o versus Claude 3.5 Sonnet? I recently watched this Primeagen video [0] about how, because LLMs don't actually understand anything (yes, AI Effect included [1]), one does not actually gain as much from them as one would expect, especially given subtly wrong outputs. Over time, relying on them just wastes more time and becomes a form of learned helplessness (and yes, I do know about Socrates' dialogue; I originally saw it elsewhere on HN and have been quoting it for some time [2]).
If you follow [2], yes, even learning to read and write is a form of learned helplessness, as Thoth and Socrates conclude. Now, you might not think that affects our day-to-day world, but it does: imagine what sorts of hypotheses we might come up with if we could neither read nor write, then extend that to what we say about LLMs. To those who say those are unequal circumstances, I invite you to prove why. Thoth and Socrates know a hell of a lot more than you ever did, to be frank.
You're committing a bit of a logical fallacy here: you frame "learned helplessness" as a bad thing intrinsic to the use of LLMs, but then backpedal to suggest that writing itself is also learned helplessness. The latter I'm fine to agree with, but it changes the terms under which you made your comment in the first place to pooh-pooh them. If your point is that technological advances cede some part of humanity to technology, sure, but by the nature of that point this isn't relevant commentary on LLMs beyond the fact that they too participate in that lineage, which, okay, but that isn't really saying anything besides "LLMs are technology".
I recently wrote this blog post showing how y=mx+b is very error-prone in GPT-4o, and pretty accurate (to a point) in Claude 3.5.
I haven't gone down the rabbit hole yet, but I was wondering if fine-tuning could fix math errors in LLMs. My initial hunch and understanding is that it will not. I'll have to give your links a read/watch.
I'll also explain my downvote: it's because of the assertion that LLMs "don't actually understand anything", which to anyone who has successfully used LLMs to solve a difficult problem is clearly false, unless you use some contrived definition of the word "understand" that doesn't match how the word is actually used in normal conversation.
I am referring specifically to what was claimed in the video I cited, so, unless you watched the actual video, what you are saying doesn't have much bearing on what I am actually saying. Sorry to say it, and not to be harsh, but I constructed my comment to point specifically to that video via the bracketed references; please respond to what was said in it.
Asking people to watch a 30m video in order to understand your comment isn't reasonable - can you summarize the point from the video that you're arguing here so people can respond to it without putting in all of that extra work?
Well, I did: LLMs are not necessarily intelligent enough to avoid causing new problems with the solutions they produce. This is a fundamental flaw of LLMs that is covered even by mainstream media, to say nothing of the AI Effect as described on Wikipedia. At worst, they might make a 0.1x engineer ten times more productive, i.e. a 1x one, except with no ability to actually solve problems cohesively.
I'm an experienced engineer and I've seen what I estimate to be a 2-5x productivity improvement in the time I spend typing code into my computer from embracing LLM-assisted development.
Typing-in-code is only 10% of the work that I do, but this is still a very meaningful improvement for me.
The bigger ones have gained a rough understanding of a few systems [1], which is really impressive and gives an answer to the Chinese Room thought experiment. In my experience they don’t understand a lot of the things I ask about very well. But the fact that they understand anything at all is impressive.
If five years ago someone said that in half a decade we'd have a computer program that could solve medium-complexity Leetcode problems that it had never seen before, hardly anyone would believe them. Now we have programs that can do exactly this, and yet some people never miss a chance to try to trivialise what just a few years ago would have been considered an amazing, world-changing achievement.
Can it though? My understanding is that ChatGPT has all the Leetcode problems memorized; maybe it can extrapolate to problems substantially similar to ones in its training set.
I tried it for Advent of Code 2023, and it was pretty helpless.
I feel like this comment was made in good faith so I wanted to explain my downvote: I think it's just too far off-topic for this particular announcement. It's better as a separate discussion.
[0] https://www.youtube.com/watch?v=1-hk3JaGlSU
[1] https://en.wikipedia.org/wiki/AI_effect
[2] https://news.ycombinator.com/item?id=40920318