Well, hang on a second - it sounds like you may actually disagree with the user who created this thread. That user claims that these systems exhibit “real intelligence”, and success on this Erdős problem is proof.
You seem to be making the claim that LLMs are statistical text generators, but statistical text generation is good enough to succeed in certain cases. Those are different arguments. What do you actually believe? Are we even in disagreement?
I don't have any opinion about "real intelligence" or not. I'm not a P(doom)er; I don't think we're on the brink of ascending as a species. But I'm also allergic to arguments like "they're just statistical text generators", because that truly does not capture what these things do or what their capabilities are.
(The clearer way for me to have said this is that I don't care whether they're According-to-Hoyle "intelligent", and that controversy isn't what motivated me to comment).
"But I'm also allergic to arguments like "they're just statistical text generators", because that truly does not capture what these things do or what their capabilities are."
Umm, why doesn't it capture it? Why can't a statistical text generator do amazing things without _actually_ being intelligent (I'm thinking agency here)? I think it's important to remind ourselves that these things do not reflect on or understand what they're outputting. That is 100% evident from the continuing issues of them producing nonsense alongside their apparently insightful output. The article itself said the output was poor, but the student noticed something in it that sparked an idea, and he followed that lead.
I reject the premise. I read the outputs I generate carefully (too carefully, probably). They don't "continue to output nonsense". Their success rate exceeds that of humans in some places.
To clarify: the problem I have with "statistical text generator" isn't the word "statistical". It's "text generator". It's been two years now since that stopped being a reasonable way to completely encapsulate what these systems do. The models themselves are now run iteratively, with an initial human-defined prompt cascading into a series of LLM-generated interim prompts and tool calls. That process is not purely, or even primarily, one of "text generation"; it's bidirectional, and involves deep implicit searches.
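The iterative, tool-calling loop described here can be sketched roughly as follows. This is a hypothetical illustration, not any real framework's API; the model, tool names, and message format are all made up for the sake of the example:

```python
# Minimal sketch of an agentic loop: each "text generation" step is
# routed back into the next model call rather than handed to a human.
def run_agent(model, tools, user_prompt, max_steps=10):
    """Run the model iteratively until it emits a final answer."""
    transcript = [("user", user_prompt)]
    for _ in range(max_steps):
        action = model(transcript)  # one next-token-prediction pass
        if action["type"] == "final":
            return action["text"]
        # The model asked for a tool; execute it and feed the result
        # back in as context for the next step.
        result = tools[action["tool"]](action["args"])
        transcript.append(("tool", result))
    raise RuntimeError("no final answer within step budget")
```

The point of the sketch is structural: the human writes one prompt, but most of the context the model conditions on afterward is generated by the loop itself (tool outputs, interim results), which is why "text generator" undersells the overall process.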
Do you think it's akin to Ilya's [1] claim that next token prediction is reality? E.g. any deeper claims about the structure of that intelligence or comparing to humans?
To be clear, I'm 100% with you that "next token predictor" is stupid to call what these machines are now. We are engineers and can shape the capability landscape to give rise to a ton of emergent behavior. It's kind of amazing. In that sense, being precise about what's going on, rather than being essentialist (technically, yes, the 'actual' algorithm, whatever that even means, is text prediction), is just good epistemology.
I still think it's a very interesting question, though, to ask about deeper emergent structures. To me, this is evidence of a more embedded-cognition kind of theory of intelligence (admittedly this is not very precise). But IDK how into philosophy you are.
I try really hard not to think about this stuff because I've seen how people talk when they get too deep into it. My mental model, or mental superstructure, if you will, for all of this stuff is that we've discovered a fundamentally novel and effective way of doing computing. Computer science is fascinating and I'm there for it, and prickly when people are dismissive of it. I'm generally not interested in the theory of human intelligence (it's a super interesting problem I just happen not to engage with much), which spares me from a lot of crazy Internet stuff.
Just to clarify because I’m not sure I understand:
So you agree that LLMs are in fact statistical text generators, but you don’t like people using that fact in arguments about the capabilities of the things?
Not parent but I think you're being rather dense. They are _obviously_ statistical text generators. There's plenty of source code out there, anyone can go and inspect it and see for themselves so disputing that is akin to disputing the details of basic arithmetic.
But it is no longer useful to bring that fact up when conversing about their capabilities. Saying "well it's a statistical text generator so ..." is approximately as useful as saying "well it's made of atoms so ...". There are probably some very niche circumstances under which statements of each of those forms are useful, but by and large they are not, and you can safely ignore anyone who utters them.
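For what it's worth, the literal claim both sides accept here — that the core loop is statistical next-token sampling — looks like this in toy form. Everything below is illustrative (a stub distribution standing in for the neural network), not any real library's interface:

```python
import random

def generate(next_token_probs, prompt, max_tokens=20, stop="<eos>", seed=0):
    """Toy statistical text generator: repeatedly sample the next
    token from a conditional distribution P(next | context).
    Real LLMs run this same loop shape, with a neural network
    supplying next_token_probs."""
    rng = random.Random(seed)
    tokens = list(prompt)
    for _ in range(max_tokens):
        probs = next_token_probs(tokens)  # dict: token -> probability
        choices, weights = zip(*probs.items())
        tok = rng.choices(choices, weights=weights)[0]
        if tok == stop:
            break
        tokens.append(tok)
    return tokens
```

This is the "it's made of atoms" level of description: accurate, inspectable, and almost entirely uninformative about what a large trained model can or cannot do.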
It is still important to mention that because atoms have limitations and so do statistical generators. Plain and simple. People are walking around thinking organic brains are just statistical generators and they're gonna build AGI with GPUs. It's absurd.
And your evidence for these claimed limitations is ... ? I'm not aware of evidence either for or against organic brains being "just" statistical generators. Neither am I aware of evidence either for or against AGI being possible to achieve using GPUs. AFAICT you're just making things up.