The article says that LLMs don't summarize, only shorten, because...
"A true summary, the kind a human makes, requires outside context and reference points. Shortening just reworks the information already in the text."
Then later says...
"LLMs operate in a similar way, trading what we would call intelligence for a vast memory of nearly everything humans have ever written. It’s nearly impossible to grasp how much context this gives them to play with"
So, they can't summarize, because they lack context... but they also have an almost ungraspably large amount of context?
btown 2 hours ago [-]
It's an interesting philosophical question.
Imagine an oracle that could judge/decide, with human levels of intelligence, how relevant a given memory or piece of information is to any given situation, and that could verbosely describe which way it's relevant (spatially, conditionally, etc.).
Would such an oracle, sufficiently parallelized, be sufficient for AGI? If it could, then we could genuinely describe its output as "context," and phrase our problem as "there is still a gap in needed context, despite how much context there already is."
And an LLM that simply "shortens" that context could reach a level of AGI, because the context preparation is doing the heavy lifting.
The point I think the article is trying to make is that LLMs cannot add any information beyond the context they are given - they can only "shorten" that context.
If the lived experience necessary for human-level judgment could be encoded into that context, though... that would be an entirely different ball game.
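For concreteness, a minimal sketch of the shape being described, with the relevance "oracle" stubbed out as trivial keyword overlap (that stub is exactly the part that would need human-level judgment to be real; everything here is hypothetical):

    # Hypothetical sketch of the "parallelized relevance oracle" idea above.
    # The oracle is a trivial keyword-overlap stand-in, not a real judge of
    # relevance; the point is only the shape: score memories in parallel,
    # keep the top ones, and hand them to the model as prepared context.
    from concurrent.futures import ThreadPoolExecutor

    MEMORIES = [
        "The user's store closed last month and moved across town.",
        "The user prefers to shop on Saturday mornings.",
        "An unrelated fact about the 1994 World Cup.",
    ]

    def relevance(memory: str, situation: str) -> float:
        # Stand-in oracle: fraction of situation words that appear in the memory.
        words = set(situation.lower().split())
        return len(words & set(memory.lower().split())) / max(len(words), 1)

    def prepare_context(situation: str, k: int = 2) -> list[str]:
        with ThreadPoolExecutor() as pool:
            scores = list(pool.map(lambda m: relevance(m, situation), MEMORIES))
        ranked = sorted(zip(scores, MEMORIES), reverse=True)
        return [m for _, m in ranked[:k]]

    print(prepare_context("I went to the store but the store is gone"))

Whether anything short of lived experience can fill in relevance() well enough is the open question.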
entropicdrifter 45 minutes ago [-]
I agree with the thrust of your argument.
IMO we already have the technology for sufficient parallelization of smaller models with specific bits of context. The real issue is that models have weak/inconsistent/myopic judgement abilities, even with reasoning loops.
For instance, if I ask Cursor to fix the code for a broken test and the fix is non-trivial, it will often diagnose the problem incorrectly almost instantly, hyper-focus on what it imagines the problem is without further confirmation, implement a "fix", get a different error message while breaking more tests than it "fixed" (if it changed the result for any tests), and then declare the problem solved simply because it moved the goalposts at the start by misdiagnosing the issue.
jchw 2 hours ago [-]
I think the real takeaway is that LLMs are very good at tasks that closely resemble examples it has in its training. A lot of things written (code, movies/TV shows, etc.) are actually pretty repetitive and so you don't really need super intelligence to be able to summarize it and break it down, just good pattern matching. But, this can fall apart pretty wildly when you have something genuinely novel...
strangattractor 3 minutes ago [-]
Is anyone here aware of LLMs demonstrating an original thought? Something truly novel.
My own impression is something more akin to a natural language search query system. If I want a snippet of code to do X it does that pretty well and keeps me from having to search through poor documentation of many OSS projects. Certainly doesn't produce anything I could not do myself - so far.
Ask it about something that is currently unknown and it lists a bunch of hypotheses that people have already proposed.
Ask it to write a story and you get a story similar to one you already know but with your details inserted.
I can see how this may appear to be intelligent but likely isn't.
gus_massa 2 hours ago [-]
Humans too. If I were too creative writing the midterm, most of my students would fail and everyone would be very unhappy.
BobaFloutist 57 minutes ago [-]
That's because midterms are specifically supposed to assess how well you learned the material presented (or at least directed to), not your overall ability to reason. If you teach a general reasoning class, getting creative with the midterm is one thing, but if you're teaching someone how to solve differential equations, they're learning to the very edge of their ability in a given amount of time, and if you present them with an example outside of what's been described, it kind of makes sense that they can't just already solve it. I mean, that's kind of the whole premise of education, that you can't just present someone with something completely outside of their experience and expect them to derive from first principles how it works.
card_zero 1 hours ago [-]
That's exams, not humanity.
jchw 47 minutes ago [-]
I honestly think that reflects more on the state of education than it does human intelligence.
My primary assertion is that LLMs struggle to generalize concepts and ideas, hence why they need petabytes of text just to often fail basic riddles when you muck with the parameters a little bit. People get stuck on this for two reasons: one, because they have to reconcile this with what they can see LLMs are capable of, and it's just difficult to believe that all of this can be accomplished without at least intelligence as we know it; I reckon the trick here is that we simply can't even conceive of how utterly massive the training datasets for these models are. We can look at the numbers but there's no way to fully grasp just how vast it truly is.
The second thing is definitely the tendency to anthropomorphize. At first I definitely felt like OpenAI was just using this as an excuse to hype their models and come up with reasons for why they can never release weights anymore; convenient. But also, you can see even engineers who genuinely understand how LLMs work coming to the conclusion that they've become sentient, even though the models they felt were sentient now feel downright stupid compared to the current state-of-the-art.
Even less sophisticated pattern matching than what humans are able to do is still very powerful, but it's obvious to me that humans are able to generalize better.
usefulcat 46 minutes ago [-]
I think "context" is being used in different ways here.
> "It’s nearly impossible to grasp how much context this gives them to play with"
Here, I think the author means something more like "all the material used to train the LLM".
> "A true summary, the kind a human makes, requires outside context and reference points."
In this case I think that "context" means something more like actual comprehension.
The author's point is that an LLM could only write something like the referenced summary by shortening other summaries present in its training set.
tovej 2 hours ago [-]
You can reconcile these points by considering what specific context is necessary. The author specifies "outside" context, and I would agree. The human context that's necessary for useful summaries is a model of semantic or "actual" relationships between concepts, while the LLM context is a model of a single kind of fuzzy relationship between concepts.
In other words the LLM does not contain the knowledge of what the words represent.
ratelimitsteve 2 hours ago [-]
I think the differentiator here might not be the context it has, but the context it has the ability to use effectively in order to derive more information about a given request.
kayodelycaon 2 hours ago [-]
They can’t summarize something that hasn’t been summarized before.
timmg 2 hours ago [-]
About a year ago, I gave a film script to an LLM and asked for a summary. It was written by a friend and there was no chance it or its summary was in the training data.
It did a really good -- surprisingly good -- job. That incident has been a reference point for me. Even if it is anecdotal.
pc86 2 hours ago [-]
I'm not as cynical as others about LLMs but it's extremely unlikely that script had multiple truly novel things in it. Broken down into sufficient small pieces it's very likely every story element was present multiple times in the LLM's training data.
I'm not sure I understand the philosophical point being made here. The LLM has "watched" a lot of movies and so understands the important parts of the original script it's presented with. Are we not describing how human media literacy works?
BobaFloutist 53 minutes ago [-]
The point is that if you made a point to write a completely novel script, with (content-wise, not semantically) 0 DNA in it from previous movie scripts, with an unambiguous but incoherent and unstructured plot, your average literate human would be able to summarize what happened on the page, for all that they'd be annoyed and likely distressed by how unusual it was; but that an LLM would do a disproportionately bad job compared to how well they do at other things, which makes us reevaluate what they're actually doing and how they actually do it.
It feels like they've mastered language, but it's looking more and more like they've actually mastered canon. Which is still impressive, but very different.
pc86 1 hours ago [-]
I'm not making a philosophical point. The earlier comment is "I uploaded a new script and it summarized it"; I was simply saying the odds of that script actually being new is very slim. Even though obviously that script or summaries of it do not exist in their entirety in the training data, its individual elements almost certainly do. So it's not really a novel (pun unintended?) summarization.
naikrovek 2 hours ago [-]
they can, they just can't do it well. at no point does any LLM understand what it's doing.
kblissett 2 hours ago [-]
If you think they can't do this task well I encourage you to try feeding ChatGPT some long documents outside of its training cutoff and examining the results. I expect you'll be surprised!
kayodelycaon 2 hours ago [-]
It can produce something that looks like a summarization based on similarly matching texts.
How unique the text is determines how accurate the summarization is likely to be.
Joeri 11 minutes ago [-]
LLMs mimic intelligence, but they aren’t intelligent.
They aren’t just intelligence mimics, they are people mimics, and they’re getting better at it with every generation.
Whether they are intelligent or not, whether they are people or not, it ultimately does not matter when it comes to what they can actually do, what they can actually automate. If they mimic a particular scenario or human task well enough that the job gets done, they can replace intelligence even if they are “not intelligent”.
If by now someone still isn’t convinced that LLMs can indeed automate some of those intelligence tasks, then I would argue they are not open to being convinced.
shafoshaf 5 minutes ago [-]
They can mimic well-documented behavior. Applying an LLM to a novel task is where the model breaks down. This obviously has huge implications for automation. For example, most businesses do not have unique ways of handling accounting transactions, yet each company has a litany of AR and AP specialists who create seemingly unique SOPs. LLMs can easily automate those workers since they are simply doing a slight variation at best of a very well-documented system.
Asking an LLM to take all this knowledge and apply it to a new domain? That will take a whole new paradigm.
hackyhacky 2 hours ago [-]
> LLMs mimic intelligence, but they aren’t intelligent.
I see statements like this a lot, and I find them unpersuasive because any meaningful definition of "intelligence" is not offered. What, exactly, is the property that humans (allegedly) have and LLMs (allegedly) lack, that allows one to be deemed "intelligent" and the other not?
I see two possibilities:
1. We define "intelligence" as definitionally unique to humans. For example, maybe intelligence depends on the existence of a human soul, or is specific to the physical structure of the human brain. In this case, a machine (perhaps an LLM) could achieve "quacks like a duck" behavioral equality to a human mind, and yet would still be excluded from the definition of "intelligent." This definition is therefore not useful if we're interested in the ability of the machine, which it seems to me we are. LLMs are often dismissed as not "intelligent" because they work by inferring output based on learned input, but that alone cannot be a distinguishing characteristic, because that's how humans work as well.
2. We define "intelligence" in a results-oriented way. This means there must be some specific test or behavioral standard that a machine must meet in order to become intelligent. This has been the default definition for a long time, but the goal posts have shifted. Nevertheless, if you're going to disparage LLMs by calling them unintelligent, you should be able to cite a specific results-oriented failure that distinguishes them from "intelligent" humans. Note that this argument cannot refer to the LLMs' implementation or learning model.
libraryofbabel 16 minutes ago [-]
Agree. This article would have been a lot stronger if it had just concentrated on the issue of anthropomorphizing LLMs, without bringing “intelligence” into it. At this point LLMs are so good at a variety of results-oriented tasks (gold on the Mathematical Olympiad, for example) that we should either just call them intelligent or stop talking about the concept altogether.
But the problem of anthropomorphizing is real. LLMs are deeply weird machines - they’ve been fine-tuned to sound friendly and human, but behind that is something deeply alien: a huge pile of linear algebra that does not work at all like a human mind (notably, they can’t really learn from experience at all after training is complete). They don’t have bodies or even a single physical place where their mind lives (each message in a conversation might be generated on a different GPU in a different datacenter). They can fail in weird and novel ways. It’s clear that anthropomorphism here is a bad idea. Although that’s not a particularly novel point.
dkdcio 1 hours ago [-]
> I see statements like this a lot, and I find them unpersuasive because any meaningful definition of "intelligence" is not offered. What, exactly, is the property that humans (allegedly) have and LLMs (allegedly) lack, that allows one to be deemed "intelligent" and the other not?
the ability for long-term planning and, more cogently, actually living in the real world where time passes
libraryofbabel 10 minutes ago [-]
> actually living in the real world where time passes
sure, but it feels like this is just looking at what distinguishes humans from LLMs and calling that “intelligence.” I highlight this difference too when I talk about LLMs, but I don’t feel the need to follow up with “and that’s why they’re not really intelligent.”
hackyhacky 1 hours ago [-]
> the ability for long-term planning and, more cogently, actually living in the real world where time passes
1. LLMs seem to be able to plan just fine.
2. LLMs clearly cannot be "actually living" but I fail to see how that's related to intelligence per se.
dkdcio 24 minutes ago [-]
if it’s not actually living it’s not making intelligent decisions. if I make a grocery list, and go to my store, and the store isn’t there, what do I do? I make an intelligent decision about what to do next (probably investigating wtf happened, then going to the second nearest store)
my genuine question is how does a LLM handle that situation? and as you point out, it’s an absurd comparison
aDyslecticCrow 36 minutes ago [-]
Is making a list the act of planning?
card_zero 59 minutes ago [-]
It may be the case that the failures of the ability of the machine (2) are best expressed by reference to the shortcomings of its internal workings (1), and not by contrived tests.
hackyhacky 53 minutes ago [-]
It might be the case, but if those shortcomings are not visible in the results of the machine (and therefore not interpretable by a test), why do its internal workings even matter?
card_zero 44 minutes ago [-]
I'm saying best expressed. Like, you see the failures in the results, but trying to pin down exactly what's the matter with the results means you resort to a lot of handwaving and abstract complaints about generalities. So if you knew how the internals had to be that would make the difference, you could lean on that.
0x457 6 minutes ago [-]
> A philosophical exploration of free will and reality disguised as a sci-fi action film about breaking free from systems of control.
How is that a summary? It reads as a one-liner review I would leave on Letterboxd or something I would say, trying to be pretentious and treating the movie as a work of art. It is a work of art, because all movies are art, but that's an awful summary.
ticulatedspline 25 minutes ago [-]
- LLMs don't need to be intelligent to take jobs; bash scripts have replaced people.
- Even if CEOs are completely out of touch and the tool can't do the job, you can still get laid off in an ill-informed attempt to replace you. Then when the company doesn't fall over because the leftover people, desperate to keep covering rent, fill the gaps, it just looks like efficiency to the top.
- I don't think our tendency to anthropomorphize LLMs is really the problem here.
nojs 1 hours ago [-]
Even stronger than our need to anthropomorphize seems to be our innate desire to believe our species is special, and that “real intelligence” couldn’t ever be replicated.
If you keep redefining real intelligence as the set of things machines can’t do, then it’s always going to be true.
safetytrick 45 minutes ago [-]
Yes, I agree, we seem to need to feel "special".
Language is really powerful, I think it's a huge part of our intelligence.
The interesting part of the article to me is the focus on fluency. I have not seen anything that LLMs do well that isn't related to powerful utilization of fluency.
ArnavAgrawal03 1 hours ago [-]
> They had known him for only 15 seconds, yet they still perceived the act of snapping him in half as violent.
This is right out of Community
vcarrico 1 hours ago [-]
I might be mixing the concepts of intelligence and consciousness, etc., but the human mind is more than language and data; it's also experience. LLMs have all the data and can express anything around that context, but will never experience anything, which is singular for each of us, and it's part of what makes what we call intelligence (?). So they will never replicate the human mind; they can just mimic it.
I heard from Miguel Nicolelis that language is a filter for the human mind, so you can never build a mind from language. I interpreted this like trying to build an orange from its juice.
hackyhacky 52 minutes ago [-]
> LLMs have all the data and can express anything around that context,
On the contrary, all their training data is their "experience".
andoando 12 minutes ago [-]
This, along with a ton of commentary on LLMs, seems like it's written by someone who has no technical understanding of LLMs.
umanwizard 2 hours ago [-]
The article claims (without any evidence, argument or reason) that LLMs are not intelligent, then simply refuses to define intelligence.
How do you know LLMs aren't intelligent, if you can't define what that means?
umanwizard 18 minutes ago [-]
Despite its title, that section does not contain a definition of intelligence.
energy123 2 hours ago [-]
It's strange seeing so many takes like this two weeks after LLMs won gold medals at IMO and IOI. The cognitive dissonance is going to be wild when it all comes to a head in two years.
aprilthird2021 2 hours ago [-]
IBM Watson won Jeopardy years ago, was it intelligent?
perching_aix 2 hours ago [-]
> Rather than being given questions, contestants are instead given general knowledge clues in the form of answers and they must identify the person, place, thing, or idea that the clue describes, phrasing each response in the form of a question. [0]
Doesn't sound like a test of intelligence to me, so no.
[0] https://en.wikipedia.org/wiki/Jeopardy!
Why? Computers also won chess years ago, but they're not intelligent either? Why is winning a math competition intelligent but a trivia competition or a chess competition not intelligent?
umanwizard 1 hours ago [-]
Math and chess are similar in the sense that for humans, both require creativity, logical problem solving, etc.
But they are not at all similar for computers. Chess has a constrained small set of rules and it is pretty straightforward to make a machine that beats humans by brute force computation. Pre-Leela chess programs were just tree search, a hardcoded evaluation function, and lots of pruning heuristics. So those programs are really approaching the game in a fundamentally different way from strong humans, who rely much more on intuition and pattern-recognition rather than calculation. It just turns out the computer approach is actually better than the human one. Sort of like how a car can move faster than a human even though cars don’t do anything much like walking.
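To make the "tree search + evaluation function + pruning" recipe concrete, here is a minimal alpha-beta sketch over a toy hand-built tree (not a chess engine; a real pre-Leela program adds a move generator, a hand-tuned evaluation, and many more heuristics, but the skeleton is this):

    # Minimal alpha-beta minimax over a hand-built toy game tree (not chess).
    def alphabeta(node, depth, alpha, beta, maximizing):
        children = node.get("children")
        if depth == 0 or not children:
            return node["value"]          # leaf: the hardcoded "evaluation"
        if maximizing:
            best = float("-inf")
            for child in children:
                best = max(best, alphabeta(child, depth - 1, alpha, beta, False))
                alpha = max(alpha, best)
                if beta <= alpha:         # prune: opponent never allows this line
                    break
            return best
        best = float("inf")
        for child in children:
            best = min(best, alphabeta(child, depth - 1, alpha, beta, True))
            beta = min(beta, best)
            if beta <= alpha:
                break
        return best

    # Tiny two-ply tree: the maximizer picks the branch whose worst case is best.
    tree = {"children": [
        {"children": [{"value": 3}, {"value": 5}]},
        {"children": [{"value": 2}, {"value": 9}]},
    ]}
    print(alphabeta(tree, 2, float("-inf"), float("inf"), True))  # -> 3

The point is how mechanical it is: no pattern recognition anywhere, just exhaustive search made affordable by pruning.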
Math is not analogous: there’s no obvious algorithm for discovering mathematical proofs or solving difficult problems that could be implemented in a classical, pre-Gen AI computer program.
aDyslecticCrow 24 minutes ago [-]
> there’s no obvious algorithm for discovering mathematical proofs or solving difficult problems that could be implemented in a classical, pre-Gen AI computer program.
Fundamentally opposite. Computer algorithms have been part of math research since they were invented, and mathematical proof algorithms are widespread and excellent.
The LLMs that are now "intelligent enough to do maths" are just trained to rephrase questions into Prolog code.
umanwizard 14 minutes ago [-]
> The llms that are now "intelligent enough to do maths" are just trained to rephrase questions into prolog code.
Do you have a source that talks about this?
perching_aix 1 hours ago [-]
I don't wish to join you in framing intelligence as a step function.
I think winning a Go or a chess competition does demonstrate intelligence. And winning a math competition does even more so.
I do not think a trivia competition like Jeopardy demonstrates intelligence much at all, however. Specifically because it reads like it's not about intelligence, but about knowledge: it tests for association and recall, not for performing complex logical transformations.
This isn't to say I consider these completely independent. Most smart people are both knowledgeable and intelligent. It's just that they are distinct dimensions in my opinion.
You wouldn't say something tastes bad because its texture feels weird in your mouth, would you?
tjr 19 minutes ago [-]
I might even think that a symbolic chess program is in some sense more intelligent than a modern LLM. It has a concrete model of the world it operates in, along with a representation of what it can, cannot, and is trying to do. When LLMs get the right answer, it seems more like... highly-optimized chance, rather than coming from any sort of factual knowledge.
aDyslecticCrow 17 minutes ago [-]
> I think winning a Go or a chess competition does demonstrate intelligence.
Chess is a simple alpha-beta pruned minimax search tree. If that's intelligent, then a drone flight controller or a calculator is as well.
> association and recall, not for performing complex logical transformations.
By that definition humans doing chess aren't as intelligent as a computer doing chess, since high level chess is heavily reliant on memory and recall of moves and progressions.
So your definition falls apart.
im3w1l 1 hours ago [-]
None of these things is enough by itself. It's rather that they have now solved so many things that the sum total has (arguably) crossed the threshold.
As for solving math problems, that is an important part of recursive self-improvement. If it can come up with better algorithms and turn them into code, that will translate into raising its own intelligence.
krapp 2 hours ago [-]
Why do critics of LLM intelligence need to provide a definition when people who believe LLMs are intelligent only take it on faith, not having such a definition of their own?
hackyhacky 1 hours ago [-]
> Why do critics of LLM intelligence need to provide a definition when people who believe LLMs are intelligent only take it on faith, not having such a definition of their own?
Because advocates of LLMs don't use their alleged intelligence as a defense; but opponents of LLMs do use their alleged non-intelligence as an attack.
Really, whether or not the machine is "intelligent", by whatever definition, shouldn't matter. What matters is whether it is a useful tool.
aDyslecticCrow 2 minutes ago [-]
The entire argument is that thinking it's intelligent or a person makes us misuse the tool in dangerous ways. Not to make us feel better, but to not do stupid things with them.
As a tool it's useful, yes; that is not the issue:
- they're used as psychologists and life coaches
- judges of policy and law documents
- writers of life-affecting computer systems
- judges of job applications
- sources of medical advice
- legal advisors
- and increasingly as a thing to blame when any of the above goes awry.
If we think of LLMs as very good text-writing tools, the responsibility to make "intelligent" decisions, and more crucially to take responsibility for those decisions, remains on real people rather than dice.
But if we think of them as intelligent humans, we're making a fatal misjudgement.
tjr 15 minutes ago [-]
This seems reasonable. Much AI research has historically been about building computer systems to do things that otherwise require human intelligence to do. The question of "is the computer actually intelligent" has been more philosophical than practical, and many such practically useful computer systems have been developed, even before LLMs.
On the other hand, one early researcher said something to the effect of, Researchers in physics look at the universe and wonder how it all works. Researchers in biology look at living organisms and wonder how they can be alive. Researchers in artificial intelligence wonder how software can be made to wonder such things.
I feel like we are still way off from having a working solution there.
hnfong 26 minutes ago [-]
It's actually very weird to "believe" LLMs are "intelligent".
Pragmatic people see news like "LLMs achieve gold in Math Olympiad" and think "oh wow, it can do maths at that level, cool!" This gets misinterpreted by so called "critics of LLM" who scream "NO THEY ARE JUST STOCHASTIC PARROTS" at every opportunity yet refuse to define what intelligence actually is.
The average person might not get into that kind of specific detail, but they know that LLMs can do some things well but there are tasks they're not good at. What matters is what they can do, not so much whether they're "intelligent" or not. (Of course, if you ask a random person they might say LLMs are pretty smart for some tasks, but that's not the same as making a philosophical claim that they're "intelligent")
Of course there's also the AGI and singularity folks. They're kinda loony too.
Isamu 1 hours ago [-]
You can compare the current state of LLMs to the days of chess machines when they first approached grandmaster-level play. The machine approach was very brute force, and there was a lot of work done to improve the sheer amount of look-ahead that was required to compete at the grandmaster level.
As opposed to what grandmasters actually did, which was less look ahead and more pattern matching to strengthen the position.
Now LLMs successfully leverage pattern matching, but interestingly it is still a kind of brute force pattern matching, requiring the statistical absorption of all available texts, far more than a human absorbs in a lifetime.
This enables the LLM to interpolate an answer from the structure of the absorbed texts with reasonable statistical relevance. This is still not quite “what humans do” as it still requires brute force statistical analysis of vast amounts of text to achieve pretty good results. For example training on all available Python sources in github and elsewhere (curated to avoid bad examples) yields pretty good results, not how a human would do it, but statistically likely to be pertinent and correct.
xg15 1 hours ago [-]
I feel this article should be paired with this other one [1] that was on the frontpage a few days ago.
My impression is, there is currently one tendency to "over-anthropomorphize" LLMs and treat them like conscious or even superhuman entities (encouraged by AI tech leaders and AGI/Singularity folks) and another to oversimplify them and view them as literal Markov chains that just got lots of training data.
Maybe those articles could help guard against both extremes.
[1] https://www.verysane.ai/p/do-we-understand-how-neural-networ...
Previously when someone called out the tendency to over-anthropomorphize LLMs, a lot of the answers amounted to, “but I like doing it, therefore we should!”
I’ll be the first to say one should pick their battles. But hearing that over and over from a crowd like this that can be quite pedantic is very telling.
kbaker 1 hours ago [-]
Seems like this is close to the Uncanny Valley effect.
LLM intelligence is in the spot where it is simultaneously genius-level but also just misses the mark a tiny bit, which really sticks out for those who have been around humans their whole lives.
I feel that, just like more modern CGI, this will slowly fade with certain techniques and you just won't notice it when talking to or interacting with AI.
Just like in his post during the whole Matrix discussion.
> "When I asked for examples, it suggested the Matrix and even gave me the “Summary” and “Shortening” text, which I then used here word for word. "
He switches in AI-written text and I bet you were reading along just the same until he pointed it out.
This is our future now I guess.
stefanv 2 hours ago [-]
What if the problem is not that we overestimate LLMs, but that we overestimate intelligence? Or to express the same idea for a more philosophically inclined audience, what if the real mistake isn’t in overestimating LLMs, but in overestimating intelligence itself by imagining it as something more than a web of patterns learned from past experiences and echoed back into the world?
pbw 2 hours ago [-]
LLMs can shorten, and maybe tend to if you just say "summarize this", but you can trivially ask them to do more. I asked for a summary of Jenson's post and then a reflection; GPT-5 said, "It's similar to the Plato's Cave analogy: humans see shadows (the input text) and infer deeper reality (context, intent), while LLMs either just recite shadows (shorten) or imagine creatures behind them that aren't there (hallucinate). The “hallucination” behavior is like adding “ghosts”—false constructs that feel real but aren’t grounded."
That ain't shortening because none of that was in his post.
pitpatagain 1 hours ago [-]
I can't decide how to read your last sentence.
That reflection seems totally off to me: fluent, and flavored with elements of the article, but also not really what the article is about and a pretty weird/tortured use of the elements of the allegory of the cave, like it doesn't seem anything like Plato's Cave to me. Ironically demonstrates the actual main gist of the article if you ask me.
But maybe you meant that you think that summary is good and not textually similar to that post so demonstrating something more sophisticated than "shortening".
pbw 56 minutes ago [-]
Yes, GPT-5's response above was not shortening because there was nothing in the OP about Plato's Cave. I agree that Plato's cave analogy was confusing here. Here's a better one from GPT-5, which is deeply ironic:
A New Yorker book review often does the opposite of mere shortening. The reviewer:
* Places the book in a broader cultural, historical, or intellectual context.
* Brings in other works—sometimes reviewing two or three books together.
* Builds a thesis that connects them, so the review becomes a commentary on a whole idea-space, not just the book’s pages.
This is exactly the kind of externalized, integrative thinking Jenson says LLMs lack. The New Yorker style uses the book as a jumping-off point for an argument; an LLM “shortening” is more like reading only the blurbs and rephrasing them. In Jenson’s framing, a human summary—like a rich, multi-book New Yorker review—operates on multiple layers: it compresses, but also expands meaning by bringing in outside information and weaving a narrative. The LLM’s output is more like a stripped-down plot synopsis—it can sound polished, but it isn’t about anything beyond what’s already in the text.
pitpatagain 16 minutes ago [-]
Ah ok, you meant the second thing.
I don't think the Plato's Cave analogy is confusing, I think it's completely wrong. It's "not in the article" in the sense that it is literally not conceptually what the article is about and it's also not really what Plato's Cave is about either, just taking superficial bits of it and slotting things into it, making it doubly wrong.
pbw 38 minutes ago [-]
Essentially, Jenson's complaint is "When I ask an LLM to 'summarize' it interprets that differently from how I think of the word 'summarize' and I shouldn't have to give it more than a one-word prompt because it should infer what I'm asking for."
foobarian 1 hours ago [-]
The LLMs are like a Huffman codec except the context is infinite and lossy
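For anyone who wants the baseline of that analogy concrete, this is the textbook Huffman construction (standard algorithm, nothing LLM-specific): lossless, deterministic, and driven entirely by a fixed symbol-frequency table, which is roughly what the comment is contrasting against.

    # Standard Huffman code construction (textbook version, for contrast).
    import heapq
    from collections import Counter

    def huffman_codes(text: str) -> dict:
        freq = Counter(text)
        # Heap entries: (frequency, tiebreaker, {symbol: code-so-far})
        heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freq.items())]
        heapq.heapify(heap)
        if len(heap) == 1:                   # degenerate single-symbol input
            return {sym: "0" for sym in heap[0][2]}
        while len(heap) > 1:
            f1, _, left = heapq.heappop(heap)
            f2, i, right = heapq.heappop(heap)
            merged = {s: "0" + c for s, c in left.items()}
            merged.update({s: "1" + c for s, c in right.items()})
            heapq.heappush(heap, (f1 + f2, i, merged))
        return heap[0][2]

    text = "summarize versus shorten"
    codes = huffman_codes(text)
    encoded = "".join(codes[ch] for ch in text)
    print(codes, len(encoded), "bits")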
AndrewKemendo 1 hours ago [-]
Who are you going to lodge your complaint to that the set of systems and machines that just took your job isn’t “intelligent?”
Humans seem to get wrapped around these concepts like intelligence, consciousness, etc. because they seem to be the only thing differentiating us from every other animal, when in fact it's all a mirage.
ChrisMarshallNY 2 hours ago [-]
That's a great article.
Scott Jenson is one of my favorite authors.
He's really big on integrating an understanding of basic human nature, into design.
codeulike 2 hours ago [-]
Well I, for one, can't believe what that guy did to poor Timmy
beezle 53 minutes ago [-]
When I saw the post title I immediately thought of Timmy from South Park lol
snozolli 2 hours ago [-]
Regarding Timmy, the Companion Cube from the game Portal is the greatest example of induced anthropomorphism that I've ever experienced. If you know, you know, and if you don't, you should really play the game, since it's brilliant.
andrewla 22 minutes ago [-]
It is a brilliant game and the empathy you develop for the cube is a great concept.
But arguably much deeper is the fact that nothing in this game, or any single-player game, is a living thing in any form. Arguably the game's characterization of GLaDOS hits even harder on the anthropomorphism angle.
generationP 2 hours ago [-]
The cube doesn't work, or at least it didn't for me. The googly eyes really do make a difference.
ChrisMarshallNY 2 hours ago [-]
I'm on a Mac, and would love to see Portal 2 (at least) ported to M-chips.
I would love Portal 3, even more.
bitwize 1 hours ago [-]
That's a matter of informed anthropomorphism. A lot of people don't become attached to the Companion Cube, but are informed that their player character is so attached.
tovej 2 hours ago [-]
Good article, it's been told before but it bears repeating.
Also I got caught on this one kind of irrelevant point regarding the characterization of the Matrix: I would say the Matrix is not just disguised as a story about escaping systems of control, it's quite clearly about oppressive systems in society, with specific reference to gender expression. Lilly Wachowski has explicitly stated that it was supposed to be an allegory for gender transition.
xg15 1 hours ago [-]
It wasn't. Switch was intended to be genderfluid, but the Matrix itself, or "logging out" of it, was apparently not meant as an allegory for transitioning (though she doesn't mind the interpretation):
https://www.them.us/story/lilly-wachowski-work-in-progress-s...
The character Switch was supposed to have a different gender in the matrix vs real life. It’s really a shame that didn’t happen.
altruios 2 hours ago [-]
There is nothing more free than the freedom to be who you really are.
Going to rewatch the Matrix tonight.
naikrovek 2 hours ago [-]
I've mentioned this to colleagues at work before.
LLMs give a very strong appearance of intelligence, because humans are super receptive to information provided via our native language. We often have to deal with imperfect speakers and writers, and we must infer context and missing information on our own. We do this so well that we don't know we're doing it. LLMs have perfect grammar and we subtly feel that they are extremely smart because subconsciously we recognize that we don't have to think about anything that's said, it is all syntactically perfect.
So, LLMs sort of trick us into masking their true limitations and believing that they are truly thinking; there are even models that call themselves thinking models, but they don't think, they just predict what the user is going to complain about and say that to themselves as an additional, dynamic prompt on top of the one you actually enter.
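A rough sketch of the mechanism as described there, purely illustrative (generate() is a stub; how any particular vendor actually implements reasoning loops is not public):

    # Illustration of the "extra dynamic prompt" idea: the model drafts,
    # critiques its own draft, and both get appended to what it sees next.
    def generate(prompt: str) -> str:
        # Stub standing in for a real model call.
        return f"<model output for: {prompt[:40]}...>"

    def answer_with_thinking(user_prompt: str, rounds: int = 2) -> str:
        scratchpad = ""
        for _ in range(rounds):
            draft = generate(user_prompt + scratchpad)
            critique = generate("What might the user object to in: " + draft)
            scratchpad += f"\n[draft] {draft}\n[self-critique] {critique}"
        return generate(user_prompt + scratchpad + "\nFinal answer:")

    print(answer_with_thinking("Summarize this script."))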
LLMs are very good at fooling us into the idea that they know anything at all; they don't. And humans are very bad at being discriminate about the source of the information presented to them if it is presented in a friendly way. The combination of those things is what has resulted in the insanely huge AI hype cycle that we are currently living in the middle of. Nearly everyone is overreacting to what LLMs actually are, and the few of us that believe that we sort of see what's actually happening are ignored for being naysayers, buzz-kills, and luddites. Shunned for not drinking the Kool-Aid.
hnfong 11 minutes ago [-]
You're ignored not because you're right or wrong, but because your unsolicited advice is not useful.
For example, I can spin up any LLM and get it to translate some English text into Japanese with maybe 99% accuracy. I don't need to believe whether it "really knows" English or Japanese, I only need to believe the output is accurate.
Similarly I can ask a LLM to code up a function that does a specific thing, and it will do it with high accuracy. Maybe there'll be some bugs, but I can review the code and fix them, which in some cases boosts my productivity. I don't need to believe whether it "really knows" C++ or Rust, I only need it to write something good enough.
I mean, just by these two examples, LLMs are really great tools, and I'm personally hyped for these use cases alone. Am I fooled by the LLM? I don't think so, I don't have any fantasy about it being extremely intelligent or always being right. I doubt most reasonable people these days would either.
So basically you're going about assuming people are fooled by LLMs (which they might not be), and wondering why you're unpopular when you're basically telling everyone they're gullible and foolish.
nataliste 45 minutes ago [-]
The author's argument is built on fallacies that always pop up in these kinds of critiques.
The "summary vs shortening" distinction is moving the goalposts. They makes the empirical claim that LLMs fail at summarizing novel PDFs without any actual evidence. For a model trained on a huge chunk of the internet, the line between "reworking existing text" and "drawing on external context" is so blurry it's practically meaningless.
Similarly, can we please retire the ELIZA and Deep Blue analogies? Comparing a modern transformer to a 1960s if-then script or a brute-force chess engine is a category error. It's a rhetorical trick to make LLMs seem less novel than they actually are.
And blaming everything on anthropomorphism is an easy out. It lets you dismiss the model's genuinely surprising capabilities by framing it as a simple flaw in human psychology. The interesting question isn't that we anthropomorphize, but why this specific technology is so effective at triggering that response from humans.
The whole piece basically boils down to: "If we define intelligence in a way that is exclusively social and human, then this non-social, non-human thing isn't intelligent." It's a circular argument.
"A true summary, the kind a human makes, requires outside context and reference points. Shortening just reworks the information already in the text."
Then later says...
"LLMs operate in a similar way, trading what we would call intelligence for a vast memory of nearly everything humans have ever written. It’s nearly impossible to grasp how much context this gives them to play with"
So, they can't summarize, because they lack context... but they also have an almost ungraspably large amount of context?
Imagine an oracle that could judge/decide, with human levels of intelligence, how relevant a given memory or piece of information is to any given situation, and that could verbosely describe which way it's relevant (spatially, conditionally, etc.).
Would such an oracle, sufficiently parallelized, be sufficient for AGI? If it could, then we could genuinely describe its output as "context," and phrase our problem as "there is still a gap in needed context, despite how much context there already is."
And an LLM that simply "shortens" that context could reach a level of AGI, because the context preparation is doing the heavy lifting.
The point I think the article is trying to make is that LLMs cannot add any information beyond the context they are given - they can only "shorten" that context.
If the lived experience necessary for human-level judgment could be encoded into that context, though... that would be an entirely different ball game.
IMO we already have the technology for sufficient parallelization of smaller models with specific bits of context. The real issue is that models have weak/inconsistent/myopic judgement abilities, even with reasoning loops.
For instance, if I ask Cursor to fix the code for a broken test and the fix is non-trivial, it will often diagnose the problem incorrectly almost instantly, hyper-focus on what it imagines the problem is without further confirmation, implement a "fix", get a different error message while breaking more tests than it "fixed" (if it changed the result for any tests), and then declare the problem solved simply because it moved the goalposts at the start by misdiagnosing the issue.
My own impression is something more akin to a natural language search query system. If I want a snippet of code to do X it does that pretty well and keeps me from having to search through poor documentation of many OSS projects. Certainly doesn't produce anything I could not do myself - so far.
Ask it about something that is currently unknown and it list a bunch of hypotheses that people have already proposed.
Ask it to write a story and you get a story similar to one you already know but with your details inserted.
I can see how this may appear to be intelligent but likely isn't.
My primary assertion is that LLMs struggle to generalize concepts and ideas, hence why they need petabytes of text just to often fail basic riddles when you muck with the parameters a little bit. People get stuck on this for two reasons: one, because they have to reconcile this with what they can see LLMs are capable of, and it's just difficult to believe that all of this can be accomplished without at least intelligence as we know it; I reckon the trick here is that we simply can't even conceive of how utterly massive the training datasets for these models are. We can look at the numbers but there's no way to fully grasp just how vast it truly is. The second thing is definitely the tendency to anthropomorphize. At first I definitely felt like OpenAI was just using this as an excuse to hype their models and come up with reasons for why they can never release weights anymore; convenient. But also, you can see even engineers who genuinely understand how LLMs work coming to the conclusion that they've become sentient, even though the models they felt were sentient now feel downright stupid compared to the current state-of-the-art.
Even less sophisticated pattern matching than what humans are able to do is still very powerful, but it's obvious to me that humans are able to generalize better.
> "It’s nearly impossible to grasp how much context this gives them to play with"
Here, I think the author means something more like "all the material used to train the LLM".
> "A true summary, the kind a human makes, requires outside context and reference points."
In this case I think that "context" means something more like actual comprehension.
The author's point is that an LLM could only write something like the referenced summary by shortening other summaries present in its training set.
In other words the LLM does not contain the knowledge of what the words represent.
It did a really good -- surprisingly good -- job. That incident has been a reference point for me. Even if it is anecdotal.
It feels like they've mastered language, but it's looking more and more like they've actually mastered canon. Which is still impressive, but very different.
Depending how unique the text is determines how accurate the summarization is likely to be.
They aren’t just intelligence mimics, they are people mimics, and they’re getting better at it with every generation.
Whether they are intelligent or not, whether they are people or not, it ultimately does not matter when it comes to what they can actually do, what they can actually automate. If they mimic a particular scenario or human task well enough that the job gets done, they can replace intelligence even if they are “not intelligent”.
If by now someone still isn’t convinced that LLMs can indeed automate some of those intelligence tasks, then I would argue they are not open to being convinced.
Asking an LLM to take all this knowledge and apply it to a new domain? That will take a whole new paradigm.
I see statements like this a lot, and I find them unpersuasive because any meaningful definition of "intelligence" is not offered. What, exactly, is the property that humans (allegedly) have and LLMs (allegedly) lack, that allows one to be deemed "intelligent" and the other not?
I see two possibilities:
1. We define "intelligence" as definitionally unique to humans. For example, maybe intelligence depends on the existence of a human soul, or specific to the physical structure of the human brain. In this case, a machine (perhaps an LLM) could achieve "quacks like a duck" behavioral equality to a human mind, and yet would still be excluded from the definition of "intelligent." This definition is therefore not useful if we're interested in the ability of the machine, which it seems to me we are. LLMs are often dismissed as not "intelligent" because they work by inferring output based on learned input, but that alone cannot be a distinguishing characteristic, because that's how humans work as well.
2. We define "intelligence" in a results-oriented way. This means there must be some specific test or behavioral standard that a machine must meet in order to become intelligent. This has been the default definition for a long time, but the goal posts have shifted. Nevertheless, if you're going to disparage LLMs by calling them unintelligent, you should be able to cite a specific results-oriented failure that distinguishes them from "intelligent" humans. Note that this argument cannot refer to the LLMs' implementation or learning model.
But the problem of anthropomorphizing is real. LLMs are deeply weird machines - they’ve been fine-tuned to sound friendly and human, but behind that is something deeply alien: a huge pile of linear algebra that does not work at all like a human mind (notably, they can’t really learn form experience at all after training is complete). They don’t have bodies or even a single physical place where their mind lives (each message in a conversation might be generated on a different GPU in a different datacenter). They can fail in weird and novel ways. It’s clear that anthropomorphism here is a bad idea. Although that’s not a particularly novel point.
the ability for long-term planning and, more cogently, actually living in the real world where time passes
sure, but it feels like this is just looking at what distinguishes humans from LLMs and calling that “intelligence.” I highlight this difference too when I talk about LLMs, but I don’t feel the need to follow up with “and that’s why they’re not really intelligent.”
1. LLMs seem to be able to plan just fine.
2. LLMs clearly cannot be "actually living" but I fail to see how that's related to intelligence per se.
my genuine question is how does a LLM handle that situation? and as you point out, it’s an absurd comparison
How is that a summary? It reads as a one-liner review I would leave on Letterboxed or something I would say, trying to be pretentious and treating the movie as a work of art. It is a work of art, because all movies are art, but that's an awful summary.
- Even if CEOs are completely out of touch and the tool can't do the job you can still get laid off in an ill informed attempt to replace you. Then when the company doesn't fall over because the leftover people, desperate to keep covering rent fill the gaps it just looks like efficiency to the top.
- I don't think our tendency anthropomorphize LLMs is really the problem here.
If you keep redefining real intelligence as the set of things machines can’t do, then it’s always going to be true.
Language is really powerful, I think it's a huge part of our intelligence.
The interesting part of the article to me is the focus on fluency. I have not seen anything that LLMs do well that isn't related to powerful utilization of fluency.
This is right out of Community
I heard from Miguel Nicolelis that language is a filter for the human mind, so you can never build a mind from language. I interpreted this like trying to build an orange from its juice.
On the contrary, all their training data is their "experience".
How do you know LLMs aren't intelligent, if you can't define what that means?
Doesn't sound like a test of intelligence to me, so no.
[0] https://en.wikipedia.org/wiki/Jeopardy!