Why LLMs Are Misunderstood

A nuanced discussion on the tool, its use, and the critical thinking failures surrounding it

We. Know. How. To. Build. AGI.
It's Called Reproduction!

The AI can tell you that standing too close to a human is "awkward." It can define the social norms of personal space and cite the psychological studies to prove it. But it cannot feel it.

The sensory-motor inputs are something that an LLM never had access to. In terms of the grand history of life, language is a relatively recent invention. For millions of years, "meaning" existed entirely without words.

Language is merely an expression mechanism - a low-bandwidth output for a high-bandwidth biological system. When you feel awkward, you don't store a text file in your brain that says,

"I am feeling awkward."

You don't remember today's lecture as a transcript. You remember the visceral state: the heat in your face, the shift in your posture, the visual data of a room falling silent.

You can use a million eloquent words to describe love, but that doesn't change the fact that the human brain does not store language as its primary currency. It stores connections, states, and biological instances that are far more concrete than language can ever communicate. Why is it so hard to describe love? Because language is just an approximation. It is a lossy compression of a feeling.

The Automata of Experience

If you've studied automata theory, the intuition here should feel familiar. A system can only transition over the distinctions its inputs allow it to detect. The issue is not just vocabulary size, but the nature of the input channel itself. A model trained on text receives linguistic traces of human experience, not the underlying sensory-motor states that gave rise to those descriptions.
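
One way to make that intuition concrete is a minimal sketch in Python. Everything named here is an assumption for illustration: the `run_dfa` helper, the `describe` step, the toy parity machine, and the two made-up "experiences". What it shows is narrow: when two different underlying states are reported with the same symbols, the automaton has no way to tell them apart.

```python
from typing import Dict, Tuple

# Hypothetical toy automaton: states and symbols are made up for illustration.
Transition = Dict[Tuple[str, str], str]


def run_dfa(transitions: Transition, start: str, symbols: str) -> str:
    """Return the state reached after consuming every symbol in order."""
    state = start
    for symbol in symbols:
        # The machine can only react to the symbol it is given;
        # anything the symbol does not encode is invisible to it.
        state = transitions[(state, symbol)]
    return state


# A two-state machine that tracks whether it has seen an odd number of "a"s.
PARITY: Transition = {
    ("even", "a"): "odd",
    ("even", "b"): "even",
    ("odd", "a"): "even",
    ("odd", "b"): "odd",
}


def describe(experience: dict) -> str:
    """Hypothetical 'reporting' step: reduce a rich state to a symbol string."""
    # Only the report survives; heart rate and posture are dropped.
    return experience["report"]


# Two different underlying states that happen to produce the same report.
blushing = {"report": "ab", "heart_rate": 120, "posture": "tense"}
calm = {"report": "ab", "heart_rate": 60, "posture": "relaxed"}

state_blushing = run_dfa(PARITY, "even", describe(blushing))
state_calm = run_dfa(PARITY, "even", describe(calm))

# The inputs are identical strings, so the final states are identical too.
indistinguishable = state_blushing == state_calm
```

The design choice worth noticing is that the information is lost in `describe`, before the automaton ever runs. No change to the transition table can recover it.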

That means it can become extraordinarily good at modeling how humans describe experience, while still lacking direct access to experience itself. It can learn the structure of reports about awkwardness, pain, grief, or love, but it does not instantiate the bodily states those words compress. Language is the map; lived experience is the terrain.

So the limitation is not that language is useless, or that text-based systems cannot reason. It is that some categories of meaning are grounded in input channels language only approximates. To represent those categories directly, you need a different kind of system - one with different inputs, and therefore a different grounding mechanism.

A Fair Concession: Navigating the Alphabet

To be clear, none of this means LLMs are useless. Within the domain of language-representable problems, they can be remarkably effective syntactical engines. An LLM doesn't need to experience frustration to help you refactor a function or generate boilerplate code. It achieves this not through comprehension, but through high-dimensional pattern matching.

The argument here is not that LLMs cannot process information; it is that they cannot derive meaning. They operate brilliantly within the closed loops of their alphabet, mapping input symbols to output symbols based on statistical gravity.

But let's be honest: they also fail spectacularly at things that should be simple. Finding and replacing across a file. Counting characters. Following a format precisely. Off-by-one errors in string manipulation. These are not exotic failures; they are demonstrations that LLMs lack a concrete internal model of the operations they are performing. They are blind cartographers mapping a territory they will never visit. The mistake is assuming that manipulating the alphabet is the same as understanding the world.
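
For contrast, the operations listed above are deterministic and exact in ordinary code. The sketch below uses only Python built-ins and the standard `re` module; the sample text and the `matches_format` helper are made up for illustration. It says nothing about why a model gets these wrong, only that a concrete procedure gets them right every time.

```python
import re

text = "strawberry jam, strawberry tart"  # hypothetical sample input

# Counting characters: str.count scans the actual characters.
r_count = text.count("r")

# Find and replace across the whole string.
replaced = text.replace("strawberry", "cherry")


def matches_format(value: str) -> bool:
    """Return True only if value has the exact shape DDDD-DD-DD (digits and dashes)."""
    # fullmatch requires the whole string to match, not just a prefix.
    # This checks the shape only; it does not validate that the date exists.
    return re.fullmatch(r"\d{4}-\d{2}-\d{2}", value) is not None


# Avoiding off-by-one errors: slices are half-open, so [0:10] is exactly 10 characters.
first_word = text[0:10]
```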

Performance vs. Personality

We often say LLM responses are "customized" or have a "personality." In reality, the model is just mirroring your style - it's a sophisticated linguistic chameleon. We shouldn't judge a personality by language alone. If we could, why would anyone bother meeting in person before marriage? If you could judge a human's core behavior through text, you'd just chat on Bumble, hop on a call, and head straight to the altar.

But you don't remember people by their syntax. You remember them by how they interact with the world and with you. You observe their expressions, their reliability under pressure, their touch, their charm and the way they occupy physical space.

We are approaching the limits of what text-only training can achieve. Synthetic data, multimodal training, and reinforcement learning from human feedback have extended the runway, and researchers debate how much more can be extracted from existing data through better methods. But for anything resembling AGI, the model needs access to modalities it currently lacks entirely: audio, touch, emotion, and the feedback loop of a nervous system. Better training techniques can squeeze more from text, but text alone cannot encode what it was never designed to represent.

The Ultimate Machine

There is currently no way for a machine to learn these things from us because the training data for "being" doesn't exist in a digital format. To build a machine that truly experiences the world, we would have to construct something so complex, so integrated, and so biologically responsive that it would cease to be "hardware" as we know it.

Actually, we already have a way of putting elements in that specific, highly ordered sequence. We call it a human. That is the entire point of evolution - roughly 3.7 billion years of "assembly" to create a DNA-based system capable of experience. If we want to build something that behaves this way, we aren't talking about AGI; we're talking about reproduction.

The "AI Uses Too Much Water" Argument

Let's address the water argument. It is a case study in how poor critical thinking spreads through a population that has stopped reading past headlines.

The most commonly cited source for the AI-water panic is an MIT article that is explicitly a call for more study, not a call to stop. The paper's conclusion is essentially:

"We don't fully understand the environmental footprint yet, and we need better measurement frameworks."

That is not the same as "AI is boiling the oceans." But that nuance doesn't generate clicks, so what the public received was a distorted, apocalyptic version of a measured academic observation.

Here is the problem with the water-usage argument as it is currently deployed in public discourse:

  1. It lacks comparative context. Data centers do consume water for cooling. So do agriculture, textile manufacturing, semiconductor fabrication, and the production of the very devices you're reading this on. The question is never "does it use water?" - everything does. The question is "what is the marginal cost relative to the marginal value, and how does it compare to alternatives?"
    • For example, a single pair of jeans has a substantial hidden water footprint, enough to cover millions of inferences (Andy Masley has a few good comparisons). WaterCalculator.org is a useful reminder that many everyday products consume large amounts of water long before we talk about AI.
    • We also tend to ignore how much water is embedded in the comforts of first-world living - fast fashion, year-round food variety, streaming, air conditioning, and constant device upgrades - while suddenly demanding moral purity from one technology we already dislike.
    • “Hidden water” is only one lens. The broader scientific question is that we need better tools to measure and compare the environmental impact of AI systems across training, inference, hardware, and infrastructure.
  2. It conflates correlation with causation. Data center water usage was rising before the LLM boom due to cloud computing, streaming, gaming, and enterprise SaaS. Attributing the entire increase to generative AI is intellectually lazy.
  3. It ignores efficiency trajectories. Hardware efficiency in computing has improved by orders of magnitude over decades. The first computers filled rooms and consumed kilowatts to do what a wristwatch does today. Inference costs are already dropping rapidly, and new architectures are being designed specifically to reduce energy and cooling requirements.
  4. It selectively applies the standard. If your framework is "technology that uses water is bad," then you must also oppose hospitals, fast fashion, and public transit systems. The argument is not applied consistently because it was never really about water. It is about a pre-existing anxiety looking for a data point to attach itself to.
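
The comparative question in point 1 is just a pair of ratios, and it can be sketched in a few lines. Everything below is an assumption for illustration: the `Activity` type, the `water_per_unit_value` and `units_equivalent` helpers, and especially the numbers, which are placeholders and not measurements of anything. Substitute figures from a source you trust before drawing any conclusion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    name: str
    litres_per_unit: float  # water used per unit of output (placeholder values below)
    value_per_unit: float   # any value measure you choose, in consistent units


def water_per_unit_value(activity: Activity) -> float:
    """Litres of water spent per unit of value delivered (marginal cost / marginal value)."""
    if activity.value_per_unit <= 0:
        raise ValueError("value_per_unit must be positive")
    return activity.litres_per_unit / activity.value_per_unit


def units_equivalent(a: Activity, b: Activity) -> float:
    """How many units of b use the same amount of water as one unit of a."""
    if b.litres_per_unit <= 0:
        raise ValueError("litres_per_unit of b must be positive")
    return a.litres_per_unit / b.litres_per_unit


# PLACEHOLDER numbers, chosen only to make the arithmetic easy to follow.
activity_a = Activity("hypothetical activity A", litres_per_unit=8.0, value_per_unit=4.0)
activity_b = Activity("hypothetical activity B", litres_per_unit=0.25, value_per_unit=0.5)

equivalent_units = units_equivalent(activity_a, activity_b)
cost_a = water_per_unit_value(activity_a)
cost_b = water_per_unit_value(activity_b)
```

The two helpers answer different questions. `units_equivalent` is the raw comparison ("how many of these equal one of those"), while `water_per_unit_value` is the one the argument actually needs, because it forces you to state what value you think each activity delivers.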

Now, the trajectory concern is legitimate. If AI usage scales by orders of magnitude, water consumption scales with it, and unlike many mature industries, we don't yet have a clear picture of the ceiling. That is a real question worth studying - and it is exactly what the MIT researchers were calling for. The honest position is: "We need to account for these externalities in our cost models and invest in efficiency." The dishonest position is: "AI is destroying the planet, here's one scary number with no context." The former is science. The latter is content.

The critical thinking failure here is not that someone raised the question - the question is perfectly valid. The failure is that the question was immediately converted into a conclusion, skipping the entire middle part where you actually do the analysis. The MIT researchers asked for rigor. The media delivered outrage. Those are not the same thing.

For a more detailed breakdown: Andy Masley's analysis of the AI water narrative.

The Tool and the Architect

We build machines and computers as small extensions of ourselves, designed to repeat specific, repeatable tasks. We invent computer languages to tell these extensions how to operate. LLMs have mastered the "language" part of this process beautifully. They can help you bridge the gap between your idea and the code.

But they cannot come up with solutions for problems that we humans face, because they do not share our vulnerabilities. A machine can optimize a schedule, but it doesn't understand the "problem" of burnout. It can debug a script, but it doesn't understand the "problem" of a user's frustration. It can suggest a path, but it has no "skin in the game" to understand why one path is more meaningful than another.

Writing Is Thinking, Not Generating

Here is something that gets lost in the discourse: the point of written media is to organize thoughts, not to generate them.

We write journals so we can abstract our thoughts and reduce their complexity. The act of writing is itself a cognitive tool - it forces you to linearize a tangled internal state, to confront what you actually believe versus what you vaguely feel. When you outsource the writing, you aren't saving time. You are skipping the thinking.

An LLM is a communication tool. It can take your disorganized, half-formed ideas and help you express them clearly to another person or system. That is genuinely valuable. But if you hand it a blank page and say "write my essay," you have not thought. You have delegated the one activity that was the thinking.

This distinction matters enormously, and it is the one most people - on both sides of the debate - consistently miss.

The Hype Is the Business Model

The current AI hype cycle is not an accident. It is a financing mechanism.

Ken Griffin, CEO of Citadel (World Economic Forum; 3-minute videos in the annex), has spoken openly about this dynamic: the hype is necessary to justify the enormous capital expenditure required to build the infrastructure. Billions of dollars in GPU clusters, data centers, and energy contracts do not get funded by measured, nuanced academic papers. They get funded by narratives - narratives about transformation, disruption, and existential stakes.

This is not inherently sinister. It is how technology has always been financed. The railroad bubble of the 1840s overbuilt rail infrastructure that later became the backbone of industrial economies. The dot-com bubble funded the fiber optic cables and server farms that made the modern internet possible. Hype cycles overshoot, correct, and leave behind real infrastructure.

But understanding this dynamic is essential if you want to evaluate AI claims honestly. When someone tells you AGI is "two years away," ask yourself: who is funding their operations, and what does that funding depend on? CEOs have a vested interest in maintaining the hype cycle to justify their funding and operations. Remember the self-driving car boom? Did we get them? No. But have all cars gotten really good? Yes!

AI was a research field among computer scientists for decades before this moment. What changed is not the fundamental science. It is the accessibility and the capital. The transformer architecture was published in 2017. GPT-2 was 2019. What made 2023 feel like a revolution was not a breakthrough in intelligence; it was a breakthrough in interface design. ChatGPT gave the public a text box, and suddenly everyone could interact with something that had been quietly developing in research labs for years.

The societal adoption curve looks less like "the invention of fire" and more like the adoption of the automobile. Cars existed for years before they became ubiquitous. What changed was infrastructure, affordability, and cultural integration - not the internal combustion engine itself.

The Knife Problem

You can use a knife to do very nasty things, or you can use it to cut vegetables.

This is not a dismissal of legitimate concerns. It is a reminder that the tool is not the ethics. Every powerful technology - from fire to nuclear energy to the internet - has dual-use potential. The response to dual-use technology is not to pretend it doesn't exist or to moralize about its nature. The response is to study it, understand it, and build frameworks for its responsible use.

We are computer scientists. It is literally in our title and job description to study this nuance. Taking a side - either "AI will save humanity" or "AI will destroy us" - without doing the actual analysis is not critical thinking. It is fandom. And fandom with a degree is still fandom.

Let me frame it this way: if your take on AI is the same as that of a random journalist whose job is to generate hype, you're not being a scientist. You're being a fan.

YOU, the future computer scientists, have a responsibility to be more thoughtful and nuanced in your analysis of AI. YOU are literally the experts that society relies on!

I'm not saying that you have to love AI. But if you want GenAI to be better, have the guts to work on the real problems - not just the rhetoric around a technology that is already here. Do it as a scientist: study it deeply, test it rigorously, and improve the systems that shape how it is used in the world.

The Elephant in the Room: Jobs

I'd be dishonest if I didn't address the concern that most people actually care about: labor displacement. Writers, translators, junior developers, customer service representatives - these are real people with real livelihoods, and the anxiety is not irrational.

This article is about what LLMs are, not a comprehensive policy analysis of labor markets. But I'll say this: every major technological shift - the printing press, the loom, the assembly line, the spreadsheet - displaced specific jobs while creating others that didn't previously exist. That historical pattern doesn't make the transition painless for the individuals caught in it. It does, however, suggest that the correct response is adaptation, together with laws and policy, rather than prohibition.

The labor question deserves its own serious, dedicated discussion - one that involves economists, policymakers, and the workers themselves, not just technologists. What it does not deserve is to be conflated with the question of whether LLMs "think," which is a fundamentally different issue. Conflating the two makes both conversations worse.

On Being Radically Authoritative, Not Radicalized

There is a difference between having strong, well-supported views and being radicalized. The former requires you to do the reading, examine the evidence, stress-test your assumptions, and update your beliefs when the data warrants it. The latter requires you to pick a team and defend it.

The current media environment does not reward the former. It rewards the latter. Headlines are optimized for engagement, not accuracy. Nuance doesn't trend. And so the public discourse on AI has collapsed into two camps: utopians who think LLMs are a step toward digital consciousness, and catastrophists who think they're an existential threat to employment, the environment, and human dignity.

Both camps are wrong, and they're wrong in the same way: they are treating a tool as a character in a story. LLMs are not protagonists or antagonists. They are instruments. Extraordinarily powerful instruments, but instruments nonetheless.

The way AI is communicated in society today is deplorable. The news media has largely abandoned its role as a mechanism for informing the public and has instead become an engagement-optimization engine. That is a real and serious problem - but it is a different, and much larger, problem than anything specific to AI.

I really don't want to be involved in politics or economics. What I can speak to is a pattern I know well from academia: I have had papers rejected with thoughtful feedback, careless feedback, and everything in between. That experience taught me that mockery is often just confidence without understanding. When someone dismisses something they have not studied, they are not being skeptical; a skeptic investigates. They are being incurious and calling it sophistication.

Further reading: the Dunning-Kruger effect, including a video explanation.

I try to stay within my expertise, and I'd encourage you to do the same - not by ignoring AI, but by studying it before deciding what you think about it. That is what scientists do.

At the end …

Just because something speaks well does not mean it is an expert. And just because something is hyped does not mean it is worthless. Both of these errors (over-attribution and reflexive dismissal) come from the same place: a failure to do the actual work of understanding.

LLMs are Large Language Models. They are excellent communicators. They can help you articulate an idea to another human or interface with another system. Use them for their ability to communicate effectively, but don't mistake a masterful command of the "alphabet" for the ability to reason through the human experience.

And for the love of critical thinking - read the actual studies before citing them as evidence for a position they don't support.