Experts Have World Models. LLMs Have Word Models

41 points - today at 6:13 PM

Comments

notnullorvoid today at 10:40 PM

Great article, nice to see some actual critical thought on the shortcomings of LLMs. They are wrong about programming being a "chess-like domain" though. Even at a basic level, the hidden state is future requirements, and the adversary is yourself or anyone else who has to modify the code in the future.

AI is good at producing code for scenarios where the stakes are low, where there's no expectation about future requirements, or where the thing is so well defined that there is a clear best path of implementation.

dataminer today at 10:33 PM

So at the moment the combination of expert and LLM is the smartest move. The LLM can deal with the 80% of situations that are like chess, and the expert deals with the 20% that are like poker.

swyx today at 7:37 PM

editor here! all questions welcome - this is a topic i've been pursuing in the podcast for much of the past year... links inside.

measurablefunc today at 8:14 PM

Makes the same mistake as all other prognostications: programming is not like chess. Chess is a finite & closed domain w/ finitely many rules. The same is not true for programming b/c the domain of programs is not finitely axiomatizable like chess. There is also no win condition in programming; there are lots of interesting programs that do not have a clear-cut specification (games being one obvious category).

naasking today at 7:50 PM

I think it's correct to say that LLMs have word models, and given that words are correlated with the world, they also have degenerate world models, just with lots of inconsistencies and holes. Tokenization issues aside, LLMs will likely also have some limitations due to this. Multimodality should address many of these holes.

darepublic today at 8:06 PM

Large embedding model

akomtu today at 9:00 PM

Llame Word Models.

SecretDreams today at 8:21 PM

Are people really using AI just to write a slack message??

Also, Priya is in the same "world" as everyone else. They have the context that the new person is 3 weeks in, most probably needs some help because they're new, is actually reaching out, and that impressions matter, even if they said "not urgent". "Not urgent" is seldom taken at face value. It doesn't necessarily mean it's urgent, but it means "I need help, but I'm being polite".

calf today at 8:59 PM

My Sunday morning speculation is that LLMs, and sufficiently complex neural nets in general, are a kind of Frankenstein phenomenon: they are heavily statistical, yet also partly, subtly doing novel computational and cognitive-like processes (such as world models). To dismiss either aspect is a false binary; the scientific question is distinguishing which part of an LLM is which, which at our current level of scientific understanding is virtually like asking when an electron is a wave and when it is a particle.

D-Machine today at 6:29 PM

Fun play on words. But yes, LLMs are Large Language Models, not Large World Models. This matters because (1) the world cannot be modeled anywhere close to completely with language alone, and (2) language only somewhat models the world (much in language is convention, wrong, or not concerned with modeling the world but with other things like persuasion, causing emotions, or fantasy / imagination).

It is somewhat complicated by the fact that LLMs (and VLMs) are also trained in some cases on more than simple language found on the internet (e.g. code, math, images / videos), but the same insight remains true. The interesting question is just how far we can get with (2) anyway.