Like many people, I've been arguing about the nature of LLMs a lot over the last few years. There is a particular set of arguments that I found myself having to recreate from scratch over and over again in different contexts, so I finally put them together in a larger post, and this is that post.
The crux of it is that I think both the maximalist and minimalist claims about what LLMs can do/are doing are simultaneously true, and not in conflict with one another. A mind made out of text can vary along two axes: the quantity of text it has absorbed, which here I call "coverage," and the degree to which that text has been unified into a coherent model, which here I call "integration." As extreme points on that spectrum, a search engine is high coverage, low integration; an individual person is low coverage, high integration; and LLMs sit somewhere between the two. Most importantly, every point on that spectrum is useful for different kinds of tasks.
I'm hoping this will be a more useful way of thinking about LLMs than the ways people have typically talked about them so far.
Thanks. I haven't read enough of LessWrong and haven't posted anything there, so I don't know what the posting norms are. But I do think it's the kind of thing they'd be interested in.
I didn't know you were semi-famous, if Dase knows you.
@moultano is profoundly insightful on the bigger picture of ML and, moreover, is a creative writer and thinker.
My personal shortlist:
One of the earliest and best explorations of the power of text-to-image generative AI, and its more technically impressive sequel, from roughly when I first took an interest in it and did 1 and 2 and 3. Note how we've lost this kind of high-level semantic control and, I'd say, artistic magic, even as diffusion models have become immensely powerful and Midjourney easily evokes waow! reactions from the masses of redditors. Nor is it the limit of how subregions of the image could have been manipulated with VQGAN+CLIP. @FCfromSSC and @Primaprimaprima might find this to be of interest.
I already mentioned Why Deep Learning Works Even Though It Shouldn’t, an intuition pump for the lottery ticket hypothesis.
The Defaults Don't Work is perhaps the best thing written on the inadequacy of tradition. «Abundance» is in the same category.
The Doves as an example of good fiction.
This amusing writeup on how Crimethink policing works on social media platforms (or used to, back in 2019).
The rest is about as good, though.
I have a feeling this post could reach the top of HN and become famous, but it's too long to climb off the "new" page without a catchier title than I've been able to come up with.
Ok, never mind, it made it. I probably should have asked an LLM to optimize the title for me.
I think a handful of the things I've written are semi-famous, not me in particular. 😆