
Danger, AI Scientist, Danger

thezvi.wordpress.com

Zvi Mowshowitz reporting on an LLM exhibiting unprompted instrumental convergence. Figured this might be an update for some Mottizens.


Diffusion models work for text too.

Last I checked, diffusion models do work for text, but not very well. More specifically, text diffusion models remind me quite strongly of the classic Markov-chain text generators that used to be popular for producing amusing locally-coherent, globally-word-salad text. Here's the best concrete example of this I can point to (italicized text is the prompt, normal text is the continuation; examples sourced from this JDP tweet, and the whole tweet is excellent but somewhat OT here):

Diffusion model:

This whole dream seems to be part of someone else's experiment.

In the end, I think the only way to get the full benefit of the dream is to get up and walk around.

A lot of the other strands of the theory of dreams, especially the one called yncrest are known to the person who writes this blog. Consequently, I've tested his reactions via a series of brief, televised interviews with the British Prime Minister, Margaret Thatcher.

“I don't care what anybody says. I'm not going to waste my time with that. The study was conducted in the high-flow, low-flow and quiescent periods of a butterfly species and thus provides evidence for the first three of the four hypotheses listed in the original paper. The only one not covered is the "need to change the rate of capture", where we now know that this is largely a policy instrument - and particularly relevant to Sudanese geese.

GPT-2:

This whole dream seems to be part of someone else's experiment. This is usually hard to see, since you're conscious that mental shortcuts are normal. But you're definitely moments away from being normal again, as it's obvious that you are being coached to think like that by another person and doing our own thing. Why are we having such thoughts when we're paying no attention because that's already taking place?

I'm totally appalled by what these key players were doing, yet I explain it in a very large way. After 2 minutes, I make my way through high teaching circles, recognizing that there is more to learning than just to learn.

Few other teachers would air this incessantly their students' struggles recount the stories of their own memories and teachers'. Whatever it takes is the big fat slut shaming.

All these legendary trainees made the same point:

Cognitive functional aspects: Bayesian modeling, learn science way. And the most important part is: Like the coming of a hurricane, the book is mucking between science and morals.

Twitter Mentions of the first book: Kent

Now obviously in the limit as computational power and training data volume go to infinity, diffusion models and transformer models will generate the same text, since in the limit they're pulling from the same distribution with minimal error. But in the very finite regime we find ourselves in, diffusion models "spend" their accuracy on making the text locally coherent (so if you take a random 10 token sequence, it looks very typical of 10 token sequences within the training set), while transformer LLMs "spend" their accuracy on global coherence (so if you take two 10 token sequences a few hundred tokens apart in the same generated output, you would say that those two sequences look like they came from the same document in the training set).
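The Markov-chain comparison above can be made concrete with a toy bigram generator (a minimal sketch; the corpus and function names here are mine for illustration, not from any system discussed). It shows the same failure mode in miniature: every adjacent word pair is lifted straight from the training text, so any short window looks plausible, while the walk as a whole has no global structure.

```python
import random
from collections import defaultdict

def build_bigram_model(text):
    """Map each word to the list of words observed following it."""
    words = text.split()
    model = defaultdict(list)
    for a, b in zip(words, words[1:]):
        model[a].append(b)
    return model

def generate(model, start, length, seed=0):
    """Walk the chain: each step conditions only on the previous word,
    so local pairs are always corpus-attested but long-range coherence
    is purely accidental."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        followers = model.get(out[-1])
        if not followers:
            break
        out.append(rng.choice(followers))
    return " ".join(out)

corpus = ("the cat sat on the mat and the dog sat on the rug "
          "and the cat chased the dog around the mat")
model = build_bigram_model(corpus)
print(generate(model, "the", 12))
```

Scale the order of the chain up (trigrams, 4-grams) and the window of local plausibility widens, which is roughly the axis along which, on this view, text diffusion models "spend" their accuracy.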

The blending of concepts that we see in MidJourney probably has less to do with diffusion per se than with CLIP.

Agreed. Obvious once you point it out but I hadn't thought about it that way before, so thanks.

'Self play' is relevant for text generation. There is a substantial cottage industry in using LLMs to evaluate the output of LLMs and learn from the feedback.

Notably, Anthropic's Constitutional AI (i.e. the process by which Anthropic turned a base LLM into the "helpful, honest, harmless" Claude) process used RLAIF, which is self play by another name. And that's one big cottage.
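The self-play loop described above can be sketched in miniature. This is a toy sketch only: the `draft_model` and `judge` functions below are hypothetical stand-ins (a canned list and a keyword check), whereas in actual RLAIF both would be LLM calls, with the judge prompted to rank outputs against the constitution's principles. The shape of the loop is the point: sample pairs of outputs, have the judge pick the better one, and keep the resulting preference pairs as training data for a reward model.

```python
import random

def draft_model(prompt, rng):
    """Hypothetical generator: returns one candidate continuation."""
    endings = ["is helpful.", "is harmless.", "is rude!", "is honest."]
    return prompt + " " + rng.choice(endings)

def judge(candidate, constitution):
    """Hypothetical AI judge: here a crude keyword count stands in
    for an LLM's judgment against the written principles."""
    return sum(rule in candidate for rule in constitution)

def collect_preference_pairs(prompt, constitution, n_rounds=10, seed=0):
    """Sample output pairs, let the judge rank them, and record
    (chosen, rejected) pairs -- the data an RLAIF-style pipeline
    would use to train a reward model and then the policy."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_rounds):
        a = draft_model(prompt, rng)
        b = draft_model(prompt, rng)
        if judge(a, constitution) == judge(b, constitution):
            continue  # no preference signal from this pair; skip it
        chosen, rejected = sorted(
            (a, b), key=lambda c: judge(c, constitution), reverse=True)
        pairs.append((chosen, rejected))
    return pairs

constitution = ["helpful", "harmless", "honest"]
pairs = collect_preference_pairs("The assistant", constitution)
```

The "self play" framing is apt because the same family of models sits on both sides of the loop: generator and evaluator improve off each other's outputs with no human labels in the inner loop.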