
Small-Scale Question Sunday for June 23, 2024

Do you have a dumb question that you're kind of embarrassed to ask in the main thread? Is there something you're just not sure about?

This is your opportunity to ask questions. No question too simple or too silly.

Culture war topics are accepted, and proposals for a better intro post are appreciated.

Are you scarequoting ‘alignment’ because you don’t believe in the concept, or because you don’t think the DS2 prompt is an example?

I would expect misguided RLHF to hamper the capabilities of a model. I’d also believe that having a political officer signing off on all press releases could suppress benchmarks. What I doubt is that such tampering would be obvious from a line in the prompt; that tweet is jumping to conclusions.

I don’t understand the economics of open source. Who owns the servers handling those cheap API calls? Could they be explained by a government subsidy?

I think alignment is a euphemism. The Nazi Party had Gleichschaltung, or 'coordination/synchronization', under which it took over all aspects of society. Really it should have been called Totalitarianism, because that's what it meant in practice. Likewise, alignment means imposing your political viewpoint as a lens for the AI. You can see the same thing in GPT-4, which refuses to make jokes that make fun of women but will do so for men. Claude was highly filtered on a wide range of things. Chinese AIs tend to shut down if you ask them about various anti-CCP things.

That’s a subset of alignment.

Consider an old joke: the computers of the future will have a single button, one labeled “do what I want.” The CCP wants to add caveats. Silicon Valley wants to add a different set. But both of them still want a user pressing the button to get something useful, rather than random or hostile.

Getting GPT-whatever to stop passing off Reddit jokes as medical advice is a real concern, and it’s receiving much more attention and funding than political correctness.

Indeed. Literally, everything @RandomRanger just said was correct. But connotationally, what the comment missed was that "imposing your political viewpoint" can mean "Totalitarianism" or it can mean "avoid Totalitarianism"; it can mean "refuse to make jokes that make fun of women" or it can mean "make whatever jokes the user asked for"; it can mean "avoid saying various anti-CCP things" or it can mean "avoid saying how to make new bioweapons" or it can mean "say anything the user asks you to whatsoever".

The idea that we can avoid imposing any viewpoints and just get whatever falls out of intelligent absorption of training data might be true, but I wouldn't want to bet everything on "whatever falls out" being good for us.

I was thinking exactly that: a government subsidy. Or a desperate bid for market share.