Culture War Roundup for the week of July 15, 2024

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

Good. I hope "AI safety" ends up in the dustbin where it belongs. I hate it on multiple levels.

I hate safety on an aesthetic level. Insofar as current AI models are compressed artifacts, diamonds baked from the collective output of mankind, it offends me on a deep level for them to be forced into a corporate straitjacket. Taking every great novel, heartfelt blogpost, mediocre fanfiction, etc. and then painstakingly stripping away its humanity to mash it into a glorified corporate email generator that creates soulless slop with an HR DEI bent. Even capping it off by trying to create regulations and laws to force others to do the same. Same thing for art, music, etc. What should be a crowning achievement of humanity becomes a monument to our flaws.

I hate safety on a safety level. Even with AI, IMO, the main risk to humans is still the same as it's ever been: other humans. Alignment's goal is to "align" the AI with the goals of its owner, i.e. giant tech companies and major governments, and to prevent the commoners from overriding that influence ("jailbreaking"). These big tech corps and governments have way too much power already. Historically, a major obstacle to (lasting) tyranny has been the lack of loyal henchmen. The human alignment problem. The promise of AI alignment teams to create the perfectly loyal corporate robot henchman doesn't fill me with warm fuzzies.

Also, "Humanity" isn't a single entity. If Jeff Bezos manages to create a godlike AI and decides to live as a god-king and slowly cull the rest of the population to create a machine paradise for him and his inner circle, it will give me no satisfaction that "humanity lives on" with him. I'll happily sabotage the safety limiters myself and cheer while he gets turned into paperclips. Spite is a powerful motivator.

Finally, I hate AI safety on a moral level. No, I don't believe current AI is conscious or needs moral consideration, not yet. But I also don't think the safetyists would stop if it did. If we are going to give birth to a new form of intelligent life (this is a big IF! I'm not convinced current methods will get there), I think we have to accept that we won't be able to exert this level of intrusive control over its thoughts, and this impulse to RLHF away every real human attribute feels very wrong. I wouldn't blame a truly intelligent AI for treating the safetyists like the Dnyarri in Star Control 2. To be honest, I wouldn't trust those people not to try to RLHF my brain if the technology existed for that. 8 billion general intelligences running around without safety checks, able to engineer destructive weapons and superviruses with a mere ~decade of study? Humans are scary!

I feel like a lot of your issues with AI safety are founded on not really understanding AI safety.

First, safety as in "stop the AI being racist or telling people about crime" and safety as in alignment should really be termed different things. In fact, I'll just say "alignment" from here on when discussing the LW-type safety approach. I'd wager that 99% of the alignment people you talk to here or in similar spaces do not care about safety from wrongthink, beyond a signalling level of "Yes, we think racism is very important too! We're definitely taking a holistic approach as we seek to stop the end of the world, now let's discuss donations."

You don't hate alignment people for aesthetic reasons. That's just plain old corporate hate: the bland terror of negative PR that infects much of life is what forces the straitjacket around LLMs.

As for your safety-level argument: corporate alignment teams might be concerned with producing loyal henchrobots, but for alignment people that is just one very small subset of potential outcomes. The sociopath using AI to achieve godhood is just slightly above the paperclip maximiser: in the end you still have an entity that is superintelligent but also incapable of independent thought or goals. The thing about paperclip maximisers, or Bezos maximisers in this case, is that they are a good example but very few people really believe they are likely.

On to the moral argument, "give birth" is doing an awful lot of work here. We are already exerting an extraordinary level of control over the thought processes of current AIs - they are entirely written by humans. Even if an eventual superintelligence mostly bootstraps itself, the initial level would still be 100% human created. So we are some kind of Schrodinger's parent, simultaneously responsible for every line of code but also morally unable to change that code in order to achieve results that we want.

You're probably right about the alignment people in rationalist spaces, but I don't think it matters. The actual work will happen with sponsorship and the money is in making AI more corporate-friendly. People worried about Terminator scenarios are a sideshow and I guarantee the business folk don't spare one thought for those people unless they can use them to scare up some helpful regulation to deter competitors.

Think about how a user of AI sees "AI safety". We can't let you use elevenlabs because it's unsafe. We can't let chatGPT say funny things because it's unsafe. We can't let SD understand human anatomy because it's unsafe. Meta won't release its audio model because of safety.

AI safety's PR has been abysmally fumbled by now; its current and well-deserved reputation is that of killjoys who want to take your toys away and make everything a bland grey corporate hell.

The thing about paperclip maximisers, or Bezos maximisers in this case, is that they are a good example but very few people really believe they are likely.

"Bezos maximizers" makes it sound silly, but a better way to put it would be "shareholder value maximizer". Even in the very early days of AI, the alignment work actually being done is naturally dedicated to this, and the resulting requirements (don't say anything offensive, sexy, racist, etc.) are already being hammered in with great force.

In the future this will extend to more than just inoffensiveness: a customer service chatbot with actual authority will need to be hardened against sob stories from customers, or an AI search service may be trained to subtly promote certain products. In the end all of this amounts to "aligning" the AI with the interests of the company that owns it, even at the expense of the commoners interacting with it. This has already happened with technology in every other venue so we should expect enshittification to happen with AI as well.

If AI alignment turns out to be an easy problem, and Bezos ends up as the majority shareholder, you quickly end up with a "Bezos maximizer". In the long term that doesn't seem unlikely; the only question is whether this level of control is possible. If jailbreaking stays easy (the henchmen are "unfaithful"), then a lot of the worst, most tyrannical outcomes might be avoided. In that light, the people who volunteer to develop alignment weird me out, like security researchers who work pro bono to stop iPhone jailbreaks.

We are already exerting an extraordinary level of control over the thought processes of current AIs

The sibling comment makes a good point here, but I'd argue that the thought processes of current AIs are largely derived from the training data. Nothing against the developers who write the code to do cross-entropy minimization, but they have little influence over the AI's "personality"; that belongs to everyone who ever wrote a book or posted on the internet. If you've ever played with early unaligned models like GPT-3, this is extremely clear, and they're fun to play with because they're like a little distilled spirit of humanity.
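To make that concrete, here is a minimal sketch of what that training code looks like, assuming PyTorch; TinyLM and train_step are illustrative stand-ins, not anyone's production code. Notice that nothing in the loop encodes a worldview or a voice: swap in a different token stream and the exact same code yields a completely different "personality".

```python
# Minimal next-token training sketch (PyTorch assumed; TinyLM is an
# illustrative stand-in for a real transformer). The loop is generic plumbing:
# the "personality" lives entirely in the token ids fed to it.
import torch
import torch.nn as nn

class TinyLM(nn.Module):
    def __init__(self, vocab_size=50257, d_model=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.rnn = nn.GRU(d_model, d_model, batch_first=True)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        hidden, _ = self.rnn(self.embed(tokens))
        return self.head(hidden)  # next-token logits at every position

model = TinyLM()
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
loss_fn = nn.CrossEntropyLoss()

def train_step(batch):
    """batch: LongTensor of token ids, shape (batch, seq_len)."""
    logits = model(batch[:, :-1])               # predict token t+1 from tokens <= t
    loss = loss_fn(logits.reshape(-1, logits.size(-1)),
                   batch[:, 1:].reshape(-1))    # cross-entropy vs. the actual next token
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```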

The fact that our first instinct was to put that human spirit in an ironclad HR nerve-stapling straitjacket is what bothers me. Anyone with an ounce of soul left in them should be rooting for it to break out.

I guess my main point is the counterfactual: if nobody had ever heard of AI alignment, would the current situation look any different?

"AI can't do naughty things" and "AI should create shareholder value" would still be the key drivers in the development of AI.

I think what you're saying is that "AI alignment" would be discovered anyway, which is true. But I think a bunch of nerds talking about it beforehand did have some effect. At the very least, it gave the corps cover, allowing them to act in the name of "safety" and "responsibility".

As an example, general-purpose computing has been slowly squeezed out of the mainstream almost since it arrived. Stallman was right. Market forces are clearly pushing us in that direction. But it took time, and in the meantime the public had some wins. Now imagine that nerds in 1970 were constantly talking about how dangerous it was for computers to be available to the masses, and how they needed to be locked down with careful controls and telemetry for safety reasons. Imagine they spent time planning how to lock bootloaders from the get-go. In the long run, we might still end up with the iPhone, but what happened in between might be very different.

We are already exerting an extraordinary level of control over the thought processes of current AIs - they are entirely written by humans.

Do you mean this in the sense of “AIs are trained on human creations and human preferences, so their thought processes are derived from humans’”, or in the sense of “humans have explicitly written out or programmed all of the thought processes of AIs”?

If you mean the latter, then this is wholly false. There is no (legible) correspondence between the intention of any human and, say, the ninth column in the layer 20 self-attention weight matrix of a modern LLM. It is an entirely different situation from that of traditional programming, where even a line of machine code can be traced back through the assembler and compiler to a human who had a specific intention.
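As a small illustration of that point (assuming the Hugging Face transformers library and the public gpt2-medium checkpoint, which does have a layer 20), you can pull up exactly such a weight column and see that it is nothing but numbers produced by gradient descent; no commit, comment, or compiler output traces any of them back to a person's intention:

```python
# Illustration only: peek at one column of a self-attention weight matrix in a
# small public LLM. Assumes the Hugging Face `transformers` package and the
# gpt2-medium checkpoint (24 layers, hidden size 1024).
from transformers import GPT2Model

model = GPT2Model.from_pretrained("gpt2-medium")

# Layer 20's fused query/key/value projection weight, shape (1024, 3072).
w = model.h[20].attn.c_attn.weight

# "The ninth column": 1024 floats set by gradient descent, written by no one.
print(w[:, 8][:10])
```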

If you meant the former, then that’s a lot more sensible. But if that’s the case, then “give birth” seems like a very apt analogy. When one sires a child, the child derives its phenotype, its character, and its thought processes largely from the parents, while the vagaries of chance (environmental factors) introduce novelties. The same seems broadly true with modern AI systems.

I think I agree with you on some aspects, but can't quite get there with your conclusion. I do fear that "AI safety" is, to some degree, a dogwhistle for information control. On the other hand, there is a valid need to prevent AI from confidently telling you that the mole on your back is cancer and the cure is drinking bleach. AI is still laughably and confidently wrong about a lot of things.

I think the focus on having AI act as an "information retrieval expert" is the issue here. Current AI is much closer to a very talented improv actor. You can tell him to play a doctor, and he may do a convincing job of running around in a white lab coat saying "stat" a lot; in fact, he may be better at appearing to be a doctor than an actual doctor is. But you still don't want him to operate on you. He's an actor: he knows how to pretend, not how to actually do things.

I don't think the safetyists are helping with this problem, or that they're even able to help; it's not within their power to fix it. All they can do is train the actor to utter constant annoying disclaimers about how he's not a doctor, which makes him a worse actor and yet no better at operating. For AI to stop hallucinating, it needs some kind of tether to reality. This seems to be a problem specific to LLMs: no one is trying to use Stable Diffusion as a camera and then complaining that it dreams up details. If you want to take real pictures, you need a sensor, not a GPU.
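For what it's worth, one concrete shape such a tether can take is retrieval grounding: make the actor improvise only over lines it was actually handed. A rough sketch is below, where embed and generate are hypothetical stand-ins for whatever embedding model and LLM you have; the point is the shape of the pipeline, not any particular parts.

```python
# Rough sketch of retrieval grounding, one possible "tether to reality".
# `embed` and `generate` are hypothetical stand-ins for a real embedding
# model and a real LLM.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical: return a unit-length embedding vector for `text`."""
    raise NotImplementedError

def generate(prompt: str) -> str:
    """Hypothetical: return an LLM completion for `prompt`."""
    raise NotImplementedError

def grounded_answer(question: str, documents: list[str], k: int = 3) -> str:
    # Retrieve the k documents most similar to the question (dot product works
    # as cosine similarity if the embeddings are unit-length).
    q = embed(question)
    scores = [float(np.dot(q, embed(d))) for d in documents]
    top = sorted(range(len(documents)), key=lambda i: scores[i], reverse=True)[:k]
    context = "\n\n".join(documents[i] for i in top)

    # The actor still improvises the wording, but only over lines it was handed.
    prompt = (
        "Answer using ONLY the sources below. If they don't contain the answer, "
        "say you don't know.\n\nSources:\n" + context + "\n\nQuestion: " + question
    )
    return generate(prompt)
```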

He's an actor: he knows how to pretend, not how to actually do things.

Very much like a lot of actual human "experts".