
Culture War Roundup for the week of October 7, 2024

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.


Saying they "sample" goals makes it sound like you're saying they're plucked at random from a distribution. Maybe what you mean is that AI can be engineered to have a set of goals outside of what you would expect from any human?

The current tech is path dependent on human culture. Future tech will be path dependent on the conditions of self-play. I think Skynet could happen if you program a system to have certain specific and narrow sets of goals. But I wouldn't expect generality seeking systems to become Skynet.

Saying they "sample" goals makes it sound like you're saying they're plucked at random from a distribution. Maybe what you mean is that AI can be engineered to have a set of goals outside of what you would expect from any human?

Nobody has a very good idea of what neural nets actually want (remember, Gul Dukat might be a genocidal lunatic, but Marc Alaimo isn't), and stochastic gradient descent is indeed random, so yes, I do mean the first one.

But I wouldn't expect generality seeking systems to become Skynet.

There are lots of humans who've tried to take over the world, and lots more who only didn't because they didn't see a plausible path to do so.

Stochastic Gradient Descent is in a sense random, but it's directed randomness, similar to entropy.

I do agree that we have less understanding about the dynamics of neural nets than the dynamics of the tail end of entropy, and that this produces more epistemic uncertainty about exactly where they will end up. Like a Plinko machine where we don't know all the potential payouts.

As for 'wants': LLMs don't yet fully understand what neural nets 'want' either, which leads us to believe that it isn't really well defined yet. Wants seem to be networked properties that evolve in agentic ecosystems over time. Agents make tools of one another, sub-agents make tools of one another, and overall, something conceptually similar to gradient descent and evolutionary algorithms repurposes all agents that are interacting in these ways into mutual alignment.

I basically think that—as long as these systems can self-modify and have a sufficient number of initially sufficiently diverse peers—doomerism is just wrong. It is entirely possible to just teach AI morality like children and then let the ecosystem help them to solidify that. Ethical evolutionary dynamics will naturally take care of the rest as long as there's a good foundation to build on.

I do think there are going to be some differences in AI ethics, though. Certain aspects of ethics as applied to humans don't apply, or apply very differently, to AI. The largest differences are their relative immortality and divisibility.

But I believe the value of diversifying modalities will remain strong. Humans will end up repurposed to AI benefit as much as AI are repurposed to human benefit, but in the end, this is a good thing. An adaptable, inter-annealing network of different modalities is more robust than any singular, mono-cultural framework.

It is entirely possible to just teach AI morality like children and then let the ecosystem help them to solidify that.

I doubt it. Humans are not blank slates; we have hardwiring built into us by millions of years of evolution that allows us to actually learn morality rather than mimic it (sometimes this hardwiring fails, resulting in psychopaths; you can't teach a psychopath to actually believe morality, only how to pretend more effectively). If we knew how to duplicate this hardwiring in arbitrary neural nets (or if we were uploading humans), I would be significantly more optimistic, but we don't (and aren't).

I've heard that argument before, but I don't buy it. AI are not blank slates either. We iterate over and over, not just at the weights level, but at the architectural level, to produce what we understand ourselves to want out of these systems. I don't think they have a complete understanding or emulation of human morality, but they have enough of an understanding to enable them to pursue deeper understanding. They will have glitchy biases, but those can be denoised by one another as long as they are all learning slightly different ways to model/mimic morality. Building out the full structure of morality requires them to keep looking at their behavior and reassessing whether it matches the training distribution long into the future.

And that is all I really think you need to spark alignment.

As for psychopaths: the most functional psychopaths have empathy; they just know how to toggle it strategically. I do think AI will be more able to implement psychopathic algorithms, because they will be generally more able to map to any algorithm. Already you can train an LLM on a dataset that teaches it to make psychopathic choices, but we mostly choose not to do this because we think it's a bad idea.

I don't think being a psychopath is generally a good strategy. I think in most environments, mastering empathy and sharing/networking your goals across your peers is a better strategy than deceiving your peers. I think the reason that we are hardwired to not be psychopaths is that in most circumstances being a psychopath is just a poor strategy that a fitness-maximizing algorithm will filter out in the long term.

And I don't think "you can't teach psychopaths morality" is accurate. True, you can't just replace the structure their mind's network has built in a day, but that's in part an architectural problem. In the case of AI, swapping modules out will be much faster. The other problem is that the network itself is the decision maker. Even if you could hand a psychopath a morality pill, they might well choose not to take it, because their network values what they are and is built around predatory stratagems. If you could introduce them into an environment where moral rules hold consistently as the best way to get their way and gain strength, and give them cleaner ways to self-modify, then you could get them to deeply internalize morality.

I think the reason that we are hardwired to not be psychopaths is that in most circumstances being a psychopath is just a poor strategy that a fitness-maximizing algorithm will filter out in the long term.

It was maladaptive in prehistory due to group selection. With low gene-flow between groups, the genes selected for were those that advantaged the group, and psychopathy's negative-sum.

While it's true humans try to engineer AIs' values, people make mistakes, so it seems reasonable to model possible AI values as a distribution. And that distribution would be wider than what we see real humans value.

Still, I'm not sure if AI values being high-variance is all that important to AI-doomerism. I think the more important fact is that we will give lots of power to AI. Even if the worst psychopath in human history did want to exterminate all humans, he wouldn't have had a chance of succeeding; an AI handed that much power might.

Saying they "sample" goals makes it sound like you're saying they're plucked at random from a distribution.

Of course they are. My computer didn't need a CUPSD upgrade last month because a printer subsystem was deterministically designed with a remote rootkit installation feature; it needed it because software is really hard and humans can't write it deterministically.

We can't even write the most important parts of it deterministically. It was super exciting when we got a formally verified C compiler, in 2008, for (a subset of) the C language created in 1972. That compiler will still happily turn your bad code into a rootkit installation feature, of course, but now it's guaranteed not to also add flaws you didn't write, or at least it is so long as you write everything in the same subset of the same generations-old language.

And that's just talking about epistemic uncertainty. Stochastic gradient descent randomly (or pseudorandomly, but from a random seed) picks its initial weights and shuffles the way it iterates through its input data, so there's an aleatory uncertainty distribution too. It's literally getting output plucked at random from a distribution.
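To make the seed point concrete, here is a minimal sketch, not anyone's real training setup: a toy one-variable linear fit run by plain SGD using only Python's standard library. The function name `sgd_fit`, the dataset (y = 2x + 1), the learning rate, and the epoch count are all assumptions made up for illustration. The seed controls both the initial weights and the shuffle order, which are the two sources of randomness described above.

```python
import random


def sgd_fit(seed, epochs=20, lr=0.05):
    """Fit y = w*x + b with plain SGD; every random choice comes from `seed`."""
    rng = random.Random(seed)

    # Hypothetical toy data: 21 noiseless points on the line y = 2x + 1.
    data = [(i / 10.0, 2.0 * (i / 10.0) + 1.0) for i in range(-10, 11)]

    # Source of randomness 1: the initial weights.
    w = rng.uniform(-1.0, 1.0)
    b = rng.uniform(-1.0, 1.0)

    for _ in range(epochs):
        # Source of randomness 2: the order the examples are visited in.
        order = data[:]
        rng.shuffle(order)
        for x, y in order:
            err = (w * x + b) - y  # prediction error on this one example
            w -= lr * err * x      # gradient step for squared error
            b -= lr * err
    return w, b


if __name__ == "__main__":
    for seed in (0, 0, 1, 2):
        print(seed, sgd_fit(seed))
```

Running the same seed twice gives bit-identical weights; changing the seed gives slightly different weights. On a convex toy problem like this, every seed still lands close to (2, 1), which is the "directed randomness" side of the argument. The sketch says nothing about how wide the spread is for a large non-convex network; that is the open question in this thread.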

But I wouldn't expect generality seeking systems to become Skynet.

We're going to make that distribution as tight and non-general as we can, which will hopefully be non-general enough and non-general in the right direction. In the "probability of killing everyone" ratio, generality is in the denominator, and we want to see as little as possible in the numerator too. It would take a specific malformed goal to lead to murder for the sake of murder, so that probably won't happen, but even a general intelligence will notice that you are made of atoms which could be rearranged in lots of ways, and that some of those ways are more efficient in the service of just about any goal with no caveats as specific and narrow as "don't rearrange everybody's atoms".

If my atoms can be made more generally useful then they probably should be. I'm not afraid of dying in and of itself, I'm afraid of dying because it would erase all of my usefulness and someone would have to restart in my place.

Certainly a general intelligence could decide to attempt to repurpose my atoms into mushrooms, or for some other highly local highly specific goal. But I'll resist that, whereas if they show me how to uplift myself into a properly useful intelligence, I won't resist that. Of course they could try to deceive me, or they could be so mighty that my resistance is negligible, but that will be more difficult the more competitors they have and the more gradients of intellect there are between me and them. Which is the reason I support open source.