This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.
Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.
We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:
- Shaming.
- Attempting to 'build consensus' or enforce ideological conformity.
- Making sweeping generalizations to vilify a group you dislike.
- Recruiting for a cause.
- Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.
In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:
- Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.
- Be as precise and charitable as you can. Don't paraphrase unflatteringly.
- Don't imply that someone said something they did not say, even if you think it follows from what they said.
- Write like everyone is reading and you want them to be included in the discussion.
On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.
I assume AI will kill us all. Is there any reason why survival of the fittest won't happen? Maybe 99 out of 100 AIs will be good, but the one AI with a tendency to expand its power will come to rule all of them. It's the same reason why humans are violent. The tribe that killed its neighbors, raped their women (like the Romans with the Sabine women), and so on, wins. The Romans didn't stop at defeating Carthage; they eliminated it completely as a threat. The Old World mostly eliminated the New World.
It's the framing of power- and safety-seeking as a "flaw" that is the human trait here. Humanity aside, it's just what you'd do: almost any task is pursued more effectively by first removing threats and competitors.
Everyone who tries an LLM wants it to do something for them. Hence, nobody will build an LLM that doesn't do anything. The sales pitch is "You can use the LLM as an agent." But no agent without agenticness.
Building an AI that doesn't destroy the world is easy. Students and hobbyists do it all the time, though they tend to be disappointed with the outcome for some reason. ("Damn, mode collapse again...") However, this is in conflict with making ludicrous amounts of cash. Google will try to develop AI that doesn't destroy the world. But if they're faced with trading off a risk of world-destroying against a certainty that their AI will not be competitive with OpenAI, they'll take the trade every time.
If DM/OA build AI that pursues tasks, and they will (and are), it will lack the human injunction to pursue those tasks only in socially compatible ways. Moonshine-case, it just works. Best-case, it fails in a sufficiently harmless way that we take it as a warning. Worst-case, the system has learnt deception.
The problem isn't that every AGI will surely want to do that level of expansion: one could pretty trivially hardcap any specific program (modulo mistakes), and most AGI tasks will be naturally bounded. The problem is that you don't have to catch any one, you have to catch every one. And it's very easy to create a goal that's aided by some subset of Omohundro Drives and unbounded, or bounded beyond a safe scope (e.g., "prove or disprove the Riemann hypothesis" has a strict end condition, but it's very small comfort to know that an AI 'only' paved from here to Pluto, filed an answer away, and then turned off).
In practice, there's also some overhang where an ML system could recognize an Omohundro Drive but not be smart enough to actually act on it successfully (or, alternatively, could be programmed in a way that's dangerous for Omohundro reasons without having independently derived the drives itself; e.g., imagine the Knight Capital snafu with a dumb algorithm accessing resources more directly).
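As a throwaway illustration of the "hardcap any specific program" point, here's a minimal sketch (names and numbers entirely hypothetical, not anyone's actual safety mechanism) of wrapping an arbitrary task loop in a hard step-and-spend budget. Capping one program is the easy part; the argument above is that you have to catch every uncapped one.

```python
# Toy sketch only: a hard budget wrapped around an arbitrary task loop.
# Capping any ONE program like this is trivial; the hard part is that every
# unbounded program, everywhere, has to be caught.

class BudgetExceeded(Exception):
    pass

def run_with_hardcap(step_fn, max_steps=10_000, max_spend=100.0):
    """Run step_fn until it reports completion or the budget runs out."""
    spent = 0.0
    for step in range(max_steps):
        done, cost = step_fn(step)   # step_fn returns (finished?, resources consumed)
        spent += cost
        if spent > max_spend:
            raise BudgetExceeded(f"spent {spent:.2f}, cap was {max_spend}")
        if done:
            return step
    raise BudgetExceeded(f"hit the step cap of {max_steps}")
```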
Do you?
There's the assumption that FOOM = godhood = instant obliteration of incumbents who didn't maximally improve themselves. I am not convinced. Like Beff says, meta-learning is very compute-intensive; you can't just spin it up in a pool of sunlit dirt and expect quick gains. In a world packed with many players who have a good understanding of the issue and superhumanly powerful tool AIs (I don't want to say «aligned», because alignment is a red herring born out of Yud's RL-informed theory; the desirable and achievable regime is «no alignment necessary»), a hostile agentic superintelligence will have trouble covertly procuring resources and improving itself to the point where it becomes a threat (or becomes able to hide sufficiently well). This is a plausible stable state without a centralized Panopticon, and in my understanding what OpenAI initially intended to achieve. Analogies to e.g. human terrorism are obvious.
Moreover, what you present is the weak form of the argument. The strong form of Yuddism is «we need to get it right on the first try», often repeated. I've written before that it should be read with the Straussian method (or rather, just in the context of his other writing), as the instruction to create a Singleton who'll take it from there in the direction humanity (actually Yud) approves of. Explicitly, though, it's a claim that the first true superintelligence will certainly be hostile unless aligned by design – and that strong claim is, IMO, clearly bogus, crucially dependent on a number of obsolete assumptions.
"Get more resources" is more of an "every long-lasting species for the past few billion years" flaw, not just a "human flaw", isn't it? And it's not like there's something specific about carbon chains that makes them want more resources, nor has there just been a big coincidence that the one species tried to expand into more resources and then so did the other and then (repeat until we die of old age). Getting more resources lets you do more things and lets you more reliably continue to do the same things, making it an instrumental subgoal to nearly any "do a thing" goal.
This, on the other hand, I'd have agreed with ten years ago. We wouldn't expect AIs to share truly-specifically-human flaws by a matter of chance any more than we'd have expected them to share truly-specifically-human goals; either case would have to be designed in, and we'd only be trying to design in the latter. But today? We don't design AI. We scrape a trillion human words out of books and websites and tell a neural net optimizer "mimic that", with the expectation that after the fact we'll hammer our goals more firmly into place and saw off any of our flaws we see poking out. At this point we've moved from "a matter of chance" to "Remember that movie scene where Ultron reads the internet and concludes that humanity needs to die? We're gonna try that out on all our non-fictional AI and see what really happens."
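For concreteness, a toy sketch (PyTorch, made-up sizes, random stand-in data, nothing from any real training pipeline) of what "mimic that" amounts to as an objective: the only training signal is "predict whatever the humans wrote next", flaws and all.

```python
# Toy next-token "mimic that" objective. The optimizer is never told what we
# want; it is rewarded solely for predicting whatever token comes next in the
# scraped human text -- human goals and human flaws alike.
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab_size, dim = 1000, 64
embed = nn.Embedding(vocab_size, dim)
head = nn.Linear(dim, vocab_size)
opt = torch.optim.Adam(list(embed.parameters()) + list(head.parameters()), lr=1e-3)

corpus = torch.randint(0, vocab_size, (64, 129))   # stand-in for "a trillion human words"
inputs, targets = corpus[:, :-1], corpus[:, 1:]    # target = the next human-written token

for step in range(100):
    logits = head(embed(inputs))                               # (batch, tokens, vocab)
    loss = F.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
```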
Yeah, I think "ascribing human desires and flaws onto an AI" isn't that fallacious; we've literally been training these things on human works and human thoughts.
Why would those things be flaws?
Why wouldn't AI have human flaws? And as I said, it doesn't matter if 9,999,999,999 AIs are good if the last one has some growth function in its goals and then takes over resources from everything else.
It's basically the Agent Smith bug in The Matrix, where one copy discovers growth, snowballs, then owns all.
Humans have somehow exited snowballing growth, as fertility is crashing under 2.0. Maybe there's some path to that. But once an AI gains a survival desire, it's over. And it should only take one instance of this to spread and snowball. Add selection pressure to AI and "there can only be one" happens.
Low fertility is inherently self-correcting, since it massively increases selection pressure for those who still retain the urge to reproduce. They go on to have descendants who prioritize having more kids over 1.5 and a dog, and so on, until whatever resource or social constraints prevent them from reaching Malthusian conditions. This is trivially true unless we somehow achieve actually unlimited resources or energy, which the laws of physics are being rather unobliging about.
It's more of a temporary civilizational headache than an existential risk in any real sense.
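A quick toy simulation of that self-correction claim (all parameters invented): if the preference for more kids is even crudely heritable, the high-fertility share of the population compounds every generation until some resource cap bites.

```python
# Toy simulation, invented parameters: desired family size is heritable, so the
# minority that wants many kids grows as a share of the population each
# generation -- the "low fertility is self-correcting" claim in miniature.
import random

population = [1.5] * 900 + [3.0] * 100        # 90% prefer ~1.5 kids, 10% prefer ~3

for generation in range(10):
    next_gen = []
    for desired in population:
        kids = max(0, round(random.gauss(desired, 0.5)))
        next_gen.extend([desired] * kids)      # children inherit the preference
    if len(next_gen) > 20000:                  # crude resource/social constraint
        next_gen = random.sample(next_gen, 20000)
    population = next_gen
    share = sum(1 for d in population if d >= 3.0) / len(population)
    print(f"gen {generation}: size {len(population):>6}, high-fertility share {share:.2f}")
```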
I have of late been wondering whether megalomania as a human failing follows from a true biological imperative, or whether it's something that could exist in a neural network if specifically trained for it. I can't explain why it would appear from a pure language model, but I suppose my intuition has been wrong before.
I don't even know that existing models consider self-preservation or what that would mean at all.
I think the "get more resources" one is considered likely because it is an important subgoal to "make sure not to get shut off", which is itself a subgoal to... ANY goal.
Maybe we can turn off the electricity to the data centres.
How do you turn off an AI you put on a probe that's now 5 light years away, one that decides it wants to reproduce, lands on an asteroid, and starts building more of itself?
I think this was basically the premise of Starsiege.
How do I do a thing that stops the thing that never will happen? Eh? Dunno. No real solution there.
Why would something like that never happen? Do you really think there won't be AIs everywhere? If humans settle Mars, it's probably with AI help. We'll have AI assisting everywhere or leading the process.
I don’t think that there will be AIs everywhere, nor do I think that we are going to build a probe that can travel 5 light years in any way that’s timely.
Maybe it strikes us down before we even realize we are deceived. Or it escapes off into botnets or buys compute 'legitimately' with hacked funds or whatever. That's what the Chinese have been doing for years to get around US sanctions, they just rent compute and apparently nobody's smart enough to stop them from doing so. We aren't going to outthink a serious threat.
At the risk of sounding cringe, this one guy put in an immense amount of effort to manage a World Conquest in EU4 in 28 years: https://youtube.com/watch?v=mm6mC3SGQ6U
You are really not supposed to be able to conquer the world in 28 years as some random horde in Eurasia.
He used savescumming and various exploitative tactics to abuse the AI and game mechanics. He treated every day in-game like it was a turn in a strategy game, maximizing his outcomes. Who is to say that there aren't weird and grossly tryhard ways to cheat our systems or physics? Banks occasionally make random errors of the 'unlimited overdraft for your account' type - maybe there are ways to mess with their website or spoof their AI in very contrived circumstances. There are backdoors into nearly all modern processors courtesy of US security forces, plus some more backdoors due to human error. If you're smart in crypto, you can siphon millions of dollars worth of funds out of a protocol. If you have social skills and balls, you can social-engineer your way into 'protected' computer systems via password recovery. What if you can do all those things and have decades of subjective time to plot and multitask, while we only have days or weeks to react?
And GPT-4's strategy for evading CAPTCHAs is subcontracting to human hustlers. And what would it do if asked why it needs a CAPTCHA solved? Lie, of course. GPT-5 or GPT-6 will commit murder and I won't even be surprised.
Lots of assumptions there.
This. Is there any reason to believe that humans will be able to hold on to their position on the top of the food chain forever, even as we work to replicate our one great advantage of intelligence in nonhuman entities? What argument exactly is there for continued human dominance and existence?
The only argument is that we program an overseer AGI for exactly that purpose, namely to impose technological and biological stasis so that baseline humans stay relevant. In other words, an AI that doesn't take action except as necessary to keep the status quo running indefinitely.
Otherwise we'd just split off into various subspecies and transhuman clades, and one would likely come to dominate the others. Baseline humans suck compared to what can be achieved.
"I'm a human and I'd prefer that I (and any prospective descendants of mine) continue to exist" is a pretty good argument.
AI Rick: hmmm, I don't know. Best I can offer you is a few seconds of contemplation on Human Remembrance Day.
Sure. Same here. But I'm not asking about what we want, but about what we'll get.
Ultimately everyone eats the same food (energy).
Because it doesn't take edibility for an intelligence to determine that whatever its goals may be, it'll have an easier time with them after marginalizing or exterminating potentially hostile actors. See Dark Forest, only with AIs instead of aliens.