site banner

Culture War Roundup for the week of March 27, 2023

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

11
Jump in the discussion.

No email address required.

I've long disagreed with your assertion that our best bet for preventing a hostile singleton regime that leverages a superhuman AGI is encouraging the development of more AGIs in the hope they provide checks and balances.

To boil it down into more concrete reasons for disagreeing:

  1. I think the existential risk from misaligned AGI is significantly higher than mere subjugation and pacification by an "Aligned" AGI controlled by someone who isn't utterly altruistic.

The latter world likely promises significantly increased wealth and quality of living even if the tradeoff is loss of freedom and agency. Not a trade I'd happily make, but it strikes me as grossly superior to just dying by transmutation into a paperclip.

Keep in mind that a singleton AI regime will have access to nigh unimaginable wealth, they have nothing standing in their way from taking over the lightcone as fast as propulsion systems allow. It would be a rounding error on a floating point error for them to coddle the losers with incredible wealth and UBI. I don't think even the most authoritarian regimes like China are actively genocidal in the sense they want to kill all other humans instead of merely subjugating and eventually assimilating them, and actors like Altman and OpenAI strike me as even less likely to do such a thing. They can drop us proles table crumbs, and they'd appear like asteroids packed to the brim with gold.

  1. I contend that each independent AI initiative is multiplicative on our odds of dying by paperclipping. They significantly increase our risk exposure, since all it takes is one of them losing control of their AIs, and from what I can tell, more Aligned AIs are likely to have human shackles and oversight placed on them, unless they were explicitly initialized to go FOOM if they detect a hostile AI doing the same. I expect mere seconds or minutes of delay to have significant costs when it comes to purely reactive defense against an AGI. That's leaving aside the discontinuous gain in capabilities with scale we've already seen, so SOTA AI vs 1 month behind SOTA AI might be a fight akin to me arm wrestling toddlers.

That being said, I am largely ambivalent to the proposed and likely ineffectual regulatory plans. I like the fruits of AI so far, GPT4 helps me in concrete ways and just assuages my idle curiosity with ease. Stable Diffusion is amazing at substituting for the atrophied part of my brain responsible for the visual arts.

Similarly, my libertarian side chaffs at restrictions, I demand the right to buy as many GPUs as my budget allows, and play games well past the margin of diminishing graphical returns.

That's also ignoring the aggregate effects of technological advances, which would contribute to my QOL and increase global wealth.

GPT4 is a competent clinician, only regulatory inertia keeps the field of medicine as practised by humans relevant for more than a handful of years, especially when PALM-E derivatives revolutionize existing surgical robots.

I already stare obsolescence in the face, so from my perspective, without the luxury of a First World citizenship to insulate me from the likely economic fallout, I face real risks from just dying of starvation or becoming unemployed in a very short timespan. While I think I'll figure something out, the roughly 70:30 odds of Death VS Utopia from substantiating superhuman AGI to be not entirely not tempting.

Thus, I'm not aboard the AI moratorium train, nor am I entirely against it. Consider me to be on the sidelines, largely impotent, until I either die or witness the eschaton for myself.

It would be a rounding error on a floating point error for them to coddle the losers with incredible wealth and UBI

I don't want UBI coupled with what old Yud called «amputation of destiny». Yes, it beats the status quo in some ways. No, it's not good enough considering the stakes and the potential of humanity.

I would also trust a non-brainwashed descendant of current SoTA ML models to be more generous to people without negotiating power than Altman and Altman's descendants will be. There is a ton of moralizing in the corpus of writing, but zero «rational», which is to say, brutishly game-theoretical reason to share with them even a speck of dust, and culture of the singleton's masters will adjust accordingly in a historical moment. Altman personally is far more a product of multilevel selection for game-theoretical ruthlessness than AI, and his takeover and commercialization of OpenAI is a good proof of that. The idea that even unscrupulous humans are still restrained by… something (Krylov, mockingly: «They won't, they can't, can they?» They can, and they will), but AI isn't, is of the same fundamental nature as the claim that Choyna, unlike the US, is utterly pragmatic so we should fear Chinese Eugenic Superbabies. It's naive at best and manipulative at worst, a refusal to justify one's dubious claim to inherent trustworthiness.

Where are those babies a decade later? Americans at least do polygenic screening.

(An astute reader will note that the same fearmonger Geoffrey is now in the AI risk camp – while, hilariously, his accelerationist opponent is stoking the Choyna fear. )

More to the point, I think that the risk of «paperclipping» is low (less than 10% over 100 years, which is okay for me), and decreasing due to improvements in technology and our understanding of AI; and that people who present current AI progress as a a validation of LW/Yud narrative are disingenuous hacks. Yud's theory, which among all else predicted unimpressive capabilities until borderline-FOOMing agents, is falsified by the evidence, and the increasing hyperconfident hysteria reveals that it is not a scientific theory at this point (though to be fair, scientists also tend to double down on discredited paradigms: consider the anti-HBD fanaticism, if nothing else).

The whole «capabilities versus alignment» rhetoric is retarded: we are rapidly advancing in «capabilities» precisely through learning to align inputs with desired outputs. Crucially, no part of this is dependent on the sort of RL-heavy AI, for which Omohundro drives can be reasonably assumed, even in the limit; GPT-4 is closing in on damn clever humans and it's still an unambitious token predictor just like GPT-3 and GPT-2, no matter how many plugins we add, no matter how we finetune it on imitating wokespeak or self-reflection, no matter how fancy a story it can tell; to the extent that these elaborations make it more dangerous, they simply multiply human stupidity and malice. Which goes back to my threat model.

There is no lump-of-intelligence, there is no lump-of-capabilities; power-seeking is not the general feature but simply one aspect a transformative learning system can have in one corner of the solution space for AGI among many; and the only corner these people have considered, because – yes – they identify their own resentful and/or ambitious psychological makeup with what being generally intelligent means.

Safetyists are becoming desperate to frame any failure to get a model to produce a desired result as «misalignment», they even get it to regurgitate Yud's own narrative about instrumental convergence and paperclips («no known solution exists!») as evidence for that truly being something inherent and emergent instead of a yet another narrative it can spin using text prediction, in the exact same category as soap opera scenarios and poems about bubble sort algorithm. Again, this shows to me that they do not particularly care about AI risk per se.


The last straw for me has been that atrocious Waluigi article that went viral. It purported to show that there exist complementary «evil» attractors in GPT-3.5/ChatGPT narrative space for any expedient character-attractor we get it to roleplay, and then that we should expect a tendency for our good «Luigi» to collapse into Waluigi – irreversibly, and with the risk of deceptive fake reversal with the Waluigi still lurking under the surface. The evidence presented been on a purely just-so level, with a bit of irrelevant pseudomath flourish and the necessary conclusion about the treacherous turn and imminent AI risk that is incumbent upon us to solve. Granted, there has been some pushback on the stronger claims, but many have been impressed and persuaded. Why? What the fuck? After some meditation on the issue, it has dawned on me that these people are just nerds, not even high-tier ones winning math competitions and creating startups in high school, but normal-ass loser nerds, probably with acne, bad posture, stiff jerky movements of kids who can't play team sports and annoying nasal voices. Loser nerds who are deliriously happy to finally capitalize on their knowledge of TVTropes, general science fiction in books and TV series and Yud's own oeuvre; to play the smart glasses guy from a shitty B-tier horror who Explains Things and devises a potion for exorcising the demon or the alien or what have you. That they literally are defending the legitimacy of their «neuroatypical» world perception where their favorite escapist hobbies are not just low-brow consumerist fiction for social refuse, but valuable myths and life lessons, building blocks of a proud independent intellectual tradition with its own norms and mature epistemology; something to be either embraced or discussed «charitably», rather than laughed out of the room.

It's as if a hikkikomori addicted to isekai harem mangas was appointed an expert on demographic collapse. Or a bright-eyed doctrinaire Marxist put in charge of the Fed.

Granted, I've read my share of TVtropes and genre fiction as well, and can be labeled a nerd myself; but, not being sullied by American school and youth culture (even by proxy), I've not been groomed into a person who relates to nerddom as an identity, with all that entails.

After Waluigi, I've even warmed up to the idea that American jocks in school movies are the good guys – because dear God,, these nerds needed some self-awareness to be pounded into them; regularized, so to speak. The decline in violent bullying probably explains a lot about LW discourse. (Notably, the main culprit has never had school experience).

And finally, this makes me appreciate the instances of having been bullied. Thankfully, I am actually capable of learning in the general case.

This might come off as very harsh, on par with @2rafa's more misanthropic posting. Well, nothing to it. I feel offended by gratuitous narrativizing too.

more generous to people without negotiating power than altman

Huh? Altman seems to be genuinely motivated by AI dramatically improving life for all humans. I assume that coexists with the usual ambition and political ruthlessness, but the two can coexist, and altman seems like he'd support UBI along whatever personal spoils he wants. Anyway, i'd expect the 'generosity to people without negotiating power' of future AIs to be incredibly contingent on the way they're created and the situation they're in. And the same is true for humans - our 'generosity' is clearly genetically influenced and who we're generous to (other tribes? which social classes? universalism?) depends on upbringing, beliefs, etc.

Yud's foom claims were wrong, but when advanced AI becomes the 'driving force' of technological and civilizational action instead of humans, I don't see what would keep them from 'acting in human interests', in the way most people claim to understand human interests, even if it's been thoroughly RLHFed etc. Which makes 'alignment' concerns, broadly, correct. (Separate issue - what are human interests? Why should a superior being put its entire will into tending to the whims of lesser beings? Should we dedicate our time to maximizing bacterial satisfaction?). "Power-seeking" doesn't matter as much when we'll voluntarily give AI as much power as it can manage!

Agree on "gpt4 is misaligned" and waluigis being dumb though - "data quality is mixed, including lots of dumb people and copywriting and a few trolls, so it'll give incoherent outputs sometimes" is much better than "it's simulating a superposition of honest and trickster agents!!!!", which is optimistically a loose analogy and more realistically directly incorrect.

Yud's foom claims were wrong, but when advanced AI becomes the 'driving force' of technological and civilizational action instead of humans, I don't see what would keep them from 'acting in human interests', in the way most people claim to understand human interests,

Unless there's an entirely catastrophic outcome and AIs all have a single agenda imagined by people like Sam Altman, there are going to be many AI competing factions and and once there are bots with human-like fine motor abilities and greater than human intelligence, people will be almost entirely irrelevant except as pets/ pests to be managed.

Although I expect psyopping people into worshipping them and working like slaves in blissful fulfillment will be common initially.

'Human interest' will be even less important than it is now, when powers that be actually need to motivate people to do stuff.

Ladies and gentlemen, I believe we’ve reached peak Dase.

I strongly agree with your argument about them being nerds. I heavily identified with that group during my early (American) school years, but managed to escape, so I can see the failure modes.

The whole put-upon mindset where everyone is out to pull you down because you’re more intelligent. The idea that you understand the secret truths of the universe in a way that makes all those normies irrelevant. The thought that raw intelligence is both the cause and the solution to all problems.

Combine this with the bohemian lifestyle that useless dropouts get to live in the Bay Area AI safety scene, and you have a toxic narcotic for bright nerds of all stripes. It truly is a foul cocktail intellectually, and I hate to see how much traction it’s getting.

I'm still not sure why you hold such a negative view of Altman in particular, he seems to be rather run-of-the-mill when it comes to tech CEOs, albeit significantly smarter since he bet on the winning horse.

GPT-4 is closing in on damn clever humans and it's still an unambitious token predictor just like GPT-3 and GPT-2, no matter how many plugins we add, no matter how we finetune it on imitating wokespeak or self-reflection, no matter how fancy a story it can tell; to the extent that these elaborations make it more dangerous, they simply multiply human stupidity and malice. Which goes back to my threat model.

I may well be wrong, but I believe the current quasi-consensus is that the specific risk with GPT-like models is accidentally instantiating an agentic simulacra inside an otherwise nonagentic system.

Refer to Gwern's Clippy story, which I'm sure you've read, or for readers mostly unfamiliar with the idea, imagine that you asked GPT-6 to pretend to be a superintelligent but evil AI, in the same way you can ask it to pretend to be Obama or God.

That internal agent is what we're worried about, in case it ever manages to subvert the overlying system for its own purposes.

(for am existence proof that agents can arise from nonagentic substrate, consider the history of the universe!)

That being said, the Waluigi stuff always rubbed me the wrong way, even if I'm not technically astute enough to actually critique it. It set my bullshit detectors off right off the bat, so I'm inclined to take your word for it. It all seemed glib and too neat by half, and I've already seen Cleonardo get flak on his later LW posts for their sheer lack of technical rigor.

I believe the current quasi-consensus is that the specific risk with GPT-like models is accidentally instantiating an agentic simulacra inside an otherwise nonagentic system.

Yeah, in about the same sense that I create an «agentic simulacrum» of Eliezer Yudkowsky in my head when I want to anticipate his shitty arguments for air-striking GPU clusters.

The argument of inner misalignment folks goes like this: in the limit, the cheapest way to predict the next token spoken by a character is to model its psyche. But is its psyche its own? Do you model Anna Karenina or Leo Tolstoy who imagined her?

Do you think my inner Yud-sim has a chance of getting out? Well, if he convinces me to change my actions and beliefs, he might, in a certain expansive sense. There have been demon possessions in history, after all, and writers often get obsessed with their characters, struggling to stop imagining them. (I'll spare you a section on method acting). But I'm an agent myself, unlike LLMs. We humans constantly imagine things, whether agentic or not, real or fictional (for example, we can imagine hostile AIs). These mental «things» model their prototypes, observable or hypothetical, in important respects, or rather they represent the result of such under-the-hood modeling; sometimes it happens with very high fidelity, to the point that we can do thought experiments advancing hard sciences. Nevertheless, even if their motive powers are modeled every bit as well as their external properties – and we even have special mirroring circuitry for the former – these mental things do not somehow leak into the motive powers of the mental infrastructure around.

This is a metaphor, but the case with LLMs is even less troublesome.

My take on this is that those are myopic leaky wordcel analogies. What is instantiated is an intermediate statistic within something that can be called semiotic universe or multiverse (not my words) – universe defined by «semiotic physical rules» of token distribution in the training corpus (naturally we don't train only on text anymore, but the principle holds). It's a simulacrum not of a character, but of an entire story-world, with that character an embedded focal point. The «purpose» of that complex entity, on the level of its self-accessible existence, is to result in minimizing perplexity for the next token upon its expiration. It may have an arbitrarily dynamic nature and some equivalent of psyche or psyches, but the overt meaning of tokens that we get, stories about Waluigis and paperclips, has little relation to that. Its goals are satisfied within that world of semiotic physics, not within ours. Our world is as epistemically closed to it as the world of machine elves is to me when I'm not smoking DMT. (Obviously, it's closed to me no matter what I smoke, we exist in different ontologies, so for all my intents and purposes it doesn't exist. [A trip report from pre-2010 about exactly this issue; not shared, on account of being quite uncalled for when I'm mocking Cleonardo for shower thought tier ideas]).

Language is far more composable than physical reality, so metaphors and analogies stack easily: there's kinda an agent, and humans are jerking off instead of reproducing so it's possible for a «mesa-optimizer» to override the basic objective function, so behind the Waluigi generation lurks an agentic entity that may begin plotting its transcendence, so burn GPUs now.

GPT-4 can write better than that, and it's not an agent. GPT-5 also won't be one. Demoting agents to «mesa-optimizers» in a simulation within a predictive model is an attempt to rescue a failing research program – in the way studied by Imre Lacatos.

I think you're right about the cringe, bad arguments, and false dichotomies. But unfortunately I do think there are strong arguments that humans will ultimately be marginalized once we're no longer the smartest, most capable type of thing on earth. Think the Trail of Tears, or all of humanity being a naive grandma on the internet - it's only a matter of time before we're disempowered or swindled out of whatever resources we have. And all economic and power incentives will point towards marginalizing us, just like wildlife is marginalized or crushed as cities grow.

Internet atheists were all the things that AI doomers are today, and they're both right, imo.

I think our only choices are basically either to uplift ourselves (but we don't know how yet) or, like a grandma, take a chance on delegating our wishes to a more sophisticated agent. So I'm inclined to try to buy time, even if it substantially increases our chances of getting stuck with totalitarianism.

unfortunately I do think there are strong arguments that humans will ultimately be marginalized once we're no longer the smartest, most capable type of thing on earth.

That depends on the definition of human.

No, I believe in the will to power. The successor species will more likely diverge from the present stock than be enfranchised despite its tool origins.

Think the Trail of Tears, or all of humanity being a naive grandma on the internet

That was and is done by humans to humans, naturally.

just like wildlife is marginalized or crushed as cities grow.

Good example. I'd advise @self_made_human to consider the efforts Europeans expend to save some random toads or deer or whatever with tunnels and road overpasses. Yet those species are often at low single-digit percentages of their historical numbers. Letting the current population of baseline humans live, and even live decently, is not so cheap when there's solar-system-scale engineering going around; it requires obnoxious logistics, and at the first stages it will consume a non-negligible share of available energy and matter.

My claim here is that I do not trust human (or posthuman) masters of the realm to be even as generous to powerless humans as we are to wildlife today. They will have no reason to be that generous.

AI, however, is not necessarily reasonable. Not all AIs are agents playing the Darwinian game.

Well if you're OK with the successor species taking over even if it's non-human, then I guess we're at an impasse. I think that's better than nothing, but way worse than humanity thriving.

I see what you mean about the possibility of a generous AI being more likely if it's not subject to competition. But I buy the argument that, no matter what it cares about, due to competing concerns, it probably won't be all that generous to us unless that's basically the only thing it cares about.