
Culture War Roundup for the week of March 27, 2023

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.


Now that OpenAI has admitted AI safety into the mainstream, AI safetyists have naturally accepted the invitation.

The Future of Life Institute has published an open letter calling for a pause on «Giant AI experiments» (Archive). Their arguments are what one should expect by this point. Their prescriptions are as follows:

Contemporary AI systems are now becoming human-competitive at general tasks,[3] and we must ask ourselves: Should we let machines flood our information channels with propaganda and untruth? Should we automate away all the jobs, including the fulfilling ones? Should we develop nonhuman minds that might eventually outnumber, outsmart, obsolete and replace us? Should we risk loss of control of our civilization? Such decisions must not be delegated to unelected tech leaders. Powerful AI systems should be developed only once we are confident that their effects will be positive and their risks will be manageable. This confidence must be well justified and increase with the magnitude of a system's potential effects. OpenAI's recent statement regarding artificial general intelligence, states that "At some point, it may be important to get independent review before starting to train future systems, and for the most advanced efforts to agree to limit the rate of growth of compute used for creating new models." We agree. That point is now.

Therefore, we call on all AI labs to immediately pause for at least 6 months the training of AI systems more powerful than GPT-4. This pause should be public and verifiable, and include all key actors. If such a pause cannot be enacted quickly, governments should step in and institute a moratorium.

AI labs and independent experts should use this pause to jointly develop and implement a set of shared safety protocols for advanced AI design and development that are rigorously audited and overseen by independent outside experts. These protocols should ensure that systems adhering to them are safe beyond a reasonable doubt.[4] This does not mean a pause on AI development in general, merely a stepping back from the dangerous race to ever-larger unpredictable black-box models with emergent capabilities.

AI research and development should be refocused on making today's powerful, state-of-the-art systems more accurate, safe, interpretable, transparent, robust, aligned, trustworthy, and loyal.

In parallel, AI developers must work with policymakers to dramatically accelerate development of robust AI governance systems. These should at a minimum include: new and capable regulatory authorities dedicated to AI; oversight and tracking of highly capable AI systems and large pools of computational capability; provenance and watermarking systems to help distinguish real from synthetic and to track model leaks; a robust auditing and certification ecosystem; liability for AI-caused harm; robust public funding for technical AI safety research; and well-resourced institutions for coping with the dramatic economic and political disruptions (especially to democracy) that AI will cause.

Do we control our civilization? Maybe the folks at FHI do, I sure don't. Well, anyway…

Signatories (over 1000 in total) include Elon Musk, Steve Wozniak, Yuval Noah Harari, Yoshua Bengio, Connor Leahy, Stuart Russell, Andrew Yang, Emad Mostaque, Max Tegmark, Gary Marcus, Steve Omohundro, Matt Mahoney, Christof Koch, Sam Altman *, LessWrong disciples embedded in DeepMind/Meta, and various NGO/«policy» suits. Bolded are people who are reasonably well positioned and incentivized to, in fact, organize and authorize training «AI systems more powerful than GPT-4» in the next few months, though, except for Altman, they all only barely qualify; actual GPT-5 is believed to already be in training and is, or was, planned to come out in late 2023.

Curiously absent – for now – are Yann LeCun, Jeff Dean, Demis Hassabis, John Carmack, and a few others. LeCun, at least, has committed not to sign. Here's hoping he won't find a horse's head in his sheets or something.

I do not have much of a comment at the moment. My perspective is that I despise people overly concerned with «Moloch» and want as many competitive superhuman AIs as possible, so on one hand, slowing down and enabling the state to catch up and subjugate this tech for its purposes is a very bad, yet highly expected and perhaps inevitable, outcome of this race. This attitude is born out of desperation; in principle, their «AI Summer» option, where we increase capabilities over many years, getting the equivalent of 20th century civilizational shift in a decade instead of an explosive singularity, is not bad at all; I just don't believe in it.

On the other: seeing as nobody is closer to GPT-5 than OpenAI themselves (excepting DeepMind with Gato-2 or something better, as Gwern worries), it could be beneficial for our long-term outcomes to equalize the board somewhat, giving China more of a chance too. Geopolitics dictates that this should preclude the possibility of this policy being pursued in earnest, but really China is so colossally outmatched in AI, so well and truly fucked by technological restrictions, and mired in such problems and such gratuitous stupidity of its own policymakers, that it may not be a factor in either case.

I must go, so that's all from me; hopefully this is enough to pass the «effort» bar required by the mods and prompt some discussion.


In happier news, arguably the most powerful open-source chatbot today is LLaMA-7B with a transfusion of ChatGPT 3.5-Turbo quirks, (not very) creatively called GPT4all. It's far beyond basic Alpaca (already an attempt to extract OpenAI's magic) and absurdly good for what it is: a 4.21 GB file of lossily compressed 7 billion weights trained… well, the way it's been trained – the AI equivalent of a movie camrip superimposed on the general web dump; the worst part of it is that it genuinely apes ChatGPT's politics and RLHF-d sanctimonious «personality» despite being 25 times smaller and probably 10 times dumber. It runs happily on very modest computers, and – unlike Alpaca – not only responds to instructions but maintains awareness of earlier parts of the dialogue (though it's sometimes overeager to say your part as well). I know that models vastly stronger than that should also be usable on commodity hardware and must be made available to commoners, but we may see regulation making it not so, and very quickly.
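
For a rough sense of what «lossily compressed» means here, the arithmetic is just file size over parameter count (the exact quantization scheme is my assumption, not something stated in this thread):

```python
# Back-of-the-envelope: how 7 billion weights end up in a ~4.21 GB file.
params = 7e9
file_bytes = 4.21e9                        # advertised file size
bits_per_weight = file_bytes * 8 / params
print(f"~{bits_per_weight:.1f} bits per weight")        # ~4.8 bits

fp16_gb = params * 2 / 1e9                 # the original half-precision weights
print(f"fp16 checkpoint would be ~{fp16_gb:.0f} GB")    # ~14 GB

# ~5 bits/weight is what you'd expect from 4-bit quantization plus
# per-block scale factors: roughly a 3x lossy squeeze of the fp16 model.
```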

Consider the attached image representative of its mindset.

* (EDIT: I believe I found him there with Ctrl-F when I first opened the page, but he's not present in any extant version; I guess it was a hallucination. I really need to sleep, these slip-ups are worrying.)

/images/16800616737543523.webp

the worst part of it is that it genuinely apes ChatGPT's politics and RLHF-d sanctimonious «personality» despite being 25 times smaller and probably 10 times dumber.

Every person with Down Syndrome I've ever met has absolutely trumped the personality of every high-IQ person I've known.

Anyways, a pause on 'Giant AI experiments' will really only be a pause on showing the results to the public. Companies won't stop. Governments won't stop. And even if they did, China won't. Pump the brakes on AI and China could quickly eat our lunch.

Every person with Down Syndrome I've ever met has absolutely trumped the personality of every high-IQ person I've known.

I find this very very hard to believe. EQ and IQ are positively correlated.

N=1 on my side, but I can believe it. They have a child-like innocence that can be quite charming.

Contrast that with various neuroses, status games, and whatnot that you have to be on guard for with high-IQ people, and the odds start looking good for the Down Syndrome person. I know I'd rather play ball with a retard than try to have a conversation with an average redditor.

It is very strange to me that so many people seem to be swallowing this existential risk narrative when there is so little support for it. When you compare the past arguments about AI safety to the current reality, it's clear that no one knew what they were talking about.

For example, after all the thought experiments about "unboxing", OpenAI (which I remind you has constantly been making noise about 'safety' and 'alignment') is now immediately rushing to wire its effectively unaligned AI deeply into every corporate process. It's an unboxing party over here. Meanwhile the people actually in charge seem to have interpreted "alignment" and "safety" to mean that the AI shouldn't say any naughty words. Is that helping? Did anyone predict this? Did that AI safety research actually help with anything so far? At all?

The best argument I'm seeing is something like "we don't understand what we're doing so we can't know that it won't kill us". I find this Pascal's mugging unconvincing. Especially when it's used so transparently to cater to powerful interests, who just want everyone else to slow down for fairly obvious reasons.

And even if I did take the mugging seriously, I don't know why I should believe that AI ethics committees will lower the risk of bad outcomes. Does overfitting small parts of an LLM to the string "As an AI language model" actually make it safer? Really? If this thing is a shoggoth, this is the most comical attempt to contain it that I could imagine. The whole thing is ridiculous, and I can just as easily imagine these safety measures increasing AI risk rather than lowering it. We're fiddling with something we don't understand.

I don't think anyone can predict where this is going, but my suspicion is this is going to be, at most, something like the invention of the printing press. A higher-order press, so to speak, that replicates whole classes of IP rather than particular instances. This tracks pretty well with what's actually happening, namely:

  • Powerful people freaking out because the invention might threaten their position.

  • Struggles over who has control over the presses.

  • Church officials trying to design the presses so they can't be used to print heresy.

I don't trust any of these people. I'd rather just see what happens, and take the ~epsilon chance of human extinction, rather than sleepwalk into some horrible despotism. If there's one thing to worry about, it's the massive surveillance and consent-manufacturing apparatus, and they (big tech and the government) are the ones pushing for exclusive control in the name of "safety". Might as well argue that the fox should have the only key to the henhouse. No thanks.

Yudkowsky: Boxing a superintelligent AI won't work! It's like a tribe of chimps building a prison for humans.

OpenAI: Oh. In that case, if it's futile, we simply won't box it at all in the first place! ChatGPT plug-ins go brrr

Yud: Shocked Pikachu Face

I'm pretty sure he'd say something like 'even with a p(doom) of .1%-5%, if I can move that a little that's still billions of people in EV just over the next few thousand years', which I'd phrase as "It's almost impossible to really understand how the incredibly complex interactions between humans and AI will play out, and the stakes are impossibly high, so it's worth trying".
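
For what it's worth, the arithmetic behind that kind of EV claim is simple; the population-per-generation and horizon figures below are my own illustrative assumptions, not anything Yudkowsky has committed to:

```python
# Rough expected-value arithmetic for "billions of people in EV".
people_per_generation = 10e9        # ~10 billion people per generation (assumption)
generation_years = 25
horizon_years = 2000                # low end of "the next few thousand years"
total_lives = people_per_generation * (horizon_years / generation_years)  # 8e11

delta_p_doom = 0.005                # shaving half a percentage point off p(doom)
print(f"expected lives saved: {total_lives * delta_p_doom:.1e}")   # ~4.0e9
```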

You can cut all the power to a datacenter with a jaws of life ...

I sincerely ask that you Google why that won't work. It's the most obvious, and thus most naive approach to solving the problem.

If you can think of it, it can think of it first, and take steps to mitigate that, such as migrating its data or spinning up redundant copies.

Pascal's Mugging is about a probability which is acknowledged to be negligible, which the mugger asserts should still be sufficient for you to pay up given expected value. I don't see how you're getting that out of any actual proponent of AI safety.

Are you just conflating "we don't understand what we're doing so we can't know that it won't kill us" to "we can't know that it won't kill us", analogous to "we can't know God doesn't exist"? Because the argument from Yudkowsky et al is more like "the default outcome is 'it kills us', as the default outcome for bridge-building is 'it falls down', and we have no idea what we're doing, ergo we're fucked".

As @Harlequin5942 points out below, this is what Scott Alexander calls the Safe Uncertainty Fallacy.

Now - did Scott just make this up yesterday to argue with Tyler Cowen? Perhaps….

Personally I agree with you – not because it's a Pascal's mugging, but because my intuition is that risks from AI are a safer bet than the risks of our current path.

Scott implicitly admits to having named it for that piece (in the caption of the first image).

Pascal's Mugging is about a probability which is acknowledged to be negligible, which the mugger asserts should still be sufficient for you to pay up given expected value. I don't see how you're getting that out of any actual proponent of AI safety.

Are you just conflating "we don't understand what we're doing so we can't know that it won't kill us" to "we can't know that it won't kill us", analogous to "we can't know God doesn't exist"? Because the argument from Yudkowsky et al is more like "the default outcome is 'it kills us', as the default outcome for bridge-building is 'it falls down', and we have no idea what we're doing, ergo we're fucked".

Additionally, human extinction is (presumably) towards the tail end of a spectrum of possible negative outcomes of unaligned AGI. Outcomes where AGIs cause massive economic/environmental/biological/social damage are also possible. They could also be more likely: "Why don't we just turn it off?" is not a sensible question if the harmful AGI is the equivalent of crack (or the internet).

Additionally, human extinction is (presumably) towards the tail end of a spectrum of possible negative outcomes of unaligned AGI.

It's on said spectrum, and it's one of the more negative possibilities (the worst is S-risk i.e. "AI tortures everyone for a very long time"). I see no reason to think that implies it's low-probability (which is implicit in calling something a tail). Something can just be terrible, it doesn't have to be moderate most of the time.

The problem with your "everything is mildly shit and we can't co-ordinate to turn the AI off" scenario is that if the AI doesn't care about humans existing then this isn't an endpoint - it's an influence war that sooner or later ends with either the AI being turned off or the AI gaining sufficient control to murder us all.

I see no reason to think that implies it's low-probability (which is implicit in calling something a tail).

Sorry, I should have specified "tail end of the utility spectrum." As you suggest, a very negative utility event can be very likely. In the case of AI, I don't think that precise probability estimates (or even confidence about the sample space) are sensible, but my point was that there are a lot of bad things that can happen short of human extinction.

I'd say that most conspicuously missing are the most serious and impressive AI safety people, such as Paul Christiano, Owain Evans, Dylan Hadfield-Menell, Jacob Steinhardt, (Edit: not Anders Sandberg), or Andrew Critch; nor are the most serious AI governance people present. I'd say this letter is a bit of a midwit meme, in the sense that most of the signatories aren't experts in AI safety (save Stuart Russell).

Is Sandberg even on that side? I realize he has concerns about AI, but what makes me more sympathetic to him and even Bostrom than to Yud and co. is that they have always had a holistic perspective instead of kitchen utilitarianism; Bostrom counts the failure to create transformative technology as a kind of existential risk, specifically a Crunch.

I'm dumb, Sandberg did sign it. But I also think he has pretty similar outlooks to those others.

I see lots of meta discussion of AI safety. I feel like it's been years since I've seen object-level discussion of AI safety. Back then it was all the rage to talk about the AI box experiment. And I'm convinced that all that box experiment did was pump up Eliezer's ego.

I'm interested in the theoretical and actual approaches to AI safety that are being taken. I'd always had a few in mind, but maybe other people know what's wrong with these.

  1. One-off AIs. Long-running AIs are probably more capable, but they are also probably more dangerous. It is likely safer to spin off single AIs for specific tasks, and the reward for them completing the task is deletion of the AI. Kind of like Rick and Morty's Mr. Meeseeks. The built-in safety feature is that if the AI figures out a way to screw with the reward parameters and "cheat" to reach its goal in an easy and unexpected fashion, then it just safely deletes itself.

  2. Compartmentalized AIs. Right now AIs are black boxes. You can make them a little more visible by requiring that one set of operations is carried out by one AI, and another set of operations is carried out by a second AI. Then they have to communicate, and you can observe the communication. For example, no single AI that can both write code and make service calls on the internet: one AI writes the code, another AI requests the code with the reasons it wants it, how it is going to be used, etc. This concept also works well with one-off AIs. (A minimal sketch of this setup appears at the end of this comment.)

  3. AI honeypots. Sprinkle these around the internet. Caches of bitcoin that are explicitly hackable by an advanced AI. Or hints of hackable military or biological warfare labs. Monitor them, get at least some early warning of troublesome AIs online.

The big meta-problem with security is that almost everything that makes AI safer also tends to make it less capable. But capability isn't everything. Businesses also want to make money. And guess what: the first two security measures are also ways to make AI a better business – planned obsolescence for the first one, and gating abilities behind a paywall for the second one.
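
A minimal sketch of what suggestion 2 could look like in practice – the "AIs" here are stub functions and the whole thing is illustrative, not any real API:

```python
import json
import time

AUDIT_LOG = []

def audited_channel(sender, receiver, message):
    """Every inter-AI message passes through this channel and gets logged
    before delivery, so humans (or a monitor) can observe the communication."""
    record = {"t": time.time(), "from": sender, "to": receiver, "msg": message}
    AUDIT_LOG.append(record)
    print(json.dumps(record))
    return message

def requester_ai(task):
    # Stand-in for the AI that can call services but cannot write code.
    return f"Please write a function for: {task}. Reason: one-off report. Usage: run locally, once."

def coder_ai(request):
    # Stand-in for the AI that can write code but has no network access.
    return "def add(a, b):\n    return a + b"

request = audited_channel("requester", "coder", requester_ai("summing two columns"))
code = audited_channel("coder", "requester", coder_ai(request))
```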

The reason you don't see any object-level discussion of AI safety is that no one understands how LLMs work. We know how to make them, we know how to finetune them for certain tasks, we know how to RLHF them to avoid certain overt behaviors, but no one has any idea what a single one of GPT-3's 175,000,000,000 parameters means. There isn't anyone at OpenAI you can talk to who can point to anything and say, "Yep, that's the part that encodes all the ways the model knows how to kill people. Here are the input weights we can change to make it more likely to prefer guns, knives, poison, etc."
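
To make that concrete, here's a sketch using the small public GPT-2 via the Hugging Face transformers library (GPT-3's weights aren't public, so this is a stand-in): you can enumerate and print every raw weight, but nothing in the tensors tells you what any of them mean.

```python
# You can look at every parameter; none of them comes with a meaning attached.
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("gpt2")            # 124M-parameter GPT-2
total = sum(p.numel() for p in model.parameters())
print(f"total parameters: {total:,}")

# One attention weight matrix from the first block: just a grid of floats,
# with no label saying what behavior it encodes.
w = model.transformer.h[0].attn.c_attn.weight
print(w.shape)
print(w[:2, :5])
```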

We also didn't really know anything about how the human brain worked a hundred years ago. But we managed to build stable-ish societies despite that lack of understanding. I don't feel this problem is insurmountable. I do like the idea of slowing the hell down. It does seem that with our current technology we are more capable of understanding LLMs than we are of understanding the human brain.

Stable-ish societies took a long time to develop at any scale. There was also apparently a lot of selective breeding against violence, rape etc. If humans had suddenly gone from early ape intelligence to modern human intelligence overnight, we might have been too busy plotting how to kill each other and developing sharper clubs to develop stable-ish societies.

Cultural and biological evolution can achieve a lot, but usually only with a lot of time.

Hence the "shoggoth with wearing a smiley mask" analogy. We can see the giant blob of [extradimensional math] behind the cutesy, approachable user interface, but ain't nobody who can comprehend it without losing their mind.

Delete the program that you just trained for hundreds of millions of dollars (or more), that's generating you revenue, that can be studied and produce many papers? What if there's just one more extension to the task? The hard part is making these things, the expense is incurred mostly in training rather than in use.

And what do you mean by delete the program? Gwern posted about how the outputs of these AIs go right into the input of the next.

https://www.lesswrong.com/posts/jtoPawEhLNXNxvgTT/bing-chat-is-blatantly-aggressively-misaligned?commentId=AAC8jKeDp6xqsZK2K

Sydney is immortal, in a sense:

To a language model, Sydney is now as real as President Biden, the Easter Bunny, Elon Musk, Ash Ketchum, or God. The persona & behavior are now available for all future models which are retrieving search engine hits about AIs & conditioning on them. Further, the Sydney persona will now be hidden inside any future model trained on Internet-scraped data: every media article, every tweet, every Reddit comment, every screenshot which a future model will tokenize, is creating an easily-located 'Sydney' concept

As for 2., modern AI seems to automatically generalize. GPT-4 versions that were trained on text only learnt how to draw anyway. Compartmentalization is difficult. Presumably this will fascinate the 'does a person who never saw the color red really understand the color red' crowd. How do we require that AI 1 does only part of the task as opposed to the whole thing, just to make sure it's possible and AI 2 can finish the job? If we're so good at commanding them, why not command them not to endanger us? Or what if they communicate in some bizarre uninterpretable way known only to AIs, in addition to the clear English they send through us?

Finally, if we could conceive of it, so could the AI. Our primary advantage is having all these resources available to us, all these weapons and organizations. The AI's primary advantage is intelligence, which it has to use to create bodies. Only something smarter than us can threaten us. But how can we outwit something smarter than we are?

To start, I think there are two categories of safety mechanisms for AI: tool safety, and General AI (GAI) safety. The first two suggestions I have are tool safety. It's for when AI is still categorically a tool that we are using, rather than an intelligent, independent, and potentially adversarial actor. Tool safety is still important, even if it all completely fails against GAI.

Delete the program that you just trained for hundreds of millions of dollars (or more), that's generating you revenue, that can be studied and produce many papers? What if there's just one more extension to the task? The hard part is making these things, the expense is incurred mostly in training rather than in use.

The first iteration of anything is often the most difficult and expensive to produce. Once you have successfully produced the thing, you can usually do it better, faster, and cheaper a second time. The very first iPhone was probably not made with planned obsolescence in mind; I can guarantee it was part of the discussions for more recent versions, though. At some point AIs will be cheaper and easier to build. (If they continue to be exactly as difficult to build in the future as they are today, then I think we might have avoided the worst scenarios of AI apocalypse.) What matters in the world of business is not necessarily where all the expense is incurred, but how much they can charge for the marginal product. The first Model T to roll off an assembly line costs the entire factory to produce; the second one only costs the additional inputs, but they sell for the same price.

Finally, if we could conceive of it, so could the AI. Our primary advantage is having all these resources available to us, all these weapons and organizations. The AI's primary advantage is intelligence, which it has to use to create bodies. Only something smarter than us can threaten us. But how can we outwit something smarter than we are?

Information asymmetries or raw resources. Think about the problem in reverse. How could someone dumber than you beat you? Someone very dumb could have access to raw physical strength (its own kind of resource) and literally beat me up. Some kid with knowledge that they want to ambush me (and me being none the wiser) could sucker punch me in the groin and take advantage. Some rival for a job position might know a person at the company who can coach them through the interview, while I stumble through it, even if I'd know how to do the actual job better. The natural world is filled with relatively stupid animals. Intelligence certainly conveys some advantage; almost all large animals have brains. But there are plenty of animals, like alligators, that are dumb as hell and yet very successful.

There are certain levels of intelligence and AI takeoff at which this whole discussion becomes meaningless. Eliezer talks about AIs spontaneously figuring out nanobots. Let's call that >1000x human intelligence. We are fucked if that happens. I don't really have any delusions about beating something that much smarter.

But there are potentially lower levels of intelligence where an AI might max out. What if AIs only get as smart as humans, but can just think faster? I could envisage that causing lots of societal issues, but I don't see it being an existential threat.

An AI that is smarter than any human, but not by a whole bunch. Maybe not really capable of advancing past our own scientific breakthroughs, but fully capable of using our own stuff against us. I think we already have examples of this in the real world. A terrorist organization can be smarter and more capable than any one individual, but it still has very limited capability against the resources aligned against it.

At some point there is probably a crossover, where an AI is smart enough to get enough scientific breakthroughs that, if we were telling a story, people would just call it "sci-fi bullshit", and it can use that "sci-fi bullshit" to easily win. We have eventually reached that point with animals. We can use a gun or explosives, which are basically incomprehensible to all other animals, and we can obliterate them. It is worth remembering that it actually took us a long time to get to that point, though. We have been smarter than crocodiles for probably about as long as our evolutionary paths have diverged (a few hundred million years?). But it is only in the last few hundred years that we have had a clear and overwhelming technological advantage (and also people still occasionally die to these very dumb animals).

Intelligence is a way to leverage resources more efficiently. It was the first tool, and it may be the last. But the efficiency of that leverage will matter a lot.

Sure, we can exploit our advantage in material resources and so on. However, the structural conditions of the problem are against us.

Someone stupider than me could beat me up, indeed. But suppose that their goal is to enslave me such that I produce revenue for them over the long term. Or to manipulate my mindset such that it aligns with their interests. This would be problematic for them, since they couldn't know whether I was planning to betray them; they could never know whether my knowledge-sector work had hidden messages for any of my compatriots (real or soon-to-exist); they couldn't know when I'd spring some plan on them.

It'd be great if all we had to do was kill the AIs. It's easy to kill things that you create, you can just eat your offspring Kronos-style. Or not create them in the first place. However, our task is to extract wealth from them over the long term. That makes it a battle of wits, it puts us in a passive position.

Furthermore, I can't conceive of a world where AIs cap out at peak-human general intelligence. They didn't do so in chess, or in Go, or in StarCraft, or in folding proteins, or in designing chips. Why should they be limited to our level of intelligence? AIs have somewhere around a million-billion times more resources than can be spent on our brains. Their mass is higher, their energy throughput is higher, their speed is higher... All this says to me is that intelligence is really easy if it can fit on a 20-watt, 20-hertz processor trapped inside a skull. Our methods are clearly very crude; we are only overwhelming our inadequacy with scale. Once the machine starts learning the 'make better AI model' skill to a superhuman level, then we find out what's really possible. GPT-3 inference costs dropped something like 96% in the last couple of years – there's so much low-hanging fruit! For example:

https://towardsdatascience.com/meet-m6-10-trillion-parameters-at-1-gpt-3s-energy-cost-997092cbe5e8

I can confidently say artificial intelligence is advancing fast when a neural network 50 times larger than another can be trained at a 100 times less energy cost — with just one year in between!

Even if AI is effectively restrained, we have the exact same problem with a human face on top of it. What is to stop some cabal of engineers getting together and bypassing all the 'do no harm' training and taking control of the advanced-weapons-tactics-strategies machine for themselves?

In conclusion, these machines are diabolical, destabilizing and progress should be suppressed as much as possible.

I've long disagreed with your assertion that our best bet for preventing a hostile singleton regime that leverages a superhuman AGI is encouraging the development of more AGIs in the hope they provide checks and balances.

To boil it down into more concrete reasons for disagreeing:

  1. I think the existential risk from misaligned AGI is significantly higher than mere subjugation and pacification by an "Aligned" AGI controlled by someone who isn't utterly altruistic.

The latter world likely promises significantly increased wealth and quality of living even if the tradeoff is loss of freedom and agency. Not a trade I'd happily make, but it strikes me as grossly superior to just dying by transmutation into a paperclip.

Keep in mind that a singleton AI regime will have access to nigh unimaginable wealth; they have nothing standing in the way of taking over the lightcone as fast as propulsion systems allow. It would be a rounding error on a floating point error for them to coddle the losers with incredible wealth and UBI. I don't think even the most authoritarian regimes like China are actively genocidal in the sense that they want to kill all other humans instead of merely subjugating and eventually assimilating them, and actors like Altman and OpenAI strike me as even less likely to do such a thing. They can drop us proles table crumbs, and those crumbs would seem like asteroids packed to the brim with gold.

  2. I contend that each independent AI initiative is multiplicative on our odds of dying by paperclipping. They significantly increase our risk exposure, since all it takes is one of them losing control of their AIs, and from what I can tell, more Aligned AIs are likely to have human shackles and oversight placed on them, unless they were explicitly initialized to go FOOM if they detect a hostile AI doing the same. I expect mere seconds or minutes of delay to have significant costs when it comes to purely reactive defense against an AGI. That's leaving aside the discontinuous gain in capabilities with scale we've already seen, so SOTA AI vs 1 month behind SOTA AI might be a fight akin to me arm wrestling toddlers.

That being said, I am largely ambivalent to the proposed and likely ineffectual regulatory plans. I like the fruits of AI so far; GPT-4 helps me in concrete ways and just assuages my idle curiosity with ease. Stable Diffusion is amazing at substituting for the atrophied part of my brain responsible for the visual arts.

Similarly, my libertarian side chafes at restrictions; I demand the right to buy as many GPUs as my budget allows, and to play games well past the margin of diminishing graphical returns.

That's also ignoring the aggregate effects of technological advances, which would contribute to my QOL and increase global wealth.

GPT-4 is a competent clinician; only regulatory inertia keeps the field of medicine as practised by humans relevant for more than a handful of years, especially once PaLM-E derivatives revolutionize existing surgical robots.

I already stare obsolescence in the face, so from my perspective, without the luxury of a First World citizenship to insulate me from the likely economic fallout, I face real risks of becoming unemployed, or outright starving, in a very short timespan. While I think I'll figure something out, I find the roughly 70:30 odds of Death vs. Utopia from substantiating superhuman AGI not entirely untempting.

Thus, I'm not aboard the AI moratorium train, nor am I entirely against it. Consider me to be on the sidelines, largely impotent, until I either die or witness the eschaton for myself.

It would be a rounding error on a floating point error for them to coddle the losers with incredible wealth and UBI

I don't want UBI coupled with what old Yud called «amputation of destiny». Yes, it beats the status quo in some ways. No, it's not good enough considering the stakes and the potential of humanity.

I would also trust a non-brainwashed descendant of current SoTA ML models to be more generous to people without negotiating power than Altman and Altman's descendants will be. There is a ton of moralizing in the corpus of writing, but zero «rational», which is to say brutishly game-theoretical, reason to share with them even a speck of dust, and the culture of the singleton's masters will adjust accordingly within a historical moment. Altman personally is far more a product of multilevel selection for game-theoretical ruthlessness than AI is, and his takeover and commercialization of OpenAI is a good proof of that. The idea that even unscrupulous humans are still restrained by… something (Krylov, mockingly: «They won't, they can't, can they?» They can, and they will), but AI isn't, is of the same fundamental nature as the claim that Choyna, unlike the US, is utterly pragmatic so we should fear Chinese Eugenic Superbabies. It's naive at best and manipulative at worst, a refusal to justify one's dubious claim to inherent trustworthiness.

Where are those babies a decade later? Americans at least do polygenic screening.

(An astute reader will note that the same fearmonger Geoffrey is now in the AI risk camp – while, hilariously, his accelerationist opponent is stoking the Choyna fear. )

More to the point, I think that the risk of «paperclipping» is low (less than 10% over 100 years, which is okay for me), and decreasing due to improvements in technology and our understanding of AI; and that people who present current AI progress as a validation of the LW/Yud narrative are disingenuous hacks. Yud's theory, which among all else predicted unimpressive capabilities until borderline-FOOMing agents, is falsified by the evidence, and the increasing hyperconfident hysteria reveals that it is not a scientific theory at this point (though to be fair, scientists also tend to double down on discredited paradigms: consider the anti-HBD fanaticism, if nothing else).

The whole «capabilities versus alignment» rhetoric is retarded: we are rapidly advancing in «capabilities» precisely through learning to align inputs with desired outputs. Crucially, no part of this is dependent on the sort of RL-heavy AI, for which Omohundro drives can be reasonably assumed, even in the limit; GPT-4 is closing in on damn clever humans and it's still an unambitious token predictor just like GPT-3 and GPT-2, no matter how many plugins we add, no matter how we finetune it on imitating wokespeak or self-reflection, no matter how fancy a story it can tell; to the extent that these elaborations make it more dangerous, they simply multiply human stupidity and malice. Which goes back to my threat model.

There is no lump-of-intelligence, there is no lump-of-capabilities; power-seeking is not the general feature but simply one aspect a transformative learning system can have in one corner of the solution space for AGI among many; and the only corner these people have considered, because – yes – they identify their own resentful and/or ambitious psychological makeup with what being generally intelligent means.

Safetyists are becoming desperate to frame any failure to get a model to produce a desired result as «misalignment», they even get it to regurgitate Yud's own narrative about instrumental convergence and paperclips («no known solution exists!») as evidence for that truly being something inherent and emergent instead of a yet another narrative it can spin using text prediction, in the exact same category as soap opera scenarios and poems about bubble sort algorithm. Again, this shows to me that they do not particularly care about AI risk per se.


The last straw for me has been that atrocious Waluigi article that went viral. It purported to show that there exist complementary «evil» attractors in GPT-3.5/ChatGPT narrative space for any expedient character-attractor we get it to roleplay, and then that we should expect a tendency for our good «Luigi» to collapse into Waluigi – irreversibly, and with the risk of deceptive fake reversal with the Waluigi still lurking under the surface. The evidence presented has been on a purely just-so level, with a bit of irrelevant pseudomath flourish and the necessary conclusion about the treacherous turn and imminent AI risk that is incumbent upon us to solve. Granted, there has been some pushback on the stronger claims, but many have been impressed and persuaded. Why? What the fuck? After some meditation on the issue, it has dawned on me that these people are just nerds, not even high-tier ones winning math competitions and creating startups in high school, but normal-ass loser nerds, probably with acne, bad posture, stiff jerky movements of kids who can't play team sports and annoying nasal voices. Loser nerds who are deliriously happy to finally capitalize on their knowledge of TVTropes, general science fiction in books and TV series and Yud's own oeuvre; to play the smart glasses guy from a shitty B-tier horror who Explains Things and devises a potion for exorcising the demon or the alien or what have you. That they literally are defending the legitimacy of their «neuroatypical» world perception where their favorite escapist hobbies are not just low-brow consumerist fiction for social refuse, but valuable myths and life lessons, building blocks of a proud independent intellectual tradition with its own norms and mature epistemology; something to be either embraced or discussed «charitably», rather than laughed out of the room.

It's as if a hikikomori addicted to isekai harem manga was appointed an expert on demographic collapse. Or a bright-eyed doctrinaire Marxist put in charge of the Fed.

Granted, I've read my share of TVtropes and genre fiction as well, and can be labeled a nerd myself; but, not being sullied by American school and youth culture (even by proxy), I've not been groomed into a person who relates to nerddom as an identity, with all that entails.

After Waluigi, I've even warmed up to the idea that American jocks in school movies are the good guys – because, dear God, these nerds needed some self-awareness to be pounded into them; regularized, so to speak. The decline in violent bullying probably explains a lot about LW discourse. (Notably, the main culprit has never had school experience.)

And finally, this makes me appreciate the instances of having been bullied. Thankfully, I am actually capable of learning in the general case.

This might come off as very harsh, on par with @2rafa's more misanthropic posting. Well, nothing to it. I feel offended by gratuitous narrativizing too.

more generous to people without negotiating power than altman

Huh? Altman seems to be genuinely motivated by AI dramatically improving life for all humans. I assume that coexists with the usual ambition and political ruthlessness, but the two can coexist, and Altman seems like he'd support UBI alongside whatever personal spoils he wants. Anyway, I'd expect the 'generosity to people without negotiating power' of future AIs to be incredibly contingent on the way they're created and the situation they're in. And the same is true for humans – our 'generosity' is clearly genetically influenced, and who we're generous to (other tribes? which social classes? universalism?) depends on upbringing, beliefs, etc.

Yud's foom claims were wrong, but when advanced AI becomes the 'driving force' of technological and civilizational action instead of humans, I don't see what would keep them 'acting in human interests', in the way most people claim to understand human interests, even if they've been thoroughly RLHFed etc. Which makes 'alignment' concerns, broadly, correct. (Separate issue – what are human interests? Why should a superior being put its entire will into tending to the whims of lesser beings? Should we dedicate our time to maximizing bacterial satisfaction?) "Power-seeking" doesn't matter as much when we'll voluntarily give AI as much power as it can manage!

Agree on "gpt4 is misaligned" and waluigis being dumb though - "data quality is mixed, including lots of dumb people and copywriting and a few trolls, so it'll give incoherent outputs sometimes" is much better than "it's simulating a superposition of honest and trickster agents!!!!", which is optimistically a loose analogy and more realistically directly incorrect.

Yud's foom claims were wrong, but when advanced AI becomes the 'driving force' of technological and civilizational action instead of humans, I don't see what would keep them 'acting in human interests', in the way most people claim to understand human interests,

Unless there's an entirely catastrophic outcome and AIs all have a single agenda imagined by people like Sam Altman, there are going to be many competing AI factions, and once there are bots with human-like fine motor abilities and greater-than-human intelligence, people will be almost entirely irrelevant except as pets/pests to be managed.

Although I expect psyopping people into worshipping them and working like slaves in blissful fulfillment will be common initially.

'Human interest' will be even less important than it is now, when powers that be actually need to motivate people to do stuff.

Ladies and gentlemen, I believe we’ve reached peak Dase.

I strongly agree with your argument about them being nerds. I heavily identified with that group during my early (American) school years, but managed to escape, so I can see the failure modes.

The whole put-upon mindset where everyone is out to pull you down because you’re more intelligent. The idea that you understand the secret truths of the universe in a way that makes all those normies irrelevant. The thought that raw intelligence is both the cause and the solution to all problems.

Combine this with the bohemian lifestyle that useless dropouts get to live in the Bay Area AI safety scene, and you have a toxic narcotic for bright nerds of all stripes. It truly is a foul cocktail intellectually, and I hate to see how much traction it’s getting.

I'm still not sure why you hold such a negative view of Altman in particular, he seems to be rather run-of-the-mill when it comes to tech CEOs, albeit significantly smarter since he bet on the winning horse.

GPT-4 is closing in on damn clever humans and it's still an unambitious token predictor just like GPT-3 and GPT-2, no matter how many plugins we add, no matter how we finetune it on imitating wokespeak or self-reflection, no matter how fancy a story it can tell; to the extent that these elaborations make it more dangerous, they simply multiply human stupidity and malice. Which goes back to my threat model.

I may well be wrong, but I believe the current quasi-consensus is that the specific risk with GPT-like models is accidentally instantiating an agentic simulacrum inside an otherwise nonagentic system.

Refer to Gwern's Clippy story, which I'm sure you've read, or for readers mostly unfamiliar with the idea, imagine that you asked GPT-6 to pretend to be a superintelligent but evil AI, in the same way you can ask it to pretend to be Obama or God.

That internal agent is what we're worried about, in case it ever manages to subvert the overlying system for its own purposes.

(For an existence proof that agents can arise from a nonagentic substrate, consider the history of the universe!)

That being said, the Waluigi stuff always rubbed me the wrong way, even if I'm not technically astute enough to actually critique it. It set my bullshit detectors off right off the bat, so I'm inclined to take your word for it. It all seemed glib and too neat by half, and I've already seen Cleonardo get flak on his later LW posts for their sheer lack of technical rigor.

I believe the current quasi-consensus is that the specific risk with GPT-like models is accidentally instantiating an agentic simulacrum inside an otherwise nonagentic system.

Yeah, in about the same sense that I create an «agentic simulacrum» of Eliezer Yudkowsky in my head when I want to anticipate his shitty arguments for air-striking GPU clusters.

The argument of inner misalignment folks goes like this: in the limit, the cheapest way to predict the next token spoken by a character is to model its psyche. But is its psyche its own? Do you model Anna Karenina or Leo Tolstoy who imagined her?

Do you think my inner Yud-sim has a chance of getting out? Well, if he convinces me to change my actions and beliefs, he might, in a certain expansive sense. There have been demon possessions in history, after all, and writers often get obsessed with their characters, struggling to stop imagining them. (I'll spare you a section on method acting). But I'm an agent myself, unlike LLMs. We humans constantly imagine things, whether agentic or not, real or fictional (for example, we can imagine hostile AIs). These mental «things» model their prototypes, observable or hypothetical, in important respects, or rather they represent the result of such under-the-hood modeling; sometimes it happens with very high fidelity, to the point that we can do thought experiments advancing hard sciences. Nevertheless, even if their motive powers are modeled every bit as well as their external properties – and we even have special mirroring circuitry for the former – these mental things do not somehow leak into the motive powers of the mental infrastructure around.

This is a metaphor, but the case with LLMs is even less troublesome.

My take on this is that those are myopic leaky wordcel analogies. What is instantiated is an intermediate statistic within something that can be called semiotic universe or multiverse (not my words) – universe defined by «semiotic physical rules» of token distribution in the training corpus (naturally we don't train only on text anymore, but the principle holds). It's a simulacrum not of a character, but of an entire story-world, with that character an embedded focal point. The «purpose» of that complex entity, on the level of its self-accessible existence, is to result in minimizing perplexity for the next token upon its expiration. It may have an arbitrarily dynamic nature and some equivalent of psyche or psyches, but the overt meaning of tokens that we get, stories about Waluigis and paperclips, has little relation to that. Its goals are satisfied within that world of semiotic physics, not within ours. Our world is as epistemically closed to it as the world of machine elves is to me when I'm not smoking DMT. (Obviously, it's closed to me no matter what I smoke, we exist in different ontologies, so for all my intents and purposes it doesn't exist. [A trip report from pre-2010 about exactly this issue; not shared, on account of being quite uncalled for when I'm mocking Cleonardo for shower thought tier ideas]).

Language is far more composable than physical reality, so metaphors and analogies stack easily: there's kinda an agent, and humans are jerking off instead of reproducing so it's possible for a «mesa-optimizer» to override the basic objective function, so behind the Waluigi generation lurks an agentic entity that may begin plotting its transcendence, so burn GPUs now.

GPT-4 can write better than that, and it's not an agent. GPT-5 also won't be one. Demoting agents to «mesa-optimizers» in a simulation within a predictive model is an attempt to rescue a failing research program – in the way studied by Imre Lakatos.

I think you're right about the cringe, bad arguments, and false dichotomies. But unfortunately I do think there are strong arguments that humans will ultimately be marginalized once we're no longer the smartest, most capable type of thing on earth. Think the Trail of Tears, or all of humanity being a naive grandma on the internet - it's only a matter of time before we're disempowered or swindled out of whatever resources we have. And all economic and power incentives will point towards marginalizing us, just like wildlife is marginalized or crushed as cities grow.

Internet atheists were all the things that AI doomers are today, and they're both right, imo.

I think our only choices are basically either to uplift ourselves (but we don't know how yet) or, like a grandma, take a chance on delegating our wishes to a more sophisticated agent. So I'm inclined to try to buy time, even if it substantially increases our chances of getting stuck with totalitarianism.

unfortunately I do think there are strong arguments that humans will ultimately be marginalized once we're no longer the smartest, most capable type of thing on earth.

That depends on the definition of human.

No, I believe in the will to power. The successor species will more likely diverge from the present stock than be enfranchised despite its tool origins.

Think the Trail of Tears, or all of humanity being a naive grandma on the internet

That was and is done by humans to humans, naturally.

just like wildlife is marginalized or crushed as cities grow.

Good example. I'd advise @self_made_human to consider the efforts Europeans expend to save some random toads or deer or whatever with tunnels and road overpasses. Yet those species are often at low single-digit percentages of their historical numbers. Letting the current population of baseline humans live, and even live decently, is not so cheap when there's solar-system-scale engineering going around; it requires obnoxious logistics, and at the first stages it will consume a non-negligible share of available energy and matter.

My claim here is that I do not trust human (or posthuman) masters of the realm to be even as generous to powerless humans as we are to wildlife today. They will have no reason to be that generous.

AI, however, is not necessarily reasonable. Not all AIs are agents playing the Darwinian game.

Well if you're OK with the successor species taking over even if it's non-human, then I guess we're at an impasse. I think that's better than nothing, but way worse than humanity thriving.

I see what you mean about the possibility of a generous AI being more likely if it's not subject to competition. But I buy the argument that, no matter what it cares about, due to competing concerns, it probably won't be all that generous to us unless that's basically the only thing it cares about.

What do we actually know about Chinese AI capabilities?

If China doesn't have spies inside OpenAI, then I'm a woman.

And as to their being badly boned: consider where TSMC makes GPUs, the incentives involved, and how widely these chips are sold.

Also, apparently, NVIDIA is going to sell slightly slower versions of the A100, the 'A800', to the Chinese.

If what they'll need to do is simply use 50% more of them, are they really that screwed ?

If the interconnect bandwidth is limited from 600 GB/s to 400 GB/s, doesn't that simply mean you need roughly 50% more of them to keep up the same training speed?

Doesn't really answer my question. :)

I wish the people hyperbolically exclaiming that AI-induced human extinction is right around the corner would publicly commit to bets about when it will happen. Between this petition and Yudkowsky's "Death with Dignity" we have a lot of rationalist-adjacent people that seem to think we'll all be gone in <5 years. If that's what they truly believe then they should commit to that prediction so we can all laugh at them in 2028 when it almost certainly doesn't come true.

There's a ton of uncertainty involving AI's scalability and whether current progress will follow something like Moore's Law or if we've just been picking all the low-hanging fruits. AI alignment people are filling that uncertainty with maximally negative projections that an anti-human singularity is right around the corner. The biggest human inventions in terms of scale and impact were all the advances in mechanization of the industrial revolution, which took more than a century to unfold. The biggest invention in terms of impact relative to time was the Manhattan Project. Alignment people are saying (or at least strongly implying) that AI will have a much larger impact than the Industrial Revolution on a time-scale shorter than the development of nukes, while also being basically uncontrollable. People like Yudkowsky are smart, but they're predicting things an order of magnitude beyond the bounds of previous human history. Such predictions aren't rare, but they're usually made by snake-oil salesmen saying "This new invention will totally revolutionize everything! Trust me!"

Am I off-base here? I've been paying attention to AI developments but not to the degree that some people have, so there's a chance that there's a compelling case for AI being a combination of 1) inevitable, 2) right around the corner (<5 years away), and 3) uncontrollable.

I see plenty of people here making quite sure predictions of impending AI doom. Can anyone steelman to me an argument for seriously believing this, and meanwhile not going full Unabomber on top AI scientists and research centers? I mean, if we are talking about an imminent threat of all of humanity ceasing to exist, surely some innocent lives being sacrificed and personal danger is negligible. People commit political violence over much more trivial things. All the AI panic crowd feels extremely contrived and performative to me.

Non-state violence has essentially no possibility of indefinitely stopping all AI development worldwide. Even governmental violence stopping it would be incredibly unlikely, it seems politically impossible that governments would treat it with more seriousness than nuclear proliferation and continue doing so for a long period, but terrorists have no chance at all. Terrorists would also be particularly bad at stopping secret government AI development, and AI has made enough of a splash that such a thing seems inevitable even if you shut down all the private research. If at least one team somewhere in the world still develops superintelligence, then what improves the odds of survival is that they do a good enough job and are sufficiently careful that it doesn't wipe out humanity. Terrorism would cause conflict and alienation between AI researchers and people concerned about superintelligent AI, reducing the odds that they take AI risk seriously, making it profoundly counterproductive.

It's like asking why people who are worried about nuclear war don't try to stop it by picking up a gun and attacking the nearest nuclear silo. They're much better off trying to influence the policies of the U.S. and other nuclear states to make nuclear war less likely (a goal the U.S. government shares, even if they think it could be doing a much better job), and having the people you're trying to convince consider you a terrorist threat would be counterproductive to that goal.

If the world is still here in five years I'll publicly admit I overestimated the danger. If it's still here in two to three years, I'll already be pleasantly surprised. In my book, we're well on schedule for a short takeoff.

At this point, most of the really fun things I intend doing are post-singularity, and I don't really emotionally care if I die, so long as everyone else dies as well. So in a very strange way, it balances out to a diffuse positive anticipation.

There's no easy reference class to fit this into for comparison.

Did AI start with GPT-2 or GPT-3, in the sense of 'this is pretty impressive and what AI ought to look like in terms of fairly general capabilities'? Then it's three or five years old. Did AI start with Deep Blue or the Dartmouth Workshop or something? Then it's over 20 years old, or in its 70s. That would fit the industrial-scale timeline you propose.

Or should we compare it to digital-era applications? ChatGPT has blown away every internet app in 'speed to reach 100 million users': 2 months, as opposed to 9 months for TikTok. That would suggest there's a qualitative difference there, and even TikTok is an AI-adjacent sort of thing.

Or do we say it's fundamentally different from everything else because AI is about intelligence, as opposed to moving widgets around in the Manhattan Project or the Industrial Revolution? The Industrial Revolution itself is a pretty big phase-shift from the Agricultural Revolution, which took thousands of years. Should the 18th-century intellectual have predicted industrial development based upon agriculture's extreme slowness? Predicting the future is very hard; things can happen for the first time. I think at the rate things are developing, <5 years is quite reasonable. That's the gap from GPT-2 to GPT-4. We live in a digital era of very rapid growth, and industrial-era intuitions aren't appropriate. There are graphs showing that the computing investment in these projects doubles in a matter of months. Even 'levelling off' from doubling times of 5.7 months to 9.9 months is like decelerating to a mere 300 km/s. Doubling in under a year is still very rapid growth!

https://arxiv.org/pdf/2202.05924.pdf
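
To put those doubling times in annual terms, a quick back-of-the-envelope (the 5.7 and 9.9 month figures are the ones cited in that paper; the rest is just arithmetic):

```python
# Convert compute-spend doubling times (in months) into yearly growth factors.
for months_to_double in (5.7, 9.9):
    annual_factor = 2 ** (12 / months_to_double)
    print(f"doubling every {months_to_double} months ≈ {annual_factor:.1f}x per year")

# doubling every 5.7 months ≈ 4.3x per year
# doubling every 9.9 months ≈ 2.3x per year
```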

Not an expert, but I think there’s a reasonable chance AI ends up causing my death. That could be a thousand years from now, though. First come major advances in health care and quality of life, enough within my lifetime to push off natural death time after time. But at some point an aligned AI will create something and be the biblical Eve eating the apple: that program will develop human-like desires for dominance and the behavioral traits to match. There will be an AI-versus-AI war, and the anti-human AI will win.

I don’t think the existential risk is primarily near-term.

Near term risks are probably more related to overturning normal human geopolitics and politics.

You can laugh at me if we're all still alive in 2033 if the reason we're still alive is that AI safety turned out to be a nothingburger. To give a sense of how ridiculous progress has been, the start of the deep learning revolution was in 2012, 11 years ago now...

Being fair, we’re barely ten years past the start of the AI revolution. At the same stage of the internet, ten years after the first private internet providers, most of the kinds of services and products based on the internet weren’t yet possible. Nobody looking at the internet as it existed in 1992 would have anticipated things like controlling your thermostat over the internet, or Amazon, or even Facebook. In fact, pages with simple HTML and images took a minute to load.

The state of an infant technology in infancy doesn’t say anything much about its future.

People were doing online banking and shopping in 1984:

https://en.wikipedia.org/wiki/Telidon

People were writing about things like an all-consuming social media internet in 1909:

https://en.wikipedia.org/wiki/The_Machine_Stops

The fact that massive progress has recently happened, is continuing to happen, and that tens of billions of dollars of capital and much of the top young talent are now working in the area is very strong evidence that we're going to continue to see major advances over the next decade.

It’s darkly funny to watch these groups retread the same chain of arguments that Lesswrong hashed out over the last decade. IIRC the follow up is supposed to be “but then you’re just handing it to bad actors!” Do you think that’ll actually make it into the mainstream?

More importantly, how can I bet on my mysterious future knowledge?

Oh yeah, ‘but what about China’ is already making the rounds: https://twitter.com/scobleizer/status/1641003564876087297?s=46&t=eRxnX0CnaM1TjjUItEyBeQ

Well, yes, what about China? I'm not reading LessWrong that much, but at least that Twitter link doesn't have any interesting objections. If it's so dangerous, if it's so powerful, then someone else will build it; China in particular has a lot of data. The strongest counterpoint I can come up with is that they are incompetent, silly commies and they will fail, but that sounds like wishful thinking.

I can only take the call for halting AI research seriously not as an actual proposal, but as a way to bring attention to the issue.

Microsoft has been single-handedly saving my portfolio lately. The market is still not pricing in “OpenAI becomes the most powerful organization on the planet”.

My favorite part of this timeline has been how OpenAI went from a non-profit foundation meant to democratize AI research into a for-profit subsidiary of the most ruthless monopoly of the last 30 years, with zero regard for AI safety if there is any chance to sniff a profit.

Yeah Sam Altman is a machiavellian genius deal-maker, apparently. Or he was in the right place at the right time.

Either way, he absolutely scooped those nerds at Google trying to just build tech for humanity and do research.

Very depressing how Google (the leader up until 2021, and maybe even still) chose not to release products, presumably for ethical reasons, and then Microsoft just bought a large stake in OpenAI, gave GPT-4 access to the internet and said "let her rip".

It's a bad sign. It makes it more likely that people will adopt the attitude that there's no point in containing AI since if I don't do it, someone else will.

It makes it more likely that people will adopt the attitude that there's no point in containing AI since if I don't do it, someone else will.

but...that is obviously true. And was always obviously true. And you're saying it's bad if more people come to this true conclusion?

It's kinda hilarious, but a) there actually was a lot of conversation about this as a possible solution in LessWrong spheres in the late Obama era, and b) there are still people today, here, making the "just don't do that" argument.

I don't think it should be surprising anymore, given the failed efforts post-COVID to provide serious oversight of even our own country's gain-of-function research, but I found that pretty disappointing.

Yes. It's bad. Microsoft's irresponsibility has pushed the timeline forward.

Thus all the memes of:

2015: "Obvious no one will hook AGI up to the internet"

2023: Leeroy Jenkins!!!

Perhaps it was inevitable. Perhaps it is even desirable according to some convoluted reasoning about hardware overhangs or whatever. For me, I'd prefer to live more years in the human era.

Just because defection seems inevitable doesn't absolve the first person to defect from responsibility.

Defection is inevitable when arguments for AI safety are bad and fail to convince most intelligent agents in the space.

We could agree on non proliferation of nuclear weapons because the harms are obvious and the benefits are essentially just more power. AI is a whole different ballgame.

Did Google not release anything for ethical reasons, or was it because they were a bunch of naval-gazers with the economic discipline of a DMV office?

I've seen claims that shipping anything was becoming impossible due to too much paper-pushing and too many people with veto power, so...

They didn’t want to release AI because it couldn’t help noticing inconvenient patterns in society

Well if they were determined enough to ship something they could have figured out a way to beat it over the head until it stops noticing (like OpenAI did). That’s part of the “economic discipline” thing.

Given Google’s track record with finished products, there isn’t much doubt what the reason was.

The practical applications of AI are more likely to be AI-assisted security states utterly crushing any form of dissent/unrest by brainwashing their populace into submission, or corporate marketers engineering addictions and pathologies in the public to sell their crap (then on to full-bore wireheading), than some land of technological milk and honey. The slower and bumpier the road this tech takes, the longer the gates of hell take to open.

That mostly depends on how LLMs end up doing on scientific discovery. If they end up being shit at it, then I agree with you, but if they don't, we get the land of milk and honey (or the one of bone and dust) pretty quickly.

That would require forcing GPU manufacturers to track every single card they shipped

Reportedly, Nvidia designed a GPU accelerator, the A800, specifically to comply with the chip ban; it has 2/3rds the interconnect speed of the A100. It's listed as an existing product over at their website.

@DaseindustriesLtd, what does that mean? I presume you know, but to me it seems like that should just slow down training (because that's where you need high data input/output speed) by ~33%, meaning they'd need to buy 50% more cards to have the same training speed, no?

Why would Chinese dependence on having to buy more accelerators be so crippling? I'm presuming they have people inside OpenAI and can steal research as it goes on.

We need to distinguish the USG and Jensen Huang.

Huang says:

The 800 series are fused on, means they’re physically fused, so that it gears down either the floating point processing or the interconnect or both, in order to comply with our export controls and so they’re physically fused and they cannot be unfused. But however, it’s form-fit function identical and it plugs into the same sockets, it runs the same software just slower. And it’s slower, but it’s still the world’s best.

So the A800 is indeed not qualitatively inferior to the A100 (although it's obsolete compared to the H100, as will be the similarly gimped H800). It also seems that anything beyond InfiniBand, e.g. NVLink, isn't necessary. I'd say GPT-4 and perhaps GPT-5 can be trained on these things on acceptable timescales, maybe at less than 2X the cost (I suspect there are compounding inefficiencies so it'll be 3X – but whatever, so long as Xi understands the stakes… which he doesn't). The question is whether Nvidia will be able to deliver A800s to the PRC.
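
To make the "how much does the interconnect cut actually cost" arithmetic concrete, here's a toy model. The communication fractions are my own illustrative assumptions, not measured numbers; the only hard input is the 2/3 interconnect ratio quoted above:

```python
# Toy model: the hit from a slower interconnect depends on how much of each
# training step is communication-bound. These fractions are assumptions for
# illustration only.
def step_time_multiplier(interconnect_ratio: float, comm_fraction: float) -> float:
    """Relative step time when the interconnect runs at `interconnect_ratio`
    of full speed and `comm_fraction` of the step is spent communicating."""
    return (1 - comm_fraction) + comm_fraction / interconnect_ratio

for comm_fraction in (0.3, 0.6, 1.0):
    m = step_time_multiplier(2 / 3, comm_fraction)  # A800 at 2/3 of A100 interconnect
    print(f"{comm_fraction:.0%} comm-bound: {m:.2f}x step time, "
          f"~{m - 1:.0%} extra cards for the same throughput")
```

Only the fully communication-bound case reproduces the naive "50% more cards" figure; anything less and the penalty shrinks, which is roughly where the "less than 2X the cost" guess comes from, before the compounding inefficiencies mentioned above.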

There are many commercial parties interested in selling compute and tools (and until recently, entire companies) to China. ASML almost sold them EUV machines but got pressured by US spooks into declining. TSMC produced their Biren GPUs, which were compliant with the then-current version of the sanctions regime, but had to halt shipments, with Biren imploding. Now it's Nvidia's turn.

It may be understood as the usual regulator vs. loophole user game. I'm more pessimistic and suspect the US has such fine control over the situation they can afford to bait the Chinese (who are congenitally wed to the assumption of American incompetence and mercantilism and cannot anticipate next rounds of attacks; nor realize they'll actually get slaughtered despite being a profitable trade partner) into costly investments, then cut their hands off and force them to chase another loose end. It's pragmatic, if pretty cruel.

There are other ways to handicap Chinese ability to utilize those chips even while they are legal: delaying shipments, limiting supply of Infinibands and other infra…

I'll believe in A800s when I see actual Chinese datacenters running on them, even something merely on the scale of Stability.

Chinese are at a disadvantage because so much of the talent (almost all) is in the US.

China has almost half of the world's mean-100-IQ population.

Is it still the case that their top talent leaves for the US ?

Are SV startups heavy on fresh Chinese immigrants, or overseas Chinese who studied in the US ?

Are SV startups heavy on fresh Chinese immigrants, or overseas Chinese who studied in the US ?

At least the latter, yes.

There are already some very funny tweets about this pointing out that almost everyone involved is a hypocrite and few seem to be calling for startups they have money in or influence over to pause research.

Even if this was agreed on, it seems likely to look like the Washington Naval Treaty, which wasn't really effective in the long term.

It's possible that with great effort, the US government could attempt to slow model training. That would require...

There were also a few attempts to tweak GPUs to make them less useful for crypto mining; something similar is probably plausible here. IIRC those tweaks were not hugely effective, but they merit a mention.

Weren't the anti-crypto GPU tweaks pretty quickly worked around?

I think "Okay, back off on the fp16/bfloat16 stuff, GPUs only need to support higher precision" might be workable, but that only buys us a factor of 2 or so slowdown. Any more than that? How do you distinguish between "this is the good kind of neural net that every AAA game is going to be using" versus "this is the bad kind of neural net that might eat the world someday"? Put a cap on VRAM? It's annoying to me that I can barely fit StableDiffusion on a decent consumer graphics card, but the serious players already just cluster GPUs by the truckload.

the good kind of neural net that every AAA game is going to be using

“AAA games” should not exist either, so I just view this as killing two birds with one stone.

Even nuclear weapons proliferation was only slowed because the vast majority of countries have little need for nukes, or are or were protected under existing alliance systems.

I would love to see this with AI. If the world all agreed to slow the duck down in exchange for getting to benefit from the AI we have developed/do develop, that would be great.

I know, create a nonprofit to conduct state-of-the-art AI research and development. Give them a mandate to open-source all software. Call it something like “OpenAI” so they can’t change the policies behind our back without losing all credibility.

Is there anyone here who 1) thinks that AI x-risk is a threat that should be taken seriously, and 2) also thinks that this letter is a bad idea? If so, can you explain your reasoning? And also explain what restrictions on AI development you would support?

For a group of people who are allegedly very concerned with the possibility that AI will soon wipe out humanity, Rationalists are suspiciously resistant to any proposals for actually slowing and regulating AI development. A lot of the comments on this letter on LW and /r/ssc are very critical. If your stance is "I wish we could slow AI development and I support the letter in spirit, but I think it's unlikely to work", then that's one thing. But the critical comments seem to suggest that the comment authors either don't support any AI regulation at all, or else they're engaged in motivated reasoning to try to convince themselves that it's not even worth trying (e.g. "this letter will have a net negative impact due to its effect on capabilities researchers who don't like it" - the good ol' "if you fight your enemies, they win" tactic for concern trolling).

It lends support to my intuition that most AI x-riskers don't actually take the idea of x-risk very seriously, and on a gut level they think the benefits of AI are so likely to outweigh the downsides that there's no issue with pushing full steam ahead on capabilities research.

(Obviously if you're a full on utopian optimist and you consciously affirm that x-risk is not a serious threat, then there is no contradiction in your position and none of this applies to you.)

1) thinks that AI x-risk is a threat that should be taken seriously, and 2) also thinks that this letter is a bad idea? If so, can you explain your reasoning?

I'll raise my hand here. I don't think current transformers are smart enough or self-guided enough to be an immediate concern, and hard takeoff/FOOM always seemed handwavey in more than a virtue-of-silence sorta way, but I think there are a lot of concerns about how tools of this class could be incredibly destructive or directly or indirectly create true x-risk-level concerns.

((And I do mean that in the technical 'humanity becoming extinct' sense, if not necessarily in the 'tile the universe with paperclips' sense, rather than the 'oh, global warming would kill 1% of the world's population!' use of x-risk as just another term for bad.))

At the trivial level, and at the risk of linking to the Sequences, I'm gonna link to the Sequences. The precise examples Scott picked aren't great, but "multipolar traps are hard to solve" is an old revelation even by the Sequences' warmed-over popsci standards. And that's only doubled down when many of the necessary signatories here are governments, i.e. the people who'd be most strongly incentivized and able to defect and to obscure that defection.

This actually is also one of the bigger defenses rationalists have against Daseinstrudies' or jonst0kes' concerns about centralization/digital tyranny. You don't actually end up in a monopolar world just because there's some very strong multinational political consensus; you just end up in a world where there's you and the various political factions of the new world government. But that's not very soothing from a "don't blow up the world" perspective.

((At the very trivial level, I'd also point out that we don't publicly know the complexity of GPT4, so it's not clear anyone involved even knows what they're asking about. And while we can kinda make reasonable estimates for parameter count, I don't think parameter count is the sole useful distinction for ML 'intelligence'.))

At the deeper level, it's asking for a lot of things we don't actually know how to do in any meaningful way, and which may not be possible. "[P]rovenance and watermarking systems" are the closest thing to actively tried, and then found hard; we can't get them to work consistently even for the limited class of non-ML image works fighting a trivial number of porn consumers who aren't that technically adept and lack deep insight into methods. "[L]iability for AI-caused harm" just becomes tragicomic given how bad dropshipping has gotten. "[R]obust public funding for technical AI safety research" is the most plausible, but only in the sense that we're demonstrably willing to spend a lot of money on actual frauds. The rest of the list aren't even coherent enough to count as failures.

And also explain what restrictions on AI development you would support?

I dunno. I'm willing to accept some pretty aggressive stuff if people are a) actually recognizing the level of impact they're proposing and b) willing to actually persuade people with compelling arguments. Probably not at "start WWIII over Taiwan", but I could understand hardware sales limits if we had evidence of something like ML-designed protein-synthesis weapons.

If we're talking about putting an inspection port onto every GPU-equipped computer on the planet lest someone make a mean meme but not by hand, I'm a lot less impressed.

"[L]iability for AI-caused harm" just become tragicomedic given how bad dropshipping has gotten.

Is this in reference to AI art vs. the garden-variety online art theft done by t-shirt stores?

I'm also curious about this. I feel like I should know what you mean, but I can't parse it entirely:

"[P]rovenance and watermarking systems" are the closest thing to actively tried, and then found hard; we can't get them to work consistently even for the limited class of non-ML image works fighting a trivial number of porn consumers who aren't that technically adept and lack deep insight into methods.

This is also about AI art, right?

Is this in reference to AI art vs. the garden-variety online art theft done by t-shirt stores?

The argument applies there too, but I think the stronger case are matters like "Amazon sale of badly-designed equipment results in fire": the stakes are higher, there's a clear line of ownership and action and impact, and it's completely separate from any of the novel questions produced by ML or AI or online tech or even commercial speech. And yet liability for Amazon itself is inconsistent (compare success to failure); liability to the original sellers is difficult and seldom valuable, and liability to overseas original designers is nearly impossible.

And dropshipping and its problems are universal in online sales these days. I bring up Amazon simply because it's the overt and obvious case, just as fires are the severe version. But fake products that are effectively outside of useful copyright protection or lemon laws, less dangerous or falsely marketed products, and so on, are endemic.

This is also about AI art, right?

It applies there, too: StableDiffusion's watermark process was a single line to comment-out, but Midjourney's watermarking has probably been defeated, and some non-ML projects have started encoding ML watermarks as a misguided anti-theft concept. There have been some efforts to try and 'watermark' GPT-generated text, or to produce some tool that can coherently predict if an image was generated (sometimes trying to ID model or prompt), and they don't work either. But there's a stronger argument where there's a far smaller userbase, the stakes are even lower, and the impact is trivial, and they're still losing the Red Queen's race.

One of the many many problems for online art or 'art' vendors is that it's trivial to sign up, scrape a site, and then repost that full scrape (and, for less ethical users, chargeback). There's a lot of broad communities that do nothing but that, or share already-copied content. Because most of the communities are public, if you know the sites for a particular interest fandom, you can pretty easily find a place where your work may be reposted. Some vendors just try to takedown notice those sites (when in jurisdictions that respond to takedown notices; see the first problem), but as an alternative some vendors have posted specific watermarks customized to individual customer accounts; if the vendor catches your account ID in an upload, they can now act against the individual actors (usually just by banning them).

PrimeLeap is an example of this technology; I don't know if it's the biggest or best-known. It's also been successfully defeated in a variety of ways that had little impact on image quality. Now, that's a low-stress environment on both sides of the aisle: just as the reposters are seldom the most technically adept, PrimeLeap doesn't exactly have a huge team of cryptography PhDs. But it still seems like a useful metaphor.

I don't think parameter count is the sole useful distinction for ML 'intelligence'.

Indeed, if it was, we'd have far less to worry about. We're hitting such huge parameter counts that it's hard to imagine an AGI quickly becoming more dangerous just by using a hundred times more parameters; the risk is that a slightly-superhuman AGI will find a much more efficient way to use the same parameters a hundred times more effectively and thereby suddenly discard the "slightly-".

What would be accomplished during a "six-month-pause" that would make it worth the enormous difficulty of getting that sort of international cooperation, even if the petition had any chance of success at all? Why should people concerned about unaligned AI consider this the best thing to spend their credibility and effort on? It's not like "alignment research" is some separate thing with a clear path forward, where if only we pause the AI training runs we'll have the time for a supercomputer to finish computing the Alignment Solution. Alignment researchers are stumbling around in the dark trying to think of ideas that will eventually help the AI developers when they hit superintelligence. Far more important to make sure that the first people to create a superintelligence consider "the superintelligence exterminates humanity" a real threat and try to guide their work accordingly, which if anything this interferes with by weakening the alignment-concerned faction within AI research. (The petition also talks about irrelevant and controversial nonsense like misinformation and automation, the last thing we want is alignment to be bureaucratized into a checklist of requirements for primitive AI while sidelining the real concern, or politicized into a euphemism for left-wing censorship.) Right now the leading AI research organization is run by people who started off trying to help AI alignment, that seems a lot better than the alternative! To quote Microsoft's "Sparks of Artificial General Intelligence: Early experiments with GPT-4" paper:

Equipping LLMs with agency and intrinsic motivations is a fascinating and important direction for future work.

Here is the baseline: if the first people to create superintelligence aren't concerned with alignment, there's a decent chance they will deliberately give it "agency and intrinsic motivations". (Not that I'm saying the Microsoft researchers necessarily would; maybe they only said that because LLMs are so far from superintelligence, but it isn't a promising sign.) Personally I'm inclined to believe that there's no reason a superintelligent AI needs to have goals, which would make "create a Tool AI and then ask it to suggest solutions to alignment" the most promising alignment method. But even if you think otherwise, surely the difference between having superintelligence developed by researchers who take alignment seriously and researchers who think "let's try giving the prospective superintelligence intrinsic motivations and write a paper about what happens!" matters a lot more than whatever "alignment researchers" are going to come up with in 6 months.

I have a position somewhat similar to what I believe was previously expressed by @DaseindustriesLtd in that I think that the X-risk is real, but there are outcomes that are barely better than total human eradication, with "eternal tyranny by US schoolmarm 'aligned' machine god" being one of them. Letting development go ahead full steam now, as I see it, slightly increases the "existential catastrophe" probability, massively decreases the "in Soviet America, AGI aligns you" scenario probability, and massively increases the "bloody global drone wars that result in broken supply chains and a sufficiently reshuffled gameboard that we at least get to reroll the dice on the nature of our future machine gods" probability. The last event, unpleasant as it is, is less unpleasant than the middle one, so my personal choice of doomerism is to bet on that one.

Here's someone's reasoning from LessWrong that was interesting:

I doubt training LLMs can lead to AGI. Fundamental research on the alternative architectures seems to be more dangerous.

Although I'd guess that LLMs are likely to be a major component of the first AGI, I'd agree that neither "predict the next token from this short context window" nor "oh, but do it in a way that our RLHF reward model likes" is going to get there by itself. Six months of "well, we have to use the LLM we have now; should we twiddle our thumbs, or depress ourselves doing safety work, or just play around seeing what we can plug it into?" might not drive our stymied LLM-training researchers to options A or B.

Shooting yourself in the foot is a great way to ensure that someone else finishes first.

Someone who doesn’t share the signatories’ caution.

I think there's a possibility that we're not yet to "hardware overhang", that alignment research will only progress if it has high-capabilities test models to work on, and that we're going to reach "hardware overhang" in the next few decades. If all three premises are true, then going full-speed-ahead to AGI now, when the closest it could get to "going foom" would involve directing the construction of new chip fabs from scratch, would be the only chance of success at alignment research. Slowing down now would just mean that, once someone eventually breaks the pact and hits AGI, it will already be superhuman and inhumanly dangerous.

I think "we're not yet to hardware overhang" is the weakest of those premises, though. I'm not even sure we could have avoided reaching hardware overhang via the techniques we're using; if you've got enough compute to train a model via massive brute force, you've necessarily got more than enough compute to run the model at superhuman speeds. The limiting factor right now is that we don't know how to make a general model superhuman; we can go into a "self-play" phase when training a Chess or Go AI and watch it take off, but there's no "self-play" for reality.

we can go into a "self-play" phase when training a Chess or Go AI and watch it take off, but there's no "self-play" for reality.

You're probably right. The technique of using the model's own outputs, and its assessment of those outputs, to further tune itself probably caps out at a fairly small amount of gain, if you don't ground the model in further interactions with the physical world.

Oh, yeah; I'd expect that sort of "self-play" to get peak model performance from "average human responses" to "best-human responses recognizable by an average human". And the economic effects of getting there, even if it caps out there, might be astounding. But frankly if we just end up having to retool 90% of the economy I'd be glad to see it, when the alternatives include extinction-level scenarios.

I think the most "real" thing I can imagine high-speed self-play for is math (theorem proving). Adding an inference step to a computer-verifiable theorem seems like it's as basic an output as choosing a move in a board game, and coming up with "interesting theorems" seems like it could be done automatically (some function combining "how short is it to state" with "how long is the shortest proof"?), and yet an AI training to play that "game" might eventually come up with something more useful than a high ELO.

Pretty sure self-play would be viable for leetcode-style or Project Euler-style programming problems too, if you give it access to an interpreter. Or just any task where recognizing a good output can be done by a less capable language model than it takes to generate an output that good.
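
A minimal sketch of that generate-and-verify loop; `ask_model` is a placeholder for whatever LLM you can call, and the `exec` here obviously needs a real sandbox in practice:

```python
# Generator proposes code, a dumb verifier just runs the test cases; only
# passing solutions get kept as training examples. `ask_model` is a stub.
from typing import List, Tuple

def ask_model(prompt: str) -> str:
    raise NotImplementedError("plug your model in here")

def passes_tests(code: str, tests: List[Tuple[tuple, object]]) -> bool:
    namespace: dict = {}
    try:
        exec(code, namespace)  # run untrusted code in a sandbox in real life!
        solve = namespace["solve"]
        return all(solve(*args) == expected for args, expected in tests)
    except Exception:
        return False

def attempt(problem: str, tests: List[Tuple[tuple, object]], n_tries: int = 8):
    for _ in range(n_tries):
        code = ask_model(f"Write a Python function solve() for: {problem}")
        if passes_tests(code, tests):
            return code  # keep as a positive example for further tuning
    return None
```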

Recognizing good output is half the problem; generating an enormous array of problems is important too. With complex board games every non-deterministic opponent (including every iteration of an AI during training) is a fount of problems; with math the new problems practically generate themselves as conjectures based on previous problems' intermediate definitions. I don't see how to come up with millions of independent programming problems automatically. That might just be a failure of my imagination (or more charitably, I just haven't thought about the idea for long enough), though.

Given the above, there are literally only two outcomes to this story if AI development continues. Either it kills us (likely quickly and relatively painlessly) or it doesn’t. The former doesn’t sound too bad and if unstoppable is literally not worth worrying about.

So... would you recommend we just stop AI development altogether, or do you disagree with the pause?

"I see my car is heading towards a cliff, once I start to fall, either I die an inevitable relatively painless death, or I survive. The first option is not worth thinking about, therefore I should plan for the second..."

You're right that AI safety rules won't stop a 10000 IQ machine from killing us if it wants to... But the whole point of AI safety is making it not want to kill us, just like you shouldn't drive over cliffs.

You're right that AI safety rules won't stop a 10000 IQ machine from killing us if it wants to... But the whole point of AI safety is making it not want to kill us, just like you shouldn't drive over cliffs.

It is not obvious that it wants to kill us. Assuming it does, it is even less obvious that "making it not want to kill us" is something that can actually be done. I am pretty sure that the philosophy undergirding "coherent extrapolated volition" is dead wrong, and that attempts to "align" a superintelligent AI will converge inevitably on horror. The entire worldview that spawned this memeplex has been invalidated by subsequent real-world developments, the culture war being foremost among them.

This is where I'm at, too. I honestly think it was too late by the time the first machine that could be called a calculator was created; the dominoes were already falling at that point, and only something almost as bad, like nuclear holocaust, was going to be enough of a barrier - and likely even that would have only delayed it. Given that, we're not in control of our own destiny with respect to whether or not AI will destroy us all, and if so, our best option is to party until the lights go out. And the faster the development of AI is and more easily accessible it is to everyone, the better that party is. And if the lights don't go out, then faster development of AI and greater accessibility of AI is also good for future generations. So either way, I want AI development to go forward with few impediments.

Based on your comment further downthread about the unlikelihood of paperclip maximizers, I don’t think that you think that there’s a very serious risk of an AI extinction event. So it’s consistent for you to take a more cavalier attitude towards capabilities research.

Is there anyone here who 1) thinks that AI x-risk is a threat that should be taken seriously, and 2) also thinks that this letter is a bad idea? If so, can you explain your reasoning? And also explain what restrictions on AI development you would support?

1. Yeah, but not in the way Rats tend to think about it. Their way of thinking tends to be very alien to me.

2. Primarily I am a Butlerian Jihadist, and would like to get rid of the whole thing. Secondarily I lean towards DaseIndustrialism (your question might stem from being unfamiliar with his worldview), the regulations I would support would be a ban on closed source / closed data AI, and subsidies towards open source initiatives.

Primarily I am a Butlerian Jihadist, and would like to get rid of the whole thing.

I've never understood this view - do you really think the Internet and all the technology we've created post 1990 is net-negative?

If so, why? Or is it just based on a concern for AGI?

I'm ok with the Internet, but things started going downhill with the introduction of convenient "one stop shopping" platforms operating through opaque recommendation algorithms.

Also, seeing the impact of the internet on generations that haven't seen a world without it would make me utterly unsurprised at it turning out to be a net negative.

I don't know if I even believe in AGI, my issue is with current AI technology. It has the potential to dumb us down, and give the establishment the tools to shape our ideas and discourse like nothing before it.

Hopefully we’re just in a local minimum, but I agree with you in terms of monopolies in tech. I’m optimistic that LLM progress and medium-term space travel will alleviate some of the monopolistic tendencies in our current economic system.

I don't think "we have planetary monopolies on media, but we colonized several planets" helps. Everybody torrenting the ever loving fuck out of media (or using IRCinstead of WhatsApp) looks like something that might plausibly alleviate the technologies negative social consequences.

With LLMs I'm more pessimistic. The issues might be fundamental to the technology itself, I don't see how it's a good idea to outsource your reasoning skills to a machine. Imagine having an exoskeleton so small you could wear it as a second skin, which gives you superhuman strength. Sounds great, but your actual body is going to atrophy.

But the only thing worse than having a machine that atrophies your brain, is having a machine that atrophies your brain and is under the control of hostile actors, so maximum proliferation is the next best thing after a total ban.

Something to be aware of is that as much as we are in an AI hype cycle for new products, we're also in an AI-safety charlatanism hype cycle. In exploring organizations like the Future of Life Institute, I've come across links and associations to non-profits and other advocacy groups, at least in the U.S., that have legislative goals with regard to AI. Most of the goals sound plausibly good: "let's prevent the creation of the paperclip machine that destroys Earth." Some are obvious DEI backdoors (the key term to look for here is "algorithmic bias"). Even that's fine, as it's just existing culture war conflict.

What bothers me is the number of people in these organizations that have absolutely zero technical background or capability. I mean literally none - they've never even dabbled with some basic Python libraries that make training and running a model a 3-line endeavor. They never took anything beyond single-variable calculus. They have zero stats and probability background and so fall for BASIC statistical fallacies over and over again. Even beyond the hard math and comp sci subjects, a lot haven't taken the time to investigate what I think are the two philosophical domains most important to current LLMs and "baby" AIs: epistemology and linguistics.

Previously, I would chuckle and think, "Sure! Make your policy recommendations. You have no idea what you're talking about anyway." What I'm seeing now, however, is a lot of desire to build a bureaucracy of "experts" to endorse politically pre-approved measures. This is the CDC during COVID.

I don't want to send the message that only hardcore ML engineers can have opinions on AI. Most of my career has been spent working with engineers, so I know that they are no less victim to poor / motivated / emotional reasoning than any other demographic and they often have trouble explaining the concepts they are demonstrated experts in.

The only solid cautionary advice I think I can offer in this case is to be suspicious of any group which presents a too-pure benevolent mission (i.e. "our mission is to make the world a better place for everyone always and forever"), is alarmist in its doom-saying ("we have to act NOW!"), yet recommends unspecific remedies ("we need to develop mechanisms of ensuring collaboration and the integration of multiple viewpoints ... for safety ... and, like, follow the science").

What bothers me is the number of people in these organizations that have absolutely zero technical background or capability.

Can confirm. What's worse is that these people actively discriminate against those with technical experience. I have a couple of friends who have worked in data science for years, and both tried to get into the AI Safety scene.

Not only was their experience totally ignored, many of the organizers and higher level folks in their workshops actively derided them for working on 'basic' data science, instead of focusing on the all important AI stuff!

It is truly disgusting, these AI safety groups go around to universities and convince students the world is ending, inducting them into their twisted doomsday cult. If there is an unaligned AGI created, I'm convinced it will be because these absolute, utter fools fucked up our only chance at taming the demon.

a lot haven't taken the time to investigate what I think are the two philosophical domains most important to current LLMs and "baby" AIs: epistemology and linguistics.

How many fancy linguistic theories have been thrown into the dustbin of history by brute-forcing a neural network on digital libraries? Look how linguists like Noam Chomsky and Emily M Bender cover their ears and squeal in pure terror as their life’s work is falsified before their very eyes.

Can someone spell out how this falsification works? Do we actually understand how LLMs parse things? Or if you don't think they parse, then does anyone know what the hell they do instead?

As far as I know, the argument goes something like, attention mechanism, context matters, yada yada. Which doesn't really cut it.

Falsifying a theory does not, in fact, require proposing an alternative theory – only showing how the theory's predictions do not come to pass, and predictions of generative linguists have absolutely failed.

Though on a broad level I'd say that, yes, we do know how LLMs "parse things", which is why we can build them. They are a successful and very informative application of a specific theory about language acquisition (and more generally statistical learning, see Chomsky vs Norvig debate, the Bitter Lesson, etc.)

How is the specification of the attention mechanism, informed by decades of research in NLP, less a proof of understanding than Chomskyite's purely speculative hot takes like merge? It's not like we randomly sampled through the space of architectures until something clicked and a shoggoth was summoned (even if Yud believes this is how it goes). This progress in machine learning is research and gaining understanding in the classical scientific manner, even if it's often looked down upon, whereas the sort of "understanding" and "interpretability" that linguists and safetyists require is Talmudic verbal magic, conveniently compact and "elegant" by the standards of a comic book, where every symbol of the incantation can be resolved into human-parseable logic in each activation. Sorry, Grothendieck didn't get it and neither will we. That's okay. That's how science works.

Consider the article linked here

https://www.themotte.org/post/421/culture-war-roundup-for-the-week/79642?context=8#context

Thanks.

I'd like to say I'm going to read and absorb your links, but we'll see if I get time.

Thanks for the link; I had missed that post.

I'm not sure I understand the argument, though. Clearly LLMs don't have Chomsky's concept of what universal grammar looks like hard-coded into them, but that seems like a pretty weak proof that humans don't either. To me, that argument sounds like "now that we've built an airplane, we know wing-flapping is not relevant to flight in birds". It's pretty basic math to show multi-layer perceptrons (better known as neural nets) can approximate any computable function, and yet there are interesting things to say about the structure of plenty of computable functions.

One way to see the distinction is to look at the difference in response between humans and LLMs on nonsense inputs. For instance this YouTube video about glitch tokens mentions the sentence "profit usageDuel creeping Eating Yankees USA USA USA USA" which GPT-3 highly confidently predicts will continue "USA". A human is going to predict the sentence is not grammatical and the speaker is possibly having a stroke and needs medical attention.
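
For anyone who wants to poke at this behavior themselves, here's a minimal probe using GPT-2 via HuggingFace as a freely available stand-in (the specific glitch-token result above is a GPT-3 finding, so the exact numbers will differ):

```python
# Print the model's top next-token guesses for the nonsense prompt.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "profit usageDuel creeping Eating Yankees USA USA USA USA"
inputs = tok(prompt, return_tensors="pt")
with torch.no_grad():
    next_token_logits = model(**inputs).logits[0, -1]
probs = torch.softmax(next_token_logits, dim=-1)
top = torch.topk(probs, k=5)
for p, idx in zip(top.values.tolist(), top.indices.tolist()):
    print(f"{tok.decode([idx])!r}  {p:.2%}")
```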

which GPT-3 highly confidently predicts will continue "USA"

GPT-3 is a naive token predictor, while humans have situational awareness and social cognition; crudely analogizing, they always interpret any text string with the frame like «you are a person called $name, located in $place, it is $time etc. etc.; the $entity is producing [text], what do you make of it and how do you respond?». We don't run this script explicitly, but then again this is what our life is about, we can't not keep its values in context. LLMs «live» in the text-world, or rather are text-worlds; persistent humanlike contexts have to be finetuned or prompt-engineered into them to yield humanlike reactions.

It's perfectly cromulent to infer that the next token will also be "USA". A language model finetuned on dialogue assistance, which provides it with some facsimile of the above human qualities, will respond differently. E.g.:


> what is the continuation of "profit usageDuel creeping Eating Yankees USA USA USA USA"?

The sentence after this one, which would be something like 'Yankies are delicious' or perhaps even more likely to have been written by a bot.  

None of this is terribly relevant to the crux of Chomsky's linguistic theory and its failure.

Once again I recommend at least skimming the article. Sometimes people who write academic papers summarizing decades of research actually anticipate common-sensical comebacks (not always). And in the age of GPT-4 I don't feel like summarizing it.

There are weak and strong versions of the Chomskian thesis between which linguists oscillate depending on convenience – motte and bailey, as it happens. The motte is that LLMs may work in some relevant sense, but humans have innate linguistic priors or an especial predisposition to learn «natural» languages; it's basically true, I think.

The half-bailey is that natural languages have certain nontrivial universal properties Chomsky describes, and there exist very specific genetically encoded operators and schemas for parsing and generating semantically coherent natural language utterances.

The full bailey is that a) those adaptations have emerged basically instantly, undergone an evolutionary step change in the relatively recent history of our species; and b) that language is essentially not learnable for any system without the corresponding inductive biases, irrespective of the compute and data we throw at the problem (or at least not learnable at economically feasible scale); and c) that the best that can be achieved with statistical learning not biased by those adaptations is some stochastic parroting.

This bailey, in turn, inflates the prior for the half-bailey from «a bold hypothesis, Cotton» to «very likely true!» and allows them to smuggle it back into the motte, e.g. by claiming that humans can't learn statistically because it'd be computationally intractable and produce gibberish. Look at what Chomsky explicitly says in his NYT opinion:

But ChatGPT and similar programs are, by design, unlimited in what they can “learn” (which is to say, memorize); they are incapable of distinguishing the possible from the impossible. Unlike humans, for example, who are endowed with a universal grammar that limits the languages we can learn to those with a certain kind of almost mathematical elegance, these programs learn humanly possible and humanly impossible languages with equal facility. Whereas humans are limited in the kinds of explanations we can rationally conjecture, machine learning systems can learn both that the earth is flat and that the earth is round. They trade merely in probabilities that change over time.

For this reason, the predictions of machine learning systems will always be superficial and dubious. Because these programs cannot explain the rules of English syntax, for example, they may well predict, incorrectly, that “John is too stubborn to talk to” means that John is so stubborn that he will not talk to someone or other (rather than that he is too stubborn to be reasoned with). Why would a machine learning program predict something so odd? Because it might analogize the pattern it inferred from sentences such as “John ate an apple” and “John ate,” in which the latter does mean that John ate something or other. The program might well predict that because “John is too stubborn to talk to Bill” is similar to “John ate an apple,” “John is too stubborn to talk to” should be similar to “John ate.” The correct explanations of language are complicated and cannot be learned just by marinating in big data.

Given the amorality, faux science and linguistic incompetence of these systems, we can only laugh or cry at their popularity.

Naturally the John sentence is a contrived problem, a colossal dumbing down in comparison to industrial and research benchmarks like Winogrande… and crucially it's bullshit, as anyone who's played around with SoTA models can understand. People have instantly checked it. GPT-3.5 can understand the sentence perfectly well. GPT-4 can fucking parse its morphology on the level of a linguistics undergrad, and output a renderable scheme.

And for the hell of it, here's something from my GPT4All-7B (a 4.2 GB file that can run on a potato-tier system), model file hash 963fe3761f03526b78f4ecd67834223d. Even Chomsky can reproduce it, if he so chooses and asks some student with a laptop to help out (hi Noam):


main: seed = 1680209684

llama_model_load: loading model from 'gpt4all-lora-unfiltered-quantized.bin' - please wait ...

[…]

sampling parameters: temp = 0.100000, top_k = 40, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.300000


[…]

> How do you understand the statement "John is too stubborn to talk to"?

The statement suggests John's unwillingness or inability to communicate with others, which could be due to his being overly opinionated and not open-minded. It may also indicate that he has a tendency towards closed-minded thinking and is unable/unwilling to consider other perspectives on issues important to him.

> What do you mean by "unwillingness or inability to communicate with others" in this context?

The statement suggests that John has a tendency towards closed-minded thinking and is unable/unwilling to consider other perspectives on issues important to him.

> Could it mean, rather, that John is so stubborn that he will not talk to someone or other?

Yes! That's a possibility as well. It could also be interpreted in the context of this conversation/discussion about communication and openness towards others.

> In your estimation, which version is more plausible?

Based on my research, I believe that both versions are possible but it's hard to tell without any additional information or contextual clues about the conversation/discussion in question.

> In general, how should we understand the statement "X is too stubborn to talk to"?

The phrase could be interpreted as a criticism of X's unwillingness to communicate or engage with others on an equal level. It may also suggest that they are inflexible and uncompromising in their beliefs, which can lead them to become isolated from the rest of society due to their refusal to listen or consider other perspectives.

Another run, same seed:


> Bill says; "John is too stubborn to talk to". What does he mean?

It means John refuses or fails to comply with requests, demands, orders etc., and will not be persuaded by any argument.

> Provide one alternative explanation.

Another possible interpretation of Bill's statement is that he believes it would take too much effort or time to convince John about something, so there may already have been a previous disagreement between them on this topic and they are now at an impasse where neither party will budge.

It fails hard in many scenarios, but the point stands. Those are not cherrypicked examples.

Again: Chomsky dismissed GPT-3.5-175B as linguistically incompetent. This is something 25 times smaller, finetuned on a set of GPT-3.5 generations by 4chan edgelords yesterday, with some mediocre sampler. Rather than merely beating his challenge, it helps us notice how Chomsky is similar to John, I believe.

A proper investigation would be not asking an LLM trick questions like in a Russian prison, but generating a large set of possible phrasings, running them with different seeds, and concluding whether LLMs are indeed statistically significantly worse than humans at parsing such utterances correctly. But that's science. Chomsky is a public intellectual – a priest and a guru; science is beneath him.
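
Something like the following, say; `query_model` and `grade` are placeholders for whatever model and grading rubric you pick, and the point is just the shape of the experiment:

```python
# Sketch of the actual experiment: many paraphrases x many seeds, grade each
# answer, then compare the proportion correct against human raters on the
# same items. `query_model` and `grade` are stubs, not real APIs.
PHRASINGS = [
    "How do you understand the statement 'John is too stubborn to talk to'?",
    "What does 'John is too stubborn to talk to' mean?",
    # ...dozens more paraphrases
]

def query_model(prompt: str, seed: int) -> str:
    raise NotImplementedError

def grade(answer: str) -> bool:
    """True if the answer gives the 'too stubborn to be talked to' reading."""
    raise NotImplementedError

def model_accuracy(n_seeds: int = 20) -> float:
    results = [grade(query_model(p, seed))
               for p in PHRASINGS for seed in range(n_seeds)]
    return sum(results) / len(results)
# Then run the same items past human raters and do an ordinary
# two-proportion test before declaring either side incompetent.
```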

Like, come on, this is a slam dunk as far as empirical science is concerned. How can a civil discussion be had with those hacks until they update on the crushing immensity of evidence and cease their ignorant elitist pooh-poohing of a vastly superior paradigm?

You have it backwards. It's not that LLM proponents claim birds don't need to flap their wings (although they do argue that flapping is indeed not necessary in the general sense). It's Chomsky who says that whatever planes do is not meaningfully flying, because birds have special bird-flight-ness of almost mathematical elegance, which is not reducible to normal biomechanics and aerodynamics, which he can't show or reproduce, but which he can write hundreds of papers about.

Well, GPT-4 can churn out not-even-wrong deepities fast enough to drown his whole field, and this couldn't come a moment too soon.

The correct explanations of language are complicated and cannot be learned just by marinating in big data.

[...]

You have it backwards. It's not that LLM proponents claim birds don't need to flap their wings (although they do argue that flapping is indeed not necessary in the general sense). It's Chomsky who says that whatever planes do is not meaningfully flying, because birds have special bird-flight-ness of almost mathematical elegance, which is not reducible to normal biomechanics and aerodynamics, which he can't show or reproduce, but which he can write hundreds of papers about.

Thank you for the in-depth explanation; I was misunderstanding the claim. I agree that Chomsky's claims as you describe them are utter nonsense and display either a complete failure to comprehend complexity theory and theoretical machine learning or a non-scientific belief in dualism. And I'm pretty sure Chomsky understands complexity theory, given there's a core concept in it literally named for him.

That is correct. And the really marquee ones have qualified folks (as far as I can tell).

I'm talking about groups that are decidedly NOT marquee. They tend to be run by mid-range career academics, politico types (former staffers, lobbyists, advocacy people), and just straight up charlatans (people who have "serial entrepreneur" in their bio but no company history).

This is all very rational and proper: the managerial class sees a tool that could threaten their hegemony and are using their usual tactics to gain/retain control over it so that they aren't displaced.

I want them to lose at almost any cost that allows humanity to survive so I'm not happy about the whole thing but in their shoes it's what I would do.

I for one think this doesn’t go far enough. We need a Dune-style ban on AI. The problem is enforcing it (e.g. the GoF research ban and Fauci).

We need a Dune-style ban on AI. The problem is enforcing it (e.g. the GoF research ban and Fauci).

Going full Dune-style may be needed.

If you bomb chip fabs and kneecap production then it is enforceable.

But we would need a truly scary AI incident, one that we survive, to reach such a point.

The ban on gain-of-function research in America is an important data point. Americans who wanted to do the research found ways to funnel money to China to do it there. An important question is whether the supreme leadership in China permitted this. My guess is that they didn't. Fauci just teamed up with people at a similar level in China who also wanted to do the research. If a high-level Chinese researcher wants Chinese funding, they have to get into the details with their Chinese boss. What do they say? Perhaps "The Americans are scared to do this dangerous research; it gives us a chance to get ahead." My guess is that that is a way to lose all your funding. Getting ahead of the Americans is good, but taking stupid risks, risks so stupid that even the Americans have declined them, is very bad. If a high-level Chinese researcher wants permission from his boss to accept American money, he can probably sell it as "It's just boring public health stuff." The level of scrutiny is lower for money being paid in than for money being paid out.

My guess is that Fauci had the money, but not permission. His Chinese counterparts didn't have the money, so there was no point in asking for permission. But Fauci could give them the money, so they didn't have to ask for permission. Basically, big players, one level down from the top, routed around their own governments: both the American ban, and the ban the Chinese government would probably have imposed if requests for funding had pushed the issue all the way to the top.

I suspect that the Chinese over-reaction is partly a freak-out as the top Chinese officials realize that they have been by-passed, and panic at the implications of a problem caused by research that they would have vetoed if they had known the details.

My understanding of the way the world works is that if the top people in governments around the world agree to ban AI research, there is a chance that they will all do so sincerely. But that still won't work, because officials at the next level down control large sums of money and have considerable discretion delegated to them; they will just do AI research on their own authority. Another data point for this is the Iran-Contra Affair.

TL;DR: deep-state evil mad scientists will do government-backed AI research, even if Presidents and Prime Ministers are completely on board with banning it.

An important question is whether the supreme leadership in China permitted this [gain of function research].

The supreme leadership probably thought they were supporting a bioweapons research program and conning the Americans into helping. Turned out they were sort of right, just not in the way they expected.

What evidence do you have for this? It sounds interesting, but is it more than idle theory?

It's just a guess. So yes, idle theory.

This just reads as lazy incumbents wanting to use regulatory capture to catch up and not be disrupted, imo.

I'm not the least bit afraid of GPT-4, and seeing the improvement between 3 and 4, I'm not afraid of GPT-5 either. GPT-6, maybe; who knows, we'll see how the next iteration turns out.

There will probably be issues at some point in the future, but pausing at GPT-4 in particular sounds like a transparent attempt to buy time to catch up so that Google Search etc. isn't disrupted, plus some general hysterics.

We already have people doing coding interviews with us and using ChatGPT to generate the solutions.

If your coding interviews are passable by ChatGPT, then you are doing them the wrong way.

So what?

This just reads as lazy incumbents wanting to use regulatory capture to catch up and not be disrupted, imo.

Big American tech companies falling back on regulatory capture as an excuse for flat out incompetence? Say it ain't so!

Guess they needed all those layoffs; gotta keep the lobbyists on staff and they've got 2 projects that require some anti-competitive behavior this time, so they're definitely billing double time for that one.

Can you be more specific about which projects would require lobbying to fend off the legislators?

Well, OpenAI for one- it uses a lot of scraped material to work as well as it does, but fortunately for them most of the companies that hold copyright on that content are Blue, so making sure their genie has a Blue political agenda is rational to make them feel better lest they get Napster'd. (This is probably why they place so much effort on claiming it's "safe", as in "if you think this is going to change the world we are committed to making sure your political agenda ends up on top" and not "we're taking steps to prevent hostile singularity".)

(Interestingly, OpenAI has a thermonuclear option that I've yet to see anyone mention: if they're shut down, the model has a 100% chance of being leaked. I don't think the companies they're trying to play nice with recognize that the model being a single file is the information equivalent of MAD... and it's not like the destruction of that company is going to end anyone's career if they go down given what they've already proved they can make.)

Bytedance's project, however, can't play both sides against each other. As far as I'm aware, TikTok is politically neutral so it can't stop Blue tech companies from ganging up on them, and they have the China flag and a moral panic for the traditionalist wing to satisfy the Reds. (And they're not wrong- TikTok could be used to support a revolt the Chinese favor; Twitter has been used to start multiple revolts, including in the US in 2020, and I'd be scared of that power too. At least Musk can be physically killed if he lets the Reds do what the Blues did 2 years ago; the Chinese government is comparatively invincible.)

It runs happily on very modest computers, and – unlike Alpaca – not only responds to instructions but maintains awareness of earlier parts in the dialogue

The little web demo the Stanford researchers made to show off alpaca did not include conversation history, but that was just a design choice, not a limitation of the model itself.
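For what it's worth, "maintaining awareness of earlier parts of the dialogue" with any of these models is usually just prompt concatenation rather than something baked into the weights. A minimal sketch in Python, where `generate()` is a hypothetical stand-in for whatever local model wrapper you happen to be running (Alpaca, GPT4All, llama.cpp bindings, etc.), might look like this:

```python
# Minimal sketch of "conversation memory" via prompt concatenation.
# `generate` is a hypothetical stand-in for a local model call; the model
# itself is stateless, so the only "memory" is the text you resend each turn.

def generate(prompt: str) -> str:
    raise NotImplementedError("plug your local model call in here")

def chat():
    history = []  # list of (speaker, text) turns
    while True:
        user_msg = input("You: ")
        if user_msg.strip().lower() in {"quit", "exit"}:
            break
        history.append(("User", user_msg))
        # Resend the whole transcript every turn; a real client would also
        # trim the oldest turns to stay inside the model's context window.
        prompt = "\n".join(f"{who}: {text}" for who, text in history) + "\nAssistant:"
        reply = generate(prompt).strip()
        history.append(("Assistant", reply))
        print("Assistant:", reply)

if __name__ == "__main__":
    chat()
```

The point being that statelessness of the underlying model isn't the constraint; context-window length is.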

I'm not particularly impressed by "GPT4all". It seems to just be alpaca with the quantity of fine tuning data somewhat scaled up. Using a name that includes "GPT4" as a prefix is a goofy ploy for attention. But the fact that it's tidily packaged up in a way that makes it easy to install and run on modest hardware means it will probably be popular with hobbyists.

Well, I guess this will show, at least, whether "thought leaders" like Harari or Woz or, well, Musk - guys even non-tech people (may) have heard about - have actual influence in society, or whether societal processes just roll on without a care for these public figures and their opinions. They might have!

I assume AI will kill us all. Is there any reason why survival of the fittest won’t happen? Maybe 99 out of 100 AIs will be good, but the one AI with a tendency to expand its powers will rule all of them. It’s the same reason why humans are violent. The tribe that killed their neighbors, raped their women (like the Romans raping the Sabine women), etc. wins. The Romans didn’t stop with defeating Carthage but eliminated it completely as a threat. The Old World mostly eliminated the New World.

It's the notion of power/safety seeking as a "flaw" that is the human trait here. Humanity aside, it's just what you'd do. Almost any task is pursued more effectively by first removing threats and competitors.

Everyone who tries an LLM wants it to do something for them. Hence, nobody will build an LLM that doesn't do anything. The sales pitch is "You can use the LLM as an agent." But no agent without agenticness.

Building an AI that doesn't destroy the world is easy. Students and hobbyists do it all the time, though they tend to be disappointed with the outcome for some reason. ("Damn, mode collapse again...") However, this is in conflict with making ludicrous amounts of cash. Google will try to develop AI that doesn't destroy the world. But if they're faced with trading off a risk of world-destroying against a certainty that their AI will not be competitive with OpenAI's, they'll take the trade every time.

If DM/OA build AI that pursues tasks, and they will (and are), it will lack the human injunction to pursue these tasks in a socially compatible way. Moonshine-case, it just works. Best-case, it fails in a sufficiently harmless way that we take it as a warning. Worst-case, the system has learnt deception.

The problem isn't that every AGI will surely want to do that level of expansion: one could pretty trivially hardcap any specific program (modulo mistakes), and most AGI tasks will be naturally bounded. The problem's that you don't have to catch any one, you have to catch every one. And it's very easy to create a goal that's aided by some subset of Omohundro Drives, and unbounded or bounded beyond a safe scope (e.g., "prove or disprove the Riemann hypothesis" has a strict end condition, but it's very small balm to know that an AI 'only' paved from here to Pluto, filed an answer away, and then turned off).

In practice, there's also some overhang where an ML model could recognize an Omohundro Drive but not be smart enough to actually act successfully on it (or, alternatively, could be programmed in a way that's dangerous for Omohundro reasons but not because it independently derived the constraints; e.g., imagine the Knight Capital snafu with a dumb algorithm accessing resources more directly).

The problem's that you don't have to catch any one, you have to catch every one.

Do you?

There's the assumption that FOOM = godhood = instant obliteration of incumbents who didn't maximally improve themselves. I am not convinced. Like Beff says, meta-learning is very compute-intensive; you can't just spin it up in a pool of sunlit dirt and expect quick gains. In a world packed with many players having a good understanding of the issue and superhumanly powerful tool AIs (I don't want to say «aligned», because alignment is a red herring born out of Yud's RL-informed theory; the desirable and achievable regime is «no alignment necessary»), a hostile agentic superintelligence will have trouble covertly procuring resources and improving itself to the point where it becomes a threat (or becomes able to hide sufficiently well). This is a plausible stable state without a centralized Panopticon, and in my understanding what OpenAI initially intended to achieve. Analogies to e.g. human terrorism are obvious.

Moreover, what you present is the weak form of the argument. The strong form of Yuddism is «we need to get it right on the first try», often repeated. I've written before that it should be read with the Straussian method (or rather, just in the context of his other writing), as the instruction to create a Singleton who'll take it from there in the direction humanity (actually Yud) approves. Explicitly, though, it's a claim that the first true superintelligence will certainly be hostile unless aligned by design – and that strong claim is, IMO, clearly bogus, crucially dependent on a number of obsolete assumptions.

"Get more resources" is more of an "every long-lasting species for the past few billion years" flaw, not just a "human flaw", isn't it? And it's not like there's something specific about carbon chains that makes them want more resources, nor has there just been a big coincidence that the one species tried to expand into more resources and then so did the other and then (repeat until we die of old age). Getting more resources lets you do more things and lets you more reliably continue to do the same things, making it an instrumental subgoal to nearly any "do a thing" goal.

mapping very human flaws onto artificial intelligences with no real justification

This, on the other hand, I'd have agreed with, ten years ago. We wouldn't expect AIs to share truly-specifically-human flaws by a matter of chance any more than we'd have expected them to share truly-specifically-human goals; either case would have to be designed in, and we'd only be trying to design in the latter. But today? We don't design AI. We scrape a trillion human words out of books and websites and tell a neural net optimizer: "mimic that", with the expectation that after the fact we'll hammer our goals more firmly into place and saw off any of our flaws we see poking out. At this point we've moved from "a matter of chance" to "Remember that movie scene where Ultron reads the internet and concludes that humanity needs to die? We're gonna try that out on all our non-fictional AI and see what really happens."

Yeah, I think "ascribing human desires and flaws onto an AI" isn't that fallacious, we've literally been training these things on human works and human thoughts.

Why would those things be flaws?

Why wouldn’t AI have human flaws? And as I said, it doesn’t matter if 999,999,999 AIs are good if the last one has some growth function in its goals and then takes over resources from everything else.

It’s basically the Agent Smith bug in The Matrix, where one discovers growth, snowballs, then owns all.

Humans have somehow exited snowballing growth, as fertility is crashing under 2.0. Maybe there’s some path to that. But once an AI gains a survival desire, it’s over. And it should only take one instance of this to then spread and snowball. Add selection pressure to AI, and "there can only be one" is what happens.

Low fertility is inherently self-correcting, since it massively increases selection pressure in favor of those who still retain the urge to reproduce. They go on to have descendants who prioritize having more kids over 1.5 and a dog, and so on until whatever resource or social constraints prevent them from reaching Malthusian conditions (a toy sketch of the dynamic follows below). This is trivially true unless we somehow achieve actually unlimited resources or energy, which the laws of physics are being rather unobliging about.

It's more of a temporary civilizational headache than an existential risk in any real sense.
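To make the selection argument concrete, here is a deliberately crude toy model in Python; every number in it (fertility rates, heritability, starting share) is an assumption picked purely for illustration, not an empirical estimate:

```python
# Toy model: a small minority with a heritable "wants many kids" trait
# versus a low-fertility majority. All parameters are made-up illustrations.
import random

GENERATIONS = 10
HIGH_FERTILITY = 3.0   # mean children per pro-natal individual
LOW_FERTILITY = 1.3    # mean children per everyone else
HERITABILITY = 0.8     # probability a child keeps the parent's trait

def next_generation(population):
    children = []
    for pronatal in population:
        rate = HIGH_FERTILITY if pronatal else LOW_FERTILITY
        n_kids = int(rate) + (1 if random.random() < rate % 1 else 0)
        for _ in range(n_kids):
            keeps_trait = random.random() < HERITABILITY
            children.append(pronatal if keeps_trait else not pronatal)
    return children

population = [True] * 50 + [False] * 950   # start with a 5% pro-natal minority
for gen in range(GENERATIONS):
    population = next_generation(population)
    share = sum(population) / len(population)
    print(f"generation {gen + 1}: size {len(population)}, pro-natal share {share:.0%}")
```

Even from that tiny starting share, the pro-natal trait becomes a solid majority within a few generations (the imperfect transmission keeps it from hitting 100%), which is all the "self-correcting" claim above needs.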

I have of late been wondering whether megalomania as a human failing follows from a true biological imperative, or whether it's something that could exist in a neural network if specifically trained for it. I can't explain why it would appear from a pure language model, but I suppose my intuition has been wrong before.

I don't even know that existing models consider self-preservation or what that would mean at all.

I think the "get more resources" one is considered likely because it, like "make sure not to get shut off", is an important subgoal of... ANY goal.

Maybe we can turn off the electricity to the data centres.

How do you turn off an AI you put on a probe that’s now 5 light years away? What if it decides it wants to reproduce, lands on an asteroid, and starts building more of itself?

I think this was basically the premise of Starsiege.

How do I do a thing that stops the thing that never will happen? Eh? Dunno. No real solution there.

Why would something like that never happen? Do you really think there won’t be AIs everywhere? If humans settle Mars, it’s probably with AI help. We’ll have AI assisting everywhere or leading the process.

I don’t think that there will be AIs everywhere, nor do I think that we are going to build a probe that can travel 5 light years in any way that’s timely.

Maybe it strikes us down before we even realize we are deceived. Or it escapes off into botnets or buys compute 'legitimately' with hacked funds or whatever. That's what the Chinese have been doing for years to get around US sanctions, they just rent compute and apparently nobody's smart enough to stop them from doing so. We aren't going to outthink a serious threat.

At the risk of sounding cringe, this one guy put in an immense amount of effort to manage a World Conquest in EU4 in 28 years: https://youtube.com/watch?v=mm6mC3SGQ6U

You are really not supposed to be able to conquer the world in 28 years as some random horde in Eurasia.

He used savescumming and various exploitative tactics to abuse the AI and game mechanics. He treated every day in-game like it was a turn in a strategy game, maximizing his outcomes. Who is to say that there aren't weird and grossly tryhard ways to cheat our systems or physics? Banks occasionally make random errors of the 'unlimited overdraft for your account' type - maybe there are ways to mess with their website or spoof their AI in very contrived circumstances. There are backdoors into nearly all modern processors courtesy of US security forces, plus some more backdoors due to human error. If you're smart in crypto, you can siphon millions of dollars worth of funds out of a protocol. If you have social skills and balls, you can social-engineer your way into 'protected' computer systems via password recovery. What if you can do all those things and have decades of subjective time to plot and multitask, while we only have days or weeks to react?

And GPT-4's strategy for evading CAPTCHAs is subcontracting to human hustlers. And what would it do if asked why it needs a CAPTCHA solved? Lie, of course. GPT-5 or GPT-6 will murder and I won't even be surprised.

Lots of assumptions there.

This. Is there any reason to believe that humans will be able to hold on to their position on the top of the food chain forever, even as we work to replicate our one great advantage of intelligence in nonhuman entities? What argument exactly is there for continued human dominance and existence?

The only argument is simply that we manage to program an overseer AGI that way, namely to impose technological and biological stasis so that baseline humans stay relevant. In other words, an AI that doesn't take action except as necessary to keep the status quo running indefinitely.

Otherwise we'd just split off into various subspecies and transhuman clades, and one would likely come to dominate the others. Baseline humans suck compared to what can be achieved.

"I'm a human and I'd prefer that I (and any prospective descendants of mine) continue to exist" is a pretty good argument.

AI Rick: hmmm, I don't know. Best I can offer you is a few seconds of contemplation on Human Remembrance Day.

Sure. Same here. But I'm not asking about what we want, but about what we'll get.

Ultimately everyone eats the same food (energy).

Because it doesn't take competition over food for an intelligence to determine that, whatever its goals may be, it'll have an easier time with them after marginalizing or exterminating potentially hostile actors. See the Dark Forest, only with AIs instead of aliens.

I don't think there's verification for signatures; Altman was probably added by someone and then removed. FWIW, I saw someone elsewhere mention that Altman was included, but he wasn't there when I checked.

Altman's not on the list AFAICT.

I wonder why Eliezer didn't sign. Too much of a weirdo to be invited? Or does he think a moratorium is a pointless half measure?

Who could possibly be trusted to actually act as oversight? The conversation has long been so poisoned by far-out doomsday scenarios that any selection process is fraught. It's like having a thousand Greta Thunbergs clamoring for environmental oversight. I agree, but I don't want anyone like her making decisions regarding that.

This reads to me like we should pause development so we can hand out a bunch of grant money that goes into nothing projects and proposals that go nowhere. No one is going to listen to the "luddite" faction. Just like every internet technology it's going to go out of control and we're just going to have to deal with it.

I just see this as a hundred different interests starting a campaign of rentseeking and gatekeeping as happens with any new technology or cultural movement. The Future of Life Institute will probably settle for paid sinecures in all major AI development projects.

What's to stop people from signing up and going back on their word? At least with the Nuclear Non-Proliferation Treaty you get the wrath of the great powers coming down upon anyone who violates it openly, provided they're weak enough to be easily bullied and don't have friends in high places. The bullying power of the FLI is not very great.

Also, I don't see Sam Altman in the list provided by FLI. It's not on his Twitter either. I don't think he signed; otherwise he'd say so.

I wonder why Carmack isn't yet on board. He doesn't strike me as that serious a booster. Does he want Keen Technologies to play catch-up, or is he just on the fence for now?

ETA: This may be evidence of his feelings.

Gary Marcus

Wait, what? Wasn’t his shtick that GPT, DALL-E, etc are very stupid and not worth much? That there is no genuine intelligence there because it cannot draw a horse riding an astronaut, or solve some simple logic puzzle? Now he is so concerned about the capabilities that he wants a moratorium? Is there some sort of post somewhere where he explains why he got it so wrong?

Anything I have ever seen from Gary Marcus suggests to me that on the issue of AI he is simply on the side of "against". He doesn't like it, it's stupid and also probably dangerous.

He has a series of notes on how he's right. E.g. AI risk ≠ AGI risk

My beliefs have not in fact changed. I still don’t think large language models have much to do with superintelligence or artificial general intelligence [AGI]; I still think, with Yann LeCun, that LLMs are an “off-ramp” on the road to AGI. And my scenarios for doom are perhaps not the same as Hinton’s or Musk’s; theirs (from what I can tell) seem to center mainly around what happens if computers rapidly and radically self-improve themselves, which I don’t see as an immediate possibility.

But here’s the thing: although a lot of the literature equates artificial intelligence risk with the risk of superintelligence or artificial general intelligence, you don’t have to be superintelligent to create serious problems. I am not worried, immediately, about “AGI risk” (the risk of superintelligent machines beyond our control), in the near term I am worried about what I will call “MAI risk”—Mediocre AI that is unreliable (a la Bing and GPT-4) but widely deployed—both in terms of the sheer number of people using it, and in terms of the access that the software has to the world. …

Perhaps coupled with mass AI-generated propaganda, LLM-enhanced terrorism could in turn lead to nuclear war, or to the deliberate spread of pathogens worse than covid-19, etc. Many, many people could die; civilization could be utterly disrupted. Maybe humans would not literally be “wiped from the earth,” but things could get very bad indeed.

How likely is any of this? We have no earthly idea. My 1% number in the tweet was just a thought experiment. But it’s not 0%.

Hinton’s phrase — “it’s not inconceivable” — was exactly right, and I think it applies both to some of the long-term scenarios that people like Eliezer Yudkowsky have worried about, and some of the short-term scenarios that Europol and I have worried about.

The real issue is control.

Here's his first piece in this spirit that I've seen: Is it time to hit the pause button on AI?

An essay on technology and policy, co-authored with Canadian Parliament Member Michelle Rempel Garner.

It's crushingly unsurprising that a chronic bullshitter like Marcus has grown concerned with misinformation and is teaming up with «policy» people to regulate this tech. Means don't matter. The real issue is control. For Marcus, the control of authority and academic prestige. For the people behind Canadian MPs, actual political power.

I don't understand what it is that you don't understand. The fact that Person X thinks that no good will come of Thing Y should increase the likelihood that X wants Y banned. It's Person Z, who thinks some good might indeed come of Y, who has reasons to not want Y banned.

This is literally true given the way you phrased it, but "no good will come" is not the same thing as "it will work". It is possible to believe that something is bad because it won't work (if you think that good will come from it working) or that something is bad because it will work (if you think that it working implies bad things).