This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.
Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.
We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:
- Shaming.
- Attempting to 'build consensus' or enforce ideological conformity.
- Making sweeping generalizations to vilify a group you dislike.
- Recruiting for a cause.
- Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.
In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:
- Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.
- Be as precise and charitable as you can. Don't paraphrase unflatteringly.
- Don't imply that someone said something they did not say, even if you think it follows from what they said.
- Write like everyone is reading and you want them to be included in the discussion.
On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.
Is there anyone here who 1) thinks that AI x-risk is a threat that should be taken seriously, and 2) also thinks that this letter is a bad idea? If so, can you explain your reasoning? And also explain what restrictions on AI development you would support?
For a group of people who are allegedly very concerned with the possibility that AI will soon wipe out humanity, Rationalists are suspiciously resistant to any proposals for actually slowing and regulating AI development. A lot of the comments on this letter on LW and /r/ssc are very critical. If your stance is "I wish we could slow AI development and I support the letter in spirit, but I think it's unlikely to work", then that's one thing. But the critical comments seem to suggest that the comment authors either don't support any AI regulation at all, or else they're engaged in motivated reasoning to try to convince themselves that it's not even worth trying (e.g. "this letter will have a net negative impact due to its effect on capabilities researchers who don't like it" - the good ol' "if you fight your enemies, they win" tactic for concern trolling).
It lends support to my intuition that most AI x-riskers don't actually take the idea of x-risk very seriously, and on a gut level they think the benefits of AI are so likely to outweigh the downsides that there's no issue with pushing full steam ahead on capabilities research.
(Obviously if you're a full on utopian optimist and you consciously affirm that x-risk is not a serious threat, then there is no contradiction in your position and none of this applies to you.)
I'll raise my hand, here. I don't think current transformers are smart enough or self-guided enough to be an immediate concern and hard takeoff/FOOM always seemed handwavey in more than a virtue of silence sorta way, but I think there are a lot of concerns for how tools like this class could be incredibly destructive or directly or indirectly create true x-risk level concerns.
((And I do mean that in the technical 'humanity becoming extinct' sense, if not necessarily in the 'tile universe with paperclips' sense, rather than the 'oh, global warming would kill 1% of the world's population!' another-term-for-bad.))
At the trivial level and at the risk of linking to the Sequences, I'm gonna link to the Sequences. The precise examples Scott picked aren't great, but "multipolar traps are hard to solve" is an old revelation even by the Sequences' warmed-over popsci standards. And that's only doubled down when many of the signatories here are governments, i.e. the people who'd be most strongly incentivized and able to defect and obscure that defection.
This actually is also one of the bigger defenses rationalists have against Daseinstrudies' or jonst0kes' concerns about centralization/digital tyranny. You don't actually end up in a monopolar world just because there's some very strong multinational political consensus; you just end up in a world where there's you and the various political factions of the new world government. But that's not very soothing from a "don't blow up the world" perspective.
((At the very trivial level, I'd also point out that we don't publicly know the complexity of GPT4, so it's not clear anyone involved even knows what they're asking about. And while we can kinda make reasonable estimates for parameter count, I don't think parameter count is the sole useful distinction for ML 'intelligence'.))
At the deeper one, it's asking for a lot of things we don't actually know how to do in any meaningful way, and that may not be possible. "[P]rovenance and watermarking systems" are the closest thing to actively tried, and then found hard; we can't get them to work consistently even for the limited class of non-ML image works fighting a trivial number of porn consumers who aren't that technically adept and lack deep insight into methods. "[L]iability for AI-caused harm" just becomes tragicomic given how bad dropshipping has gotten. "[R]obust public funding for technical AI safety research" is the most plausible, but only in the sense that we're demonstrably willing to spend a lot of money on actual frauds. The rest of the list isn't even coherent enough to consider as failures.
I dunno. I'm willing to accept some pretty aggressive stuff if people are a) actually recognizing the level of impact they're proposing and b) willing to actually persuade people with compelling arguments. Probably not at "start WWIII over Taiwan", but I could understand hardware sales limits if we'd evidence of something like ML protein synth weapons.
If we're talking about putting an inspection port onto every GPU-equipped computer on the planet lest someone make a mean meme but not by hand, I'm a lot less impressed.
Is this in reference to AI art vs. the garden-variety online art theft done by t-shirt stores?
I'm also curious about this, I feel like I should know what you mean, but I can't parse it entirely:
This is also about AI art, right?
The argument applies there too, but I think the stronger case is matters like "Amazon sale of badly-designed equipment results in fire": the stakes are higher, there's a clear line of ownership and action and impact, and it's completely separate from any of the novel questions produced by ML or AI or online tech or even commercial speech. And yet liability for Amazon itself is inconsistent (compare success to failure); liability to the original sellers is difficult and seldom valuable; and liability to overseas original designers is nearly impossible.
And dropshipping and its problems are universal in online sales these days. I bring up Amazon simply because it's the overt and obvious case, just as fires are the severe version. But fake products that are effectively outside of useful copyright protection or lemon laws, less dangerous or falsely marketed products, and so on, are endemic.
It applies there, too: StableDiffusion's watermark process was a single line to comment-out, but Midjourney's watermarking has probably been defeated, and some non-ML projects have started encoding ML watermarks as a misguided anti-theft concept. There have been some efforts to try and 'watermark' GPT-generated text, or to produce some tool that can coherently predict if an image was generated (sometimes trying to ID model or prompt), and they don't work either. But there's a stronger argument where there's a far smaller userbase, the stakes are even lower, and the impact is trivial, and they're still losing the Red Queen's race.
One of the many many problems for online art or 'art' vendors is that it's trivial to sign up, scrape a site, and then repost that full scrape (and, for less ethical users, chargeback). There's a lot of broad communities that do nothing but that, or share already-copied content. Because most of the communities are public, if you know the sites for a particular interest fandom, you can pretty easily find a place where your work may be reposted. Some vendors just try to takedown notice those sites (when in jurisdictions that respond to takedown notices; see the first problem), but as an alternative some vendors have posted specific watermarks customized to individual customer accounts; if the vendor catches your account ID in an upload, they can now act against the individual actors (usually just by banning them).
PrimeLeap is an example of this technology; I don't know if it's the biggest or best-known. It's also been successfully defeated in a variety of ways that had little impact on image quality. Now, that's a low-stress environment on both sides of the aisle: just as the reposters are seldom the most technically adept, PrimeLeap doesn't exactly have a huge team of cryptography PhDs. But it still seems like a useful metaphor.
Indeed, if it was, we'd have far less to worry about. We're hitting such huge parameter counts that it's hard to imagine an AGI quickly becoming more dangerous just by using a hundred times more parameters; the risk is that a slightly-superhuman AGI will find a much more efficient way to use the same parameters a hundred times more effectively and thereby suddenly discard the "slightly-".
What would be accomplished during a "six-month-pause" that would make it worth the enormous difficulty of getting that sort of international cooperation, even if the petition had any chance of success at all? Why should people concerned about unaligned AI consider this the best thing to spend their credibility and effort on? It's not like "alignment research" is some separate thing with a clear path forward, where if only we pause the AI training runs we'll have the time for a supercomputer to finish computing the Alignment Solution. Alignment researchers are stumbling around in the dark trying to think of ideas that will eventually help the AI developers when they hit superintelligence. It is far more important to make sure that the first people to create a superintelligence consider "the superintelligence exterminates humanity" a real threat and try to guide their work accordingly, which if anything this interferes with by weakening the alignment-concerned faction within AI research. (The petition also talks about irrelevant and controversial nonsense like misinformation and automation; the last thing we want is alignment to be bureaucratized into a checklist of requirements for primitive AI while sidelining the real concern, or politicized into a euphemism for left-wing censorship.) Right now the leading AI research organization is run by people who started off trying to help AI alignment; that seems a lot better than the alternative! To quote Microsoft's "Sparks of Artificial General Intelligence: Early experiments with GPT-4" paper:
Here is the baseline: if the first people to create superintelligence aren't concerned with alignment, there's a decent chance they will deliberately give it "agency and intrinsic motivations". (Not that I'm saying the Microsoft researchers necessarily would, maybe they only said that because LLMs are so far from superintelligence, but it isn't a promising sign.) Personally I'm inclined to believe that there's no reason a superintelligent AI needs to have goals, which would make "create a Tool AI and then ask it to suggest solutions to alignment" the most promising alignment method. But even if you think otherwise, surely the difference between having superintelligence developed by researchers who take alignment seriously and researchers who think "let's try giving the prospective superintelligence intrinsic motivations and write a paper about what happens!" matters a lot more than whatever "alignment researchers" are going to come up with in 6 months.
I have a position somewhat similar to what I believe was previously expressed by @DaseindustriesLtd in that I think that the X-risk is real, but there are outcomes that are barely better than total human eradication, with "eternal tyranny by US schoolmarm 'aligned' machine god" being one of them. Letting development go ahead full steam now, as I see it, slightly increases the "existential catastrophe" probability, massively decreases the "in Soviet America, AGI aligns you" scenario probability, and massively increases the "bloody global drone wars that result in broken supply chains and a sufficiently reshuffled gameboard that we at least get to reroll the dice on the nature of our future machine gods" probability. The last event, unpleasant as it is, is less unpleasant than the middle one, so my personal choice of doomerism is to bet on that one.
Here's someone's reasoning from LessWrong that was interesting:
Although I'd guess that LLMs are likely to be a major component of the first AGI, I'd agree that neither "predict the next token from this short context window" nor "oh, but do it in a way that our RLHF reward model likes" is going to get there by itself. Six months of "well, we have to use the LLM we have now; should we twiddle our thumbs, or depress ourselves doing safety work, or just play around seeing what we can plug it into?" might not drive our stymied LLM-training researchers to options A or B.
Shooting yourself in the foot is a great way to ensure that someone else finishes first.
Someone who doesn’t share the signatories’ caution.
I think there's a possibility that we're not yet to "hardware overhang", that alignment research will only progress if it has high-capabilities test models to work on, and that we're going to reach "hardware overhang" in the next few decades. If all three premises are true, then going full-speed-ahead to AGI now, when the closest it could get to "going foom" would involve directing the construction of new chip fabs from scratch, would be the only chance of success at alignment research. Slowing down now would just mean that, once someone eventually breaks the pact and hits AGI, it will already be superhuman and inhumanly dangerous.
I think "we're not yet to hardware overhang" is the weakest of those premises, though. I'm not even sure we could have avoided reaching hardware overhang via the techniques we're using; if you've got enough compute to train a model via massive brute force, you've necessarily got more than enough compute to run the model at superhuman speeds. The limiting factor right now is that we don't know how to make a general model superhuman; we can go into a "self-play" phase when training a Chess or Go AI and watch it take off, but there's no "self-play" for reality.
You're probably right. The technique of using the model's own outputs, and its assessment of those outputs, to further tune itself probably caps out at a fairly small amount of gain, if you don't ground the model in further interactions with the physical world.
Oh, yeah; I'd expect that sort of "self-play" to get peak model performance from "average human responses" to "best-human responses recognizable by an average human". And the economic effects of getting there, even if it caps out there, might be astounding. But frankly if we just end up having to retool 90% of the economy I'd be glad to see it, when the alternatives include extinction-level scenarios.
I think the most "real" thing I can imagine high-speed self-play for is math (theorem proving). Adding an inference step to a computer-verifiable theorem seems like it's as basic an output as choosing a move in a board game, and coming up with "interesting theorems" seems like it could be done automatically (some function combining "how short is it to state" with "how long is the shortest proof"?), and yet an AI training to play that "game" might eventually come up with something more useful than a high ELO.
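To make that scoring idea a bit more concrete, here's a toy sketch of the kind of function I have in mind. Everything in it is made up for illustration: the name `interestingness`, using character count as "how short is it to state", and using a step count as "how long is the shortest proof" are all my assumptions, and a real system would need an actual prover to supply the proof length.

```python
def interestingness(statement: str, shortest_proof_steps: int) -> float:
    """Toy score (hypothetical): a theorem is 'interesting' when it is
    short to state but its shortest known proof is long.

    statement            -- the theorem as text; its length stands in for
                            "how short is it to state" (an assumption)
    shortest_proof_steps -- number of inference steps in the shortest
                            proof found so far (assumed to come from a prover)
    """
    if not statement:
        raise ValueError("statement must be non-empty")
    if shortest_proof_steps < 0:
        raise ValueError("proof length cannot be negative")
    # Ratio of proof length to statement length: higher means
    # "more proof effort per character of statement".
    return shortest_proof_steps / len(statement)


# A trivial identity scores low; a short statement with a long proof scores high.
trivial = interestingness("x=x", 1)
hard = interestingness("a^n+b^n!=c^n", 1000)
print(trivial < hard)
```

A plain ratio is the simplest way to combine the two quantities; whether it picks out theorems humans would actually call interesting is exactly the open question.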
Pretty sure self-play would be viable for leetcode-style or project-euler-style programming problems too, if you give it access to an interpreter. Or just any task where recognizing a good output can be done by a less capable language model than it takes to generate an output that good.
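As a rough sketch of the "give it an interpreter" half of this: the checker below runs candidate solutions against known test cases and hands back a pass/fail reward. The names `passes_tests` and `reward` are hypothetical, the candidates are hard-coded strings standing in for model outputs, and plain `exec` is not a sandbox, so nobody should run untrusted model output this way for real.

```python
from typing import Any, List, Sequence, Tuple

# Each case is (positional arguments, expected return value).
Case = Tuple[Tuple[Any, ...], Any]


def passes_tests(candidate_source: str, func_name: str, cases: Sequence[Case]) -> bool:
    """Run candidate code in the interpreter and check it against the cases.

    Any exception (syntax error, missing function, runtime error) counts
    as a failure. NOTE: exec() here is unsandboxed; illustration only.
    """
    namespace: dict = {}
    try:
        exec(candidate_source, namespace)
        func = namespace[func_name]
        return all(func(*args) == expected for args, expected in cases)
    except Exception:
        return False


def reward(candidates: Sequence[str], func_name: str, cases: Sequence[Case]) -> List[float]:
    """Binary reward per candidate: 1.0 if every case passes, else 0.0."""
    return [1.0 if passes_tests(src, func_name, cases) else 0.0 for src in candidates]


# Stand-ins for model-generated solutions to "write add(a, b)".
good = "def add(a, b):\n    return a + b\n"
wrong = "def add(a, b):\n    return a - b\n"
broken = "def add(:"

cases = [((1, 2), 3), ((0, 0), 0)]
print(reward([good, wrong, broken], "add", cases))
```

The point is just that the verifier is much simpler than the generator; it says nothing about where a steady supply of fresh problems would come from.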
Recognizing good output is half the problem; generating an enormous array of problems is important too. With complex board games every non-deterministic opponent (including every iteration of an AI during training) is a fount of problems; with math the new problems practically generate themselves as conjectures based on previous problems' intermediate definitions. I don't see how to come up with millions of independent programming problems automatically. That might just be a failure of my imagination (or more charitably, I just haven't thought about the idea for long enough), though.
So... would you recommend we just stop AI development altogether, or do you disagree with the pause?
"I see my car is heading towards a cliff, once I start to fall, either I die an inevitable relatively painless death, or I survive. The first option is not worth thinking about, therefore I should plan for the second..."
You're right that AI safety rules won't stop a 10000 IQ machine from killing us if it wants to... But the whole point of AI safety is making it not want to kill us, just like you shouldn't drive over cliffs.
It is not obvious that it wants to kill us. Assuming it does, it is even less obvious that "making it not want to kill us" is something that can actually be done. I am pretty sure that the philosophy undergirding "coherent extrapolated volition" is dead wrong, and that attempts to "align" a superintelligent AI will converge inevitably on horror. The entire worldview that spawned this memeplex has been invalidated by subsequent real-world developments, the culture war being foremost among them.
This is where I'm at, too. I honestly think it was too late by the time the first machine that could be called a calculator was created; the dominoes were already falling at that point, and only something almost as bad, like nuclear holocaust, was going to be enough of a barrier - and likely even that would have only delayed it. Given that, we're not in control of our own destiny with respect to whether or not AI will destroy us all, and if so, our best option is to party until the lights go out. And the faster the development of AI is and more easily accessible it is to everyone, the better that party is. And if the lights don't go out, then faster development of AI and greater accessibility of AI is also good for future generations. So either way, I want AI development to go forward with few impediments.
Based on your comment further downthread about the unlikelihood of paperclip maximizers, I don’t think that you think that there’s a very serious risk of an AI extinction event. So it’s consistent for you to take a more cavalier attitude towards capabilities research.
1. Yeah, but not in the way Rats tend to think about it. Their way of thinking tends to be very alien to me.
2. Primarily I am a Butlerian Jihadist, and would like to get rid of the whole thing. Secondarily I lean towards DaseIndustrialism (your question might stem from being unfamiliar with his worldview); the regulations I would support would be a ban on closed source / closed data AI, and subsidies towards open source initiatives.
I've never understood this view - do you really think the Internet and all the technology we've created post 1990 is net-negative?
If so, why? Or is it just based on a concern for AGI?
I'm ok with the Internet, but things started going downhill with the introduction of convenient "one stop shopping" platforms operating through opaque recommendation algorithms.
Also, seeing the impact of the internet on generations that haven't seen a world without it, would make me utterly unsurprised at it turning out to be a net negative.
I don't know if I even believe in AGI, my issue is with current AI technology. It has the potential to dumb us down, and give the establishment the tools to shape our ideas and discourse like nothing before it.
Hopefully we’re just in a local minimum, but I agree with you in terms of monopolies in tech. I’m optimistic that LLM progress and medium term space travel will alleviate some of the monopolistic tendencies in our current economic system.
I don't think "we have planetary monopolies on media, but we colonized several planets" helps. Everybody torrenting the ever loving fuck out of media (or using IRC instead of WhatsApp) looks like something that might plausibly alleviate the technology's negative social consequences.
With LLMs I'm more pessimistic. The issues might be fundamental to the technology itself, I don't see how it's a good idea to outsource your reasoning skills to a machine. Imagine having an exoskeleton so small you could wear it as a second skin, which gives you superhuman strength. Sounds great, but your actual body is going to atrophy.
But the only thing worse than having a machine that atrophies your brain, is having a machine that atrophies your brain and is under the control of hostile actors, so maximum proliferation is the next best thing after a total ban.