This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.
Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.
We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:
- Shaming.
- Attempting to 'build consensus' or enforce ideological conformity.
- Making sweeping generalizations to vilify a group you dislike.
- Recruiting for a cause.
- Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.
In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:
- Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.
- Be as precise and charitable as you can. Don't paraphrase unflatteringly.
- Don't imply that someone said something they did not say, even if you think it follows from what they said.
- Write like everyone is reading and you want them to be included in the discussion.
On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.
A chess AI that plays by making random moves has an Elo of 478 and will occasionally beat a novice, who usually has an Elo around 800. A die is not AGI.
We have had AI that can beat you in chess 20 out of 20 times since the 90s. Not only did this AI not become AGI, but it is also now very much recognised as a dead end of development even for chess AI.
Pokemon is such an easy game that it can conceivably be beaten with entirely random inputs, and provably beaten by very-close-to random inputs. It's the ideal case for a video game that a primitive general intelligence would be good at. It does not require reactions or timing, it has very limited controls and interactions, and being incredibly slow and persistent gradually makes the only challenge easier as you inevitably outlevel everything from blundering around in the tall grass for too long. Twitch Plays Pokemon was essentially built on this premise.
From OP's description it's not actually clear that Claude plays Pokemon at a level that's much above buttonmashing, and there are strategies that are both superior to buttonmashing and also not intelligent (to name one, a biased buttonmash).
Twitch Plays Pokemon only won because of a couple of rule changes that put the lie to this claim.
Without #1 they would have "lost" over and over again. Without #2 several lategame dungeons (one of which, Victory Road, is required) are nearly impossible due to rock-pushing and ledge-jumping puzzles where a decent chunk of progress gets wiped out by an errant input. I'm not even sure that you'd win eventually with 100% probability with random inputs (which is ordinarily the case), because I'm not 100% sure whether it's possible to corrupt the game permanently with some of the crazier glitches in RBY (I know it's possible to do arbitrary code execution, just not whether the hardware actually has the capability to ruin the cartridge permanently).
I didn't claim that Twitch Plays Pokemon actually beat the game with random inputs, just "very-close-to" random.
I think Claude is also barred from #1 because I don't think the software interface ever gave it the power to hold down A, B, Start and Select simultaneously (assuming that's the combination you mean). The description of how it works only refers to button sequences, not holding multiple buttons at once. As for #2, performance so far suggests that Claude will struggle at this in pretty much the way you'd expect the actually random input method to also struggle, because it's one of the few segments of the game that can't be cleared with button mashing.
Impressively, I don't think the original run of Twitch Plays Pokemon ran into any notable glitches, despite the game being so buggy that many of them are plausible occurrences during casual playthroughs.
I believe that was a joke.
At any rate, even sticking to chess, I used an Elo calculator and the dumb chess AI would win 13.55% of games. I still think it would be rather impressive if a dog made valid moves, even if at random.
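For anyone who wants to check the 13.55% figure, here is a minimal sketch using the standard Elo expected-score formula with the 478 and 800 ratings quoted upthread. The function and variable names are my own, not from any particular calculator. Strictly speaking the formula gives an expected score, which counts a draw as half a win, so reading it as a pure win rate is an approximation.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for player A against player B.

    A draw counts as half a point, so this is an expected score,
    not strictly a win probability.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Ratings quoted in the thread (taken as given, not verified here).
random_mover = 478
novice = 800

score = expected_score(random_mover, novice)
print(f"Expected score for the random mover: {score:.2%}")
```

The 322-point gap works out to roughly one point in seven for the random mover, which matches the calculator result.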
When the first chess bots came out, public opinion was far from acknowledging even the possibility that they might become superhuman at chess. Today, we're at the point where even grandmasters are utterly crushed by Stockfish.
If you have a few million or billion years to wait around I suppose.
As far as I'm aware, the spectators interacting with the stream were using strategies and had an idea of how to win at the game. There were plenty of trolls or awful players, but it wasn't random, or at least not very random.
A dog (or a human) playing a digital chess game generally won't have the option to play an illegal move and lose by doing so. The Pokemon-playing LLM has already been given a similar advantage by not getting disqualified for invalid moves like "remove batteries" or even "throw the gameboy across the room", because such commands will just get ignored by the software interface.
I do not expect that Claude, under any circumstances, would express a desire to remove batteries or throw the GameBoy away in a fit of rage. It has the ability to represent said desire in text, if nothing else.