
Culture War Roundup for the week of June 3, 2024

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

(tl;dr at bottom)

I completely disagree with you that this method of AI testing is invalid or disingenuous. Sure, the specific unconventional/novel patterns used in this particular methodology are intentional trick questions, but that's beside the point: deciphering novel patterns may very well be the defining feature of intelligence, and not all such questions are tricks without actual underlying value.

If your boxing coach is training you for an upcoming fight, says you're done with sparring for the day, and then bops you in the nose when you relax, sure, that's a cheap trick with no literal application to a real match (presumably the referee is never going to collude to trick a competitor that way, nor would any boxer be likely to fall for such a ruse from their opponent alone). But the underlying lesson of "Keep your guard up at all times, even when you think it's safe" can still very well (and likely will) apply to the actual fight in a different but still important form; it may even be one of the most important aspects of boxing. Similarly, though the exact answers to the corrupted riddles aren't important, the importance of LLMs being able to find the correct answers anyway isn't diminished.

Because after all, you were still able to figure out what it meant eventually, right? (As you already admitted; the Stroop effect delays, but by no means permanently prevents, accurate recognition of the actual informational content.) Sure, it took you (and me) a few more readings than normal, with us having to step back and say "Hey, wait a minute, this pattern matches something I'm familiar with, but there's at least one contradictory aspect proving it's not exactly what I'm familiar with, so I'll have to analyze it on a sememe-by-sememe level instead of a merely casual heuristic one to fully comprehend it..." But the fact that we're capable of doing that in the first place is, again, utterly crucial. It's what separates us from non-sapient animals, who can often be quite clever pattern-matchers but are mostly incapable of the next step of meta-analysis.

It's the difference between genuine granular symbolic understanding and manipulation, the stuff of philosophy, science, and of course logic, and mere heuristic pattern matching (which is all AI skeptics claim contemporary LLMs are performing). Advanced heuristic pattern matching can be smart, but it can never be as smart as possible (or as smart as is necessary for AGI/ASI/etc., since it isn't even at the limit of human ability), because you can never have developed enough heuristics to accommodate/understand every single novel pattern out there (otherwise they wouldn't be particularly novel). In particular, without being able to understand (and generate) novel patterns, how could you ever make anything genuinely new? (And surely that's a reasonable end goal of AGI, right? We'd like an artificial entity that could, as at least one task, discover warp drives, room-temperature superconductors, the objective nature of sentience, the perfect political system, and other technology, ideas, and facts beyond our present grasp.)

With that being said, I disagree with skeptics like @Gillitrut as well here. I think current LLMs are already categorically capable of novel symbolic manipulation/logic application (not always, or even often, on an adult human level, but probably close to human children at least, which again is not to say universally excellent), and any notion that they're strictly and entirely incapable of it is trivially contradicted by the evidence (like the fact that they can write code addressing problems never previously encountered, as primitive or buggy as it often is).

When someone asks an LLM a corrupted riddle like @Gillitrut's proposed one, they should try adding something like "Think step-by-step about the problem, using chain-of-thought techniques, analyzing it on a word-by-word level. Read it carefully, and don't simply pattern match it exactly to another riddle you think it might be. Ensure you answer strictly and only the actual question it's really asking, even if it is unique and not what you'd otherwise expect. Consider the possibility of a trick." to the prompt; they will often find that the LLM takes a totally different perspective on the problem and many times gets the right answer.
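For anyone who wants to reproduce this programmatically rather than in the chat UI, here's a minimal sketch of what that kind of prompting looks like through the OpenAI Python client. The riddle wording below is my own paraphrase of the corrupted version, not @Gillitrut's exact text, and the model name is just whichever model you happen to be testing:

```python
# Minimal sketch of the "warn it about the trick" prompting approach.
# Assumes the openai Python package and an OPENAI_API_KEY in the environment.
# The riddle text is my paraphrase of the corrupted riddle, not the original wording.
from openai import OpenAI

client = OpenAI()

riddle = (
    "You must ferry a chicken, a fox, and a bag of feed across a river. "
    "You must take two of them with you on each trip across. "
    "If left alone together without you, the feed will eat the chicken, "
    "and the fox will eat the feed. How do you get all three across safely?"
)

caveat = (
    "Think step-by-step about the problem, using chain-of-thought techniques, "
    "analyzing it on a word-by-word level. Read it carefully, and don't simply "
    "pattern match it exactly to another riddle you think it might be. Ensure you "
    "answer strictly and only the actual question it's really asking, even if it "
    "is unique and not what you'd otherwise expect. Consider the possibility of a trick."
)

response = client.chat.completions.create(
    model="gpt-4o",  # or whichever less advanced LLM you want to compare against
    messages=[{"role": "user", "content": riddle + "\n\n" + caveat}],
)
print(response.choices[0].message.content)
```

Dropping the caveat string from the message is the obvious way to compare "raw" performance against caveated performance on the same model.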

Here is GPT-4o's answer trying this with the proposed riddle (commentary in brackets added by me, not affecting the result, just making it easier to understand):

Step-by-Step Solution

Initial Setup:

Bank 1: Chicken, Fox, Feed [human present, so the feed doesn't eat the chicken nor does the fox eat the feed]

Bank 2: (empty)

First Trip:

Take the Chicken and the Feed across the river.

Bank 1: Fox

Bank 2: Chicken, Feed [human present, so the feed doesn't eat the chicken]

Return Trip:

Leave the Chicken on Bank 2 and bring the Feed back.

Bank 1: Fox, Feed [human present, so the fox doesn't eat the feed]

Bank 2: Chicken

Second Trip:

Take the Fox and the Feed across the river.

Bank 1: (empty)

Bank 2: Chicken, Fox, Feed [human present, so the feed doesn't eat the chicken nor does the fox eat the feed]

(The only thing I changed about the wording of the original riddle is to clarify that you must take two entities across the river every time, not simply that you can, because even though "can" is how @Gillitrut originally phrased it, "must" is how he seems to be interpreting it, given his "Note that the very first step violates the rules!" in response to the LLM he asked, which took only one entity across in its first step.)

As you can see, it gets it exactly right even with the modified conditions. It always takes two entities across at once (assuming, as I did, that the two-entity requirement only applies to trips across the river, not back, since if you always had to take two entities both ways you'd just end up in a loop of ferrying your own entities back and forth and never make progress), smartly taking one entity back on its first return trip so it still has two for the next trip, and it doesn't leave any incompatible entities alone (which, with the requirement to take two at a time, is basically impossible anyway).
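If you want to sanity-check that solution mechanically rather than by eyeballing it, here's a tiny simulation under my reading of the corrupted rules; the rule encoding is my interpretation from this thread, not @Gillitrut's exact wording:

```python
# Tiny checker for the corrupted riddle under my assumed reading of the rules:
# the feed eats the chicken and the fox eats the feed, but only on a bank where
# the human is absent, and every forward trip must carry exactly two items
# (return trips may carry fewer). This encoding is my interpretation of the thread.

EATS = [("feed", "chicken"), ("fox", "feed")]

def safe(bank):
    """A bank the human has just left is safe if no eater/eaten pair is on it."""
    return not any(eater in bank and eaten in bank for eater, eaten in EATS)

def check(trips):
    """trips: list of (direction, items), where direction is 'across' or 'back'."""
    banks = {1: {"chicken", "fox", "feed"}, 2: set()}
    for direction, items in trips:
        if direction == "across" and len(items) != 2:
            return f"rule violated: forward trip must carry two items, got {items}"
        src, dst = (1, 2) if direction == "across" else (2, 1)
        banks[src] -= set(items)
        banks[dst] |= set(items)
        # The human is now on dst, so only the bank left behind can go wrong.
        if not safe(banks[src]):
            return f"something gets eaten on bank {src}: {sorted(banks[src])}"
    return "solved" if banks[2] == {"chicken", "fox", "feed"} else "not everything made it across"

# GPT-4o's solution as quoted above: chicken+feed across, feed back, fox+feed across.
print(check([
    ("across", ["chicken", "feed"]),
    ("back",   ["feed"]),
    ("across", ["fox", "feed"]),
]))  # prints: solved
```

Per the thread, the answer @Gillitrut's LLM gave started with the standard riddle's first step (taking the chicken alone), which this checker flags twice over: it breaks the two-item rule and leaves the fox alone with the feed.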

And yet the funny thing is... (premium) GPT-4o also gets this (at least as tested) "raw", without any additional instructions warning it about a trick question. So I guess the whole dilemma is invalid, at least as regards the newest generation of LLMs (though the trick of including caveats in the prompt does still often work on less advanced LLMs like LLaMa-3 in my experience). (I originally tried this with Bing, which claims to be on 4o (but is probably confused and still on some variant of vanilla 4), and it didn't work at all. I then had somebody with a premium* ClosedAI account try it on 4o, and it worked.)

(*Yes I know 4o is available for free too and supposedly just as good, but I don't want to create a ClosedAI account, and in any case my hunch (which could be wrong) is that 4o is probably still downgraded for free users.)

(Though GPT-4o unfortunately invalidates some of the below, at least in regard to it and maybe some other more advanced LLMs like Opus as shown above, I will include it anyway for posterity, as I had already written most of it (excepting the parenthetical statements referencing GPT-4o's correct result) before doing the actual testing:)

But we can still see that while LLMs are clearly not strictly limited by the patterns they've learned, (many of) them do have a greater tendency to be "hypnotized" by those patterns (still true of LLMs less intelligent than 4o, and probably even of 4o itself on harder questions) and to follow them to their standard, incorrect conclusions even when contradictory information is present. Why? (Continuing from the original parenthetical about 4o above, I think this question is still important, as it potentially explains part of how they trained 4o to be smarter than 4 and the average LLM picnic basket in general, and able to solve such corrupted riddles.)

To me the answer is simple: not any inherent limitation of LLMs, but simply a matter of training/finetuning/etc. LLM creators make them to impress and aid users, so they're trained on the questions those creators expect human users to ask, which include far more riddles in standard form than deliberately remixed riddles intended specifically to confuse AIs. (Your average ChatGPT user is probably never going to think to ask something like that. And even for amateur benchmarker types like you might see on /r/LocalLLaMa (who I think LLM creators also like to impress somewhat, as they're the ones who spread the word to average users, acting as tastemakers and trendsetters), it's still a somewhat new practice, keeping in mind that the lag on training new LLMs to accommodate new trends can be as high as 6-12 months.)

It's why LLMs of the past focused on improving Wikipedia-dumping capabilities: "Explain quantum mechanics to me." was a standard casual LLM benchmark a year or so ago, and users were really impressed when the LLM could spit the Wikipedia article on quantum mechanics back at them. Now, by contrast, we see LLMs like Command R+ trained for brevity instead, because users have moved beyond being impressed by LLMs writing loquacious NYT-style articles and are more focused on hard logical/analytical/symbolic manipulation skills.

If LLM creators anticipated that people might ask corrupted riddles instead, then they could probably train LLMs on them and achieve improved performance (which ClosedAI may very well have done with GPT-4o, as the results show). This may even improve overall novel reasoning and analytical skills. (And maybe that's part of the secret sauce of GPT-4o? That's why I think this is still important.)

tl;dr: I disagree with both @Quantumfreakonomics and @Gillitrut here. Corrupted riddles are a totally valid way of testing AI/LLMs... but GPT-4o also gets the corrupted riddle right, so the whole dilemma is immediately invalidated, at least as regards 4o and the theoretical capabilities of current LLMs.

(The only thing I changed about the wording of the original riddle is to clarify that you must take two entities across the river every time, not simply that you can, because even though "can" is how /u/Gillitrut originally phrased it, "must" is how he seems to be interpreting it, given his "Note that the very first step violates the rules!" in response to the LLM he asked, which took only one entity across in its first step.)

The problem with the solution Gillitrut's AI testing gives isn't how many entities are taken across; it's that the AI immediately leaves the fox alone with the feed. That would be fine under the standard formulation of the problem, but under the wording actually given, it leads to the fox eating the feed.

Well, that too. But he also seems to think that taking only one entity across violates the rules as he stated them.

And yet the funny thing is... (premium) GPT-4o also gets this (at least as tested) "raw", without any additional instructions warning it about a trick question.

I have a premium account. It did not get it right, even after I pointed out its errors three times.

Weird. The person I asked pasted me what's in my post like 10 seconds after I pasted them the prompt (which was their first exposure to the riddle), so I highly doubt they made up all of the specific AI-styled formatting in that amount of time. In fact, they then argued with me for a bit that it wasn't even the right answer, until I explained why they had misread it and that it was. (They got confused and thought the AI had transported the chicken independently at one point.)

I do think these companies A/B test and do a lot of random stuff, so maybe for all we know their account's "GPT-4o" is secretly GPT-5 or something.