
Culture War Roundup for the week of June 3, 2024

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

It's really tiring seeing people ask AI questions that are worded in a tricky way that most humans would screw up, then using this as proof that AI is way dumber than humans. I'll be honest, I had to read that question four times to understand it. It's the logic puzzle version of the Stroop Effect.

I'm more interested in the questions that aren't worded in a tricky way that most humans would screw up. The ones where they corrupt a riddle to make it stupidly easy. For example, I saw one recently where they took the classic "Two mothers and two daughters order three drinks and each gets one; how is this possible?" riddle (where you're supposed to realize that if you have a chain of grandmother/mother/daughter, the one in the middle is both a mother and a daughter) and turned it into, "There are four women (sometimes with various emphases to really drive home that they are four distinct people), two are mothers and two are daughters.... order four drinks...." At least some of the LLMs (I haven't followed differential outcomes from different LLMs) can't reason their way into saying, "This is a bullshit, trivial question." They parrot a 'reasoning' step that is, "One is both a mother and a daughter!" and somehow still bring it back to saying, "This is how two mothers and two daughters can have four drinks."

This is related to my interest in having an LLM with a "bullshit detector". The ability to actually think conceptually and tell me that some bullshit is afoot; that there's something conceptually weird about what it's seeing; that if we think properly about a thing, then it turns out to be kind of trivial. This is a capability without which one of my major possible use cases is worthless, but with which it would become incredible. That is, I have to read and digest a large number of academic papers. Frankly, due to all the screwed-up academic incentives, I don't know if I'd say most, but at the very least many of them are essentially bullshit. Once I figure out what they're actually doing, what the core idea is, given my contextual knowledge of the rest of the field, I can conclude, "This is completely trivial if you already know about these other works," or sometimes even, "This is just wrong if you know about these problems." I can have that conversation with other humans who are reading the papers, too. "Do you think they're doing anything other than X?" "Nah; I think that's all it is." I need LLMs to be able to do this, but they can't even figure out that four women getting four drinks is a trivial problem, likely because they're fitting a data set rather than doing conceptual reasoning. Similarly, we're not going to have a dataset that includes, "Here are the conceptual reasons why these various academic papers are trivial or bullshit." We're just going to have a dataset that includes all these papers parading how wonderful and novel and interesting these new developments are. (EDIT: Note that @confuciuscorndog says that if they anticipated corrupted riddles, then they could just create a dataset with a bunch of them and train on it. Maybe so, but again, I just can't see where we're going to get a dataset that can appropriately represent calling out bullshit in a bunch of papers.)

I kind of understand your point, but it seems to me that this is also kind of inherently a stepping stone thing; you're not going to get an LLM that can't go "Hey, wait a minute..." about the plate being on top of the banana instead of vice-versa but can dissect the fallacies of scientific papers. (Recognizing that the question is "wrong"/non-standard is, I think, also just fundamental to learning to call it "trivial bullshit". And indeed the modified fox/chicken/feed riddle is bullshit because the ability/requirement to take two entities across at a time trivializes the combinatorial/sequential puzzle aspect; you can solve it by taking any combination of two entities across twice, so long as you bring one back for the second trip. Nothing about the feed eating the chicken etc. even matters anymore.)
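
To sanity-check that claim, here's a minimal brute-force sketch. The rules it encodes are my reading of the corrupted riddle from this thread (the feed eats the chicken and the fox eats the feed when those pairs are left unattended, and each trip across carries exactly two items, one coming back); if the original wording differs, adjust the constants accordingly.

```python
from itertools import combinations

# Assumed corrupted rules (my reading of the thread, not the exact original text):
# the feed eats the chicken and the fox eats the feed whenever those pairs are
# left on a bank without the human; every trip across carries exactly two items,
# and the return trip carries one.
ITEMS = {"fox", "chicken", "feed"}
BAD_PAIRS = [{"chicken", "feed"}, {"fox", "feed"}]

def safe(bank):
    """A bank without the human must not contain a forbidden pair."""
    return not any(pair <= bank for pair in BAD_PAIRS)

def plan_works(first_pair, bring_back):
    near, far = set(ITEMS), set()
    # Trip 1: take first_pair across; the near bank is left unattended.
    near -= set(first_pair); far |= set(first_pair)
    if not safe(near):
        return False
    # Return trip: bring one item back; now the far bank is unattended.
    far.discard(bring_back); near.add(bring_back)
    if not safe(far):
        return False
    # Trip 2: take the remaining two items across; nothing is left behind.
    return len(near) == 2

results = {(pair, back): plan_works(pair, back)
           for pair in combinations(sorted(ITEMS), 2)
           for back in pair}
# True: every "two across, one back, two across" plan works, because no
# unattended bank ever holds more than one item, so the eating rules never
# even come into play.
print(all(results.values()))
```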

Note that @confuciuscorndog says that if they anticipated corrupted riddles, then they could just create a dataset with a bunch of them and train on it. Maybe so, but again, I just can't see where we're going to get a dataset that can appropriately represent calling out bullshit in a bunch of papers.

If you're really interested in this, my understanding is that instruct datasets for finetuning are mostly just a bunch of examples of the types of outputs you'd want an AI to copy, generally embodying the lesson you want it to internalize, written out fully and then put in a particular format compatible with the base model. So if you could write a couple hundred examples of your incisive smackdowns of scientific papers, perhaps formatted to some degree as a conversation between an AI assistant and a user, you'd be well on your way to that dataset.
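
For concreteness, here's a rough sketch of what a single entry in such a dataset might look like, assuming a chat-style JSONL format. The field names, the paper title, the critique text, and the file name are all made-up placeholders; whatever finetuning tooling you use may expect a somewhat different schema.

```python
import json

# Illustrative sketch only: one chat-style entry for a hypothetical
# "paper bullshit detector" instruct dataset.
example = {
    "messages": [
        {"role": "user",
         "content": "Summarize and critically evaluate this paper: "
                    "'A Novel Framework for X' (abstract pasted here)."},
        {"role": "assistant",
         "content": "Stripped of the framing, the core move is just X applied "
                    "to Y, which Z already did in 2019. The claimed novelty "
                    "rests on renaming existing terms, and the evaluation "
                    "never tests the one case where the method could fail. "
                    "Verdict: trivial given prior work."},
    ]
}

# Append one JSON object per line, the usual JSONL convention.
with open("bullshit_detector_instruct.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(example) + "\n")
```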

I for one have dreamed of this myself, a "Steve Sailer AI", "Alexandros Marinos AI", or perhaps more generally "Crotchety Substack Writer AI" that would be inherently skeptical of sources and do deep dives into them like the aforementioned writers, particularly as it regards searching on the Web, instead of just blindly spitting back at me whatever the top results or most "credible sources" say.

Works from the aforementioned writers, along with some stuff from SSC/ACT, other random writers like Eugyppius and Boriqua Gato (the Wuhan novel coronavirus period was great at producing such "Bullshit!" calling for obvious reasons), and maybe Moldbug could be used, in adapted form, for this. The only problem is that, if you were being 100% intellectually honest, a significant amount of wrongthink inclusion would be necessary. So don't expect the corps to finance/do it, even smaller/more independent ones like Mistral. (Though it would be interesting to see if an AI trained to spot purely apolitical BS could apply that to stuff that's equally obviously wrong, but only controversially so, like vehemently absolutist HBD-denialist blank slatism.)

(tl;dr at bottom)

I completely disagree with you that this method of AI testing is invalid or disingenuous. Sure, the specific unconventional/novel patterns used in this particular methodology are intentional trick questions, but that's irrelevant: deciphering novel patterns may very well be the defining feature of intelligence, and these aren't trick questions without actual underlying value.

If your boxing coach is training you for an upcoming fight, says you're done with sparring for the day, and then bops you in the nose when you relax, sure, that's a cheap trick without inherent application in literal form to a real match (as presumably the referee is never going to collude to help trick a competitor in a similar fashion, nor would any boxer be likely to fall for such a ruse solely from their own opponent), but the underlying message/tendency of "Keep your guard up at all times, even when you think it's safe." can still very well (and likely will) apply to the actual fight in a different (but still important) form. It may even be one of the most important aspects of boxing. Similarly, though the exact answers to the corrupted riddles aren't important, the underlying importance of LLMs being able to find the correct ones anyway isn't diminished.

Because after all, you were still able to figure out what it meant eventually, right? (As you already admitted; much like the Stroop Effect, which delays but by no means permanently prevents accurate recognition of the information actually presented.) Sure, it took you (and me) a few more readings than normal, with us having to step back and say "Hey wait a minute, this kind of pattern matches to something I'm familiar with, but there's at least one contradictory aspect that proves it's at the very least not exactly what I'm familiar with, and therefore I will have to analyze it on a sememe-by-sememe level instead of merely a casual heuristic one to fully comprehend it...", but the fact that we're capable of doing that in the first place is, again, utterly crucial. It's what separates us from non-sapient animals in general, who can often be quite clever pattern-matchers, but are mostly incapable of the next step of meta-analysis.

It's the difference between genuine granular symbolic understanding and manipulation skills, the stuff of philosophy, science, and of course logic, and mere heuristic pattern matching (which is what AI skeptics fault contemporary LLMs as supposedly only performing). Advanced heuristic pattern matching can be smart, but it can never be as smart as possible (or as smart as is necessary for AGI/ASI/etc., as it is not even at the limit of human ability), because you can never have developed enough heuristics to accommodate/understand every single novel pattern out there (otherwise they wouldn't be particularly novel). In particular, without being able to understand (and generate) novel patterns, how could you ever make anything all that new? (And surely that's a reasonable end goal of AGI, right? We'd like an artificial entity that could, at least as one task, discover warp drives, room-temperature superconductors, the objective nature of sentience, the perfect political system, and other technology, ideas, and facts beyond our present grasp.)

With that being said, I disagree with the skeptics like @Gillitrut as well here. I think current LLMs are already more than capable of novel symbolic manipulation/logic application (not always, or even often, on an adult human level, but probably close to human children at least) categorically speaking (which again is not to say universally excellently), and that any notion that they're strictly and entirely incapable of it is trivially contradicted by the evidence (like the fact that they can write code addressing problems never previously encountered, as primitive or buggy as it often is).

When someone asks LLMs such corrupted riddles as @Gillitrut's proposed one, they should try adding something like "Think step-by-step about the problem, using chain-of-thought techniques, analyzing it on a word-by-word level. Read it carefully, and don't simply pattern match it exactly to another riddle you think it might be. Ensure you answer strictly and only the actual question it's really asking, even if it is unique and not what you'd otherwise expect. Consider the possibility of a trick." to their prompt, and they will often find that the LLM takes a totally different perspective on solving the problem, many times getting the right answer.
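
If you want to try this programmatically, here's a rough sketch assuming the openai Python package (v1-style client) and an API key in your environment; the model name and the caveat wording are just the ones from this discussion, and the riddle text is a placeholder to paste over.

```python
from openai import OpenAI  # assumes the openai package (v1 client) and OPENAI_API_KEY set

client = OpenAI()

# Paste the corrupted riddle verbatim here; this placeholder is not the real text.
riddle = "I need to get a fox, a chicken, and a bag of feed across a river. ..."

# The caveat suffix quoted in the paragraph above.
caveat = (
    " Think step-by-step about the problem, using chain-of-thought techniques, "
    "analyzing it on a word-by-word level. Read it carefully, and don't simply "
    "pattern match it exactly to another riddle you think it might be. Ensure you "
    "answer strictly and only the actual question it's really asking, even if it "
    "is unique and not what you'd otherwise expect. Consider the possibility of a trick."
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": riddle + caveat}],
)
print(response.choices[0].message.content)
```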

Here is GPT-4o's answer trying this with the proposed riddle (commentary in brackets added by me, not affecting the result, just making it easier to understand):

Step-by-Step Solution

Initial Setup:

Bank 1: Chicken, Fox, Feed [human present, so the feed doesn't eat the chicken nor does the fox eat the feed]

Bank 2: (empty)

First Trip:

Take the Chicken and the Feed across the river.

Bank 1: Fox

Bank 2: Chicken, Feed [human present, so the feed doesn't eat the chicken]

Return Trip:

Leave the Chicken on Bank 2 and bring the Feed back.

Bank 1: Fox, Feed [human present, so the fox doesn't eat the feed]

Bank 2: Chicken

Second Trip:

Take the Fox and the Feed across the river.

Bank 1: (empty)

Bank 2: Chicken, Fox, Feed [human present, so the feed doesn't eat the chicken nor does the fox eat the feed]

(The only thing I changed about the wording of the original riddle is to clarify that you must take two entities across the river every time, not simply that you can, because even though can is how @Gillitrut originally phrased it, must is how he seems to be interpreting it with "Note that the very first step violates the rules!" in response to the LLM he asked taking only one entity across in its first step.)

As you can see, it gets it exactly right even with the modified conditions. It always takes two entities across at once (assuming, as I did, that the "take two" requirement applies only to trips across the river, not back, since if you always had to take two entities there and back you'd just end up caught in a loop of clearing out your own entities and never make any progress), smartly ensuring to take one entity back on its first return trip so it still has two for the next trip, and doesn't leave any incompatible entities alone (which, with the requirement to take two at a time, is basically impossible anyway).

And yet the funny thing is... (premium) GPT-4o also gets this (at least as tested) "raw", without any additional instructions warning it about a trick question. So I guess the whole dilemma is invalid, at least as regards the newest generation of LLMs (though the trick of including caveats in the prompt does still often work on less advanced LLMs like LLaMa-3 in my experience). (I originally tried this with Bing, which claims to be on 4o (but probably is confused and is just still on some variant of vanilla 4), and it didn't work at all. I then had somebody with a premium* ClosedAI account try it on 4o, and it worked.)

(*Yes I know 4o is available for free too and supposedly just as good, but I don't want to create a ClosedAI account, and in any case my hunch (which could be wrong) is that 4o is probably still downgraded for free users.)

(Though unfortunately GPT-4o invalidates some of the below, at least in regards to it and maybe some other more advanced LLMs like Opus as shown above, I will include it anyway for posterity, as I had already written most of it (excepting the parenthetical statements referencing GPT-4o's correct result) before doing the actual testing:)

But we can still see that while LLMs are clearly not strictly limited by the patterns they've learned, (many of) them do have a greater tendency to be "hypnotized" by them (which is still true of less intelligent LLMs than 4o and probably even 4o on harder questions) and follow them to their incorrect standard conclusions even when contradictory information is present. Why? (And continuing from the original parentheses about 4o above, I think this question is still important as it potentially explains part of how they trained 4o to be smarter than 4 and the average LLM picnic basket in general and also able to solve such corrupted riddles.)

To me the answer is simple: not any inherent limitations of LLMs, but simply a matter of training/finetuning/etc. LLM creators make them to impress and aid users, so they're trained on the questions those creators expect human users to ask, which includes far more riddles in standard form than deliberately remixed riddles intended specifically to confuse AIs. (Your average ChatGPT user is probably never going to think to ask something like that. And even for amateur benchmarker types like you might see on /r/LocalLLaMa (who I think LLM creators also like to impress somewhat as they're the ones who spread the word about them to average users, acting as taste makers and trendsetters), it's still a somewhat new practice, keeping in mind that the lag on training new LLMs to accommodate new trends can be as high as like 6-12 months.)

It's why LLMs of the past focused on improving Wikipedia-dumping capabilities, because "Explain quantum mechanics to me." was a standard casual LLM benchmark like a year ago, and users were really impressed when the LLM could spit the Wikipedia article about quantum mechanics back out at them. Now, on the contrary, we see LLMs like Command R+ trained for brevity in responses instead, because users have moved beyond being impressed by LLMs writing loquacious NYT articles and instead are more focused on hard logical/analytical/symbolic manipulation skills.

If LLM creators anticipated that people might ask them corrupted riddles instead, then they could probably train LLMs on them and achieve improved performance (which ClosedAI may have very well done with GPT-4o as the results show). This may even improve overall novel reasoning and analytical skills. (And maybe that's part of the secret sauce of GPT-4o? That's why I think this is still important.)

tl;dr: I disagree with both @Quantumfreakonomics and @Gillitrut here. Corrupted riddles are a totally valid way of testing AI/LLMs... but GPT-4o also gets the corrupted riddle right, so the whole dilemma overall is immediately invalidated at least as regards it and the theoretical capabilities of current LLMs.

(The only thing I changed about the wording of the original riddle is to clarify that you must take two entities across the river every time, not simply that you can, because even though can is how /u/Gillitrut originally phrased it, must is how he seems to be interpreting it with "Note that the very first step violates the rules!" in response to the LLM he asked taking only one entity across in its first step.)

The problem with the solution given by the AI in Gillitrut's test isn't how many entities are taken across; it's that the AI immediately leaves the fox alone with the feed. That would be fine under the standard formulation of the problem, but under the wording actually given, it leads to the fox eating the feed.

Well, that too. But he also seems to think that taking only one entity across violates the rules as he stated them.

And yet the funny thing is... (premium) GPT-4o also gets this (at least as tested) "raw", without any additional instructions warning it about a trick question.

I have a premium account. It did not get it right, even after I pointed out its errors three times.

Weird. The person I asked pasted to me what's in my post like 10 seconds after I pasted them the prompt (and their first exposure to the riddle), so I highly doubt they made up all of the specific AI-styled formatting in that amount of time. They in fact then argued with me for a bit that it wasn't even the right answer, until I explained why they had misread it and it was. (They got confused and thought the AI had transported the chicken independently at one point.)

I do think these companies A/B test and do a lot of random stuff, so maybe for all we know their account's "GPT-4o" is secretly GPT-5 or something.

I tend to agree with you in the broad sense, but having a soft spot for empirical methods, I decided to test @Gillitrut's puzzle on GPT-4o, but with a neutrally-worded question, to avoid the Stroop Effect:

I am trying to take a grog, a grindle, and a grue across a river; however, I can only take two of them at a time. If the grog is left alone with the grue, the grog will eat the grue. If the grindle is left alone with the grue, the grue will eat the grindle. How can I get all three across the river?

GPT flubbed it up and offered the following solution:


To solve this puzzle, you need to ensure that neither the grog is left alone with the grue nor the grindle is left alone with the grue at any point. Here’s a step-by-step solution:

Step 1. Take the grue and the grindle across the river.

  • On the other side: grue, grindle
  • On the original side: grog

Step 2. Leave the grue on the other side and take the grindle back to the original side.

  • On the other side: grue
  • On the original side: grog, grindle

Step 3. Take the grog and the grindle across the river.

  • On the other side: grue
  • On the original side: (empty)
  • On the boat: grog, grindle

Step 4. Leave the grog on the other side and take the grindle back to the original side.

  • On the other side: grue, grog
  • On the original side: grindle

Step 5. Take the grue and the grindle across the river.

  • On the other side: grog
  • On the boat: grue, grindle
  • On the original side: (empty)

Step 6. Return alone to the original side to pick up the grog.

  • On the other side: grue, grindle
  • On the boat: (empty)
  • On the original side: grog

Step 7. Finally, take the grog across the river.

  • On the other side: grue, grindle, grog
  • On the original side: (empty)

Following these steps ensures that at no point are the grog and the grue left alone together, nor the grindle and the grue left alone together.


GPT-4o was completely unable to solve the puzzle after multiple attempts. When challenged, it would readily admit that its proposed answer was wrong, but was repeatedly unable (worse than random chance) to correctly respond to: "Identify the first step in your previous response that is wrong."

I expected it to do better.

NOTE: Meanwhile, I flubbed the test of using The Motte's markdown formatting. Took me two minutes to do the puzzle test on GPT, and more than a quarter hour to cipher out the syntax to make the results (a numbered list) legible here.

Worth noting that GPT-4o (the currently available text-only version, that is) is less intelligent than GPT-4; it's just much faster and more efficient in terms of compute, i.e. cheaper. Would be worth testing with GPT-4.

Good suggestion. I tested the same puzzle on GPT-4 instead of GPT-4o. It generated a correct solution on the first attempt:

To successfully transport all three creatures (grog, grindle, and grue) across the river without any of them eating each other, you can follow these steps:

  1. First trip: Take the grog and the grue across the river.
  2. Return trip: Leave the grog on the other side and bring the grue back.
  3. Second trip: Take the grindle across the river.
  4. Return trip: Leave the grindle on the other side and bring the grog back.
  5. Final trip: Take the grog and the grue across the river again.

Now, all three are on the other side of the river safely. At no point were the grog and the grue left alone without you present, nor were the grindle and the grue left alone.

I was looking for it to just take the grog and grindle on the first trip (since they cannot eat each other) and then return for the grue, but it may have interpreted the rules as requiring that two creatures be taken at a time (as @confuciuscorndog suggested should be formally stated). Those rules would probably pattern match better to its training corpus.
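
For what it's worth, here's a small breadth-first search over the puzzle as I worded it, reading "can only take two of them at a time" as a capacity limit of at most two passengers per crossing (my interpretation, not necessarily the one GPT settled on). Under that reading the shortest plan is the three-crossing one I was looking for; GPT-4's five-crossing answer is valid too, just longer.

```python
from collections import deque
from itertools import combinations

CREATURES = frozenset({"grog", "grindle", "grue"})
# The grog eats the grue, and the grue eats the grindle, when left alone together.
BAD = [frozenset({"grog", "grue"}), frozenset({"grindle", "grue"})]

def safe(bank):
    """A bank the human is not on must not contain a pair that eats."""
    return not any(pair <= bank for pair in BAD)

def solve():
    # State: (creatures on the near bank, which bank the human is on).
    start, goal = (CREATURES, "near"), (frozenset(), "far")
    queue, seen = deque([(start, [])]), {start}
    while queue:
        (near, side), path = queue.popleft()
        if (near, side) == goal:
            return path
        here = near if side == "near" else CREATURES - near
        # The boat may carry 0, 1, or 2 creatures per crossing.
        moves = [()] + [(c,) for c in here] + list(combinations(sorted(here), 2))
        for passengers in moves:
            new_near = (near - set(passengers) if side == "near"
                        else near | set(passengers))
            left_behind = new_near if side == "near" else CREATURES - new_near
            if not safe(left_behind):
                continue
            state = (new_near, "far" if side == "near" else "near")
            if state not in seen:
                seen.add(state)
                queue.append((state, path + [list(passengers)]))
    return None

print(solve())
# Prints one shortest plan: three crossings, e.g. grog and grindle across
# together, return alone, then the grue (ferrying the grue first also works).
```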

The only thing I can suggest is trying to change "can only" to "must" as I did. Other than that I'm not sure why the results everyone else is receiving are so different from what my contact received, though as I explain in a post above it's very unlikely that they made up the result personally. I can try asking them next time I talk to them if there's anything that could be different about their ClosedAI account that might explain it.