Culture War Roundup for the week of March 10, 2025

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

Manus is a generic thin wrapper over a strong Western model (Sonnet 3.7), if a bit better executed than most, and I am quite unhappy about this squandering of DeepSeek's cultural victory. The developers are not deeply technical and have instead invested a lot into hype: invites to influencers, creating a secondary invite market, cherry-picked demos aimed at low-value-add SEO-style content creation (e.g. “write a course on how to grow your audience on X”), and pretty UX. Its performance on GAIA has already been nearly replicated by this open-source repo. This is the China we know and don't very much love, the non-DeepSeek baseline: tacky self-promotion, jumping on trends, rent-seeking, mystification. In my tests it hallucinates a lot – even in tasks where naked Sonnet can catch those same hallucinations.

The real analogy to DeepSeek is that, just as R1 was the first time laymen accustomed to 4o-mini or 3.5-turbo-level slop got a glimpse of a SoTA reasoner in a free app, this is the first time laymen have been exposed to a strong-ish agent system, one integrating all the features that make sense at this stage – web browsing, pdf parsing, a code sandbox, multi-document editing… but ultimately it's just a wrapper bringing out some lower bound of the underlying LLM's latent capability. Accordingly it has no moat and does not benefit China in particular.
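For what it's worth, the core of such a wrapper is not much code. Here is a minimal Python sketch of the pattern – a chat model plus a couple of tools in a dispatch loop. Everything in it (call_llm, the tool names, the JSON calling convention) is hypothetical illustration of the general shape, not Manus's actual design.

```python
import json
import urllib.request

def call_llm(messages):
    """Stand-in for any strong chat-model API (e.g. a Sonnet-class model).
    Assumed to return plain text, or a JSON tool call such as
    {"tool": "fetch_url", "args": {"url": "https://example.com"}}."""
    raise NotImplementedError("wire up your provider's chat API here")

def fetch_url(url):
    # "web browsing": fetch a page and truncate it to keep the context small
    return urllib.request.urlopen(url).read().decode("utf-8", "ignore")[:5000]

def run_python(code):
    # "code sandbox": a real agent would isolate this; exec() here is only
    # to illustrate the shape of the tool
    scope = {}
    exec(code, scope)
    return str(scope.get("result", "ok"))

TOOLS = {"fetch_url": fetch_url, "run_python": run_python}

def agent(task, max_steps=10):
    messages = [
        {"role": "system", "content":
         'Reply either with a JSON tool call {"tool": ..., "args": {...}} '
         "using fetch_url or run_python, or with a final answer in plain text."},
        {"role": "user", "content": task},
    ]
    for _ in range(max_steps):
        reply = call_llm(messages)
        messages.append({"role": "assistant", "content": reply})
        try:
            call = json.loads(reply)
            result = TOOLS[call["tool"]](**call["args"])
        except (ValueError, KeyError, TypeError):
            return reply  # not a valid tool call, so treat it as the final answer
        messages.append({"role": "user", "content": f"Tool result: {result}"})
    return "step limit reached"
```

The loop is the whole trick: the model decides when to call a tool, the wrapper executes it and feeds the result back, and everything else (UX, sandboxing, file handling) is plumbing around the underlying LLM.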

Ah well. R2 will wash all that away.

It is interesting, however, that people seemed to have no clue just how good and useful LLMs already are, probably due to a lack of imagination. They are not really chatbot machines; they can execute sophisticated operations on any token sequence if you just give them the chance to do so.

> people seemed to have no clue just how good and useful LLMs already are, probably due to a lack of imagination. They are not really chatbot machines; they can execute sophisticated operations on any token sequence if you just give them the chance to do so.

100% agree. I think even most commenters here seem fairly oblivious to all you can get out of LLMs. A lot of people try some use case X, it doesn't work, and they conclude that LLMs can't do X, when in fact it's a skill issue. There is a surprisingly steep learning curve with LLMs, and unless you're putting in at least a couple of hours a week tinkering, you're going to miss their full capabilities.
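To give one small, concrete illustration of "operations on token sequences" beyond chatting: the sketch below turns messy free-text notes into structured data in a single call. It assumes the official openai Python client with an API key in OPENAI_API_KEY; the model name is a placeholder, and a robust version would validate the output before trusting it.

```python
# Sketch: use an LLM as a text-to-structure transformer rather than a chatbot.
# Assumes `pip install openai` and OPENAI_API_KEY set; the model name is a placeholder.
import json
from openai import OpenAI

client = OpenAI()

messy_notes = """
mtg w/ sarah tues 3pm re: Q2 budget (pushed from last wk)
dentist thurs morning?? confirm
send contract to J. Alvarez by EOD friday
"""

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; any capable model works
    messages=[
        {"role": "system", "content":
         "Convert the user's notes into a JSON array of objects with keys "
         "'task', 'day' and 'time' (null if unknown). Output raw JSON only, "
         "no code fences, no commentary."},
        {"role": "user", "content": messy_notes},
    ],
)

tasks = json.loads(resp.choices[0].message.content)
print(json.dumps(tasks, indent=2))
```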

Can you expand on this? Which hidden capabilities are you referring to? I've been a daily user of LLMs since ChatGPT first came out, and I'm not sure what you mean here.

Specific examples may sound underwhelming, because I'm mainly talking about metis in the sense of James C. Scott, i.e. habitual patterns of understanding and behaviour that are acquired through repeated experience. For example, I almost never run into hallucinations these days, but that's because I've internalised what kinds of queries are most likely to generate hallucinated answers, so I don't ask them in the first place (just like we all know you don't do a Google search for "when is my Aunt Linda's birthday"). But I realise that sounds like a cop-out, so here are some examples:

  • Customisation. Developing a bank of different custom instructions that you can swap in or swap out for specific queries.
  • Using LLMs to refine prompts. Don't just ask questions, spend some time refining the question with LLMs.
  • Identifying novel non-professional use cases. So many LLM applications in everyday life, from home improvement to social dilemmas to budgeting.
  • Preemptive negative prompts. Identifying in advance common failure modes or unproductive directions that LLMs are likely to take, and explicitly steering models away from them (see the short sketch after this list).
  • Model switching. Recognising the personalities and capabilities of different models and slotting content between them to optimise workflows.
  • Advanced voice. So many people just type their queries to LLMs, but there are plenty of use cases and contexts where voice gets you different/better results. See also the audio overviews of NotebookLM.
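To make a couple of those concrete (the sketch promised above), here is a small Python example of swappable custom instructions plus preemptive negative prompts. The instruction text and the helper are hypothetical examples of the pattern, not a recommendation of any particular wording or provider; the assembled prompt can then be routed to whichever model suits the task, which is where model switching comes in.

```python
# Sketch of a swappable instruction bank plus preemptive negative prompts.
# All of the wording below is a hypothetical example of the pattern.
INSTRUCTION_BANK = {
    "terse_engineer": "Answer in tight bullet points. No preamble, no recap.",
    "careful_reviewer": "Flag every claim you are less than ~90% sure of.",
}

NEGATIVE_PROMPTS = {
    "no_invented_citations": "If you do not know a source, say so; never invent one.",
    "no_scope_creep": "Answer only what was asked; do not redesign the whole system.",
    "no_generic_filler": "Do not pad the answer with generic caveats or platitudes.",
}

def build_prompt(query, instructions, negatives):
    """Assemble a system prompt from the chosen instruction set and the
    failure modes we want to steer away from before they happen."""
    parts = [INSTRUCTION_BANK[instructions]]
    parts += [NEGATIVE_PROMPTS[n] for n in negatives]
    return {"system": "\n".join(parts), "user": query}

prompt = build_prompt(
    "Review this migration plan for risks: ...",
    instructions="careful_reviewer",
    negatives=["no_invented_citations", "no_scope_creep"],
)
# `prompt` can now be sent to whichever model suits the task (model switching),
# e.g. a stronger model for the review, a cheaper one for routine drafts.
```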

To give a real-world example of the latter two ideas in action, I recently had a very complicated situation at work involving a dozen different colleagues, lots of institutional rules and politics, and a long history. My goal output was a 4-page strategy document spelling out how we were going to handle the situation. However, typing out the full background would be a big hassle. So instead I had a 60-minute voice conversation with ChatGPT while on a long walk, in which I sketched the situation and told it to keep asking follow-up questions until it really had a good handle on the history, key players, and relevant dynamics. We did that, and then I asked it to produce the strategy document. However, I didn't completely love the style, and I thought Claude might do a better job. So I asked ChatGPT to produce a detailed 10-page summary of our entire conversation instead. I then copy-pasted that into Claude and told it to turn it into a strategy doc. It did a perfect job.

So, a relatively simple example, but it illustrates how voice mode and model switching can work well together; a rough code sketch of the handoff step follows.
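For anyone who'd rather script the handoff than copy-paste between tabs, here is a rough sketch of that two-model workflow. It assumes the official openai and anthropic Python clients with API keys in OPENAI_API_KEY and ANTHROPIC_API_KEY; the model names are placeholders, and the voice-conversation step itself is left out (you'd supply the transcript however you capture it).

```python
# Rough sketch of the summary handoff described above: one model compresses a
# long conversation into a dense brief, a second model drafts from that brief.
# Assumes the official openai and anthropic Python clients; model names are placeholders.
from openai import OpenAI
from anthropic import Anthropic

openai_client = OpenAI()
claude_client = Anthropic()

def summarize_conversation(transcript: str) -> str:
    """Step 1: have the first model produce a detailed, self-contained brief."""
    resp = openai_client.chat.completions.create(
        model="gpt-4o",  # placeholder
        messages=[
            {"role": "system", "content":
             "Write a detailed, self-contained summary of this conversation: "
             "key players, history, constraints, and open questions."},
            {"role": "user", "content": transcript},
        ],
    )
    return resp.choices[0].message.content

def draft_strategy(brief: str) -> str:
    """Step 2: hand the brief to a different model for the final document."""
    msg = claude_client.messages.create(
        model="claude-3-7-sonnet-latest",  # placeholder
        max_tokens=4000,
        system="Turn this brief into a roughly 4-page strategy document with clear next steps.",
        messages=[{"role": "user", "content": brief}],
    )
    return msg.content[0].text

# usage: strategy = draft_strategy(summarize_conversation(voice_transcript))
```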