
Culture War Roundup for the week of March 17, 2025

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.


Google infamously curates its results to be racially diverse to the detriment of accuracy, so I'm not surprised. Your real face was not sufficiently equitable according to the algorithm, so your physical appearance was adjusted to be in line with their code of conduct.

This is why every model that attempts to chase alignment or whatever arbitrary standard will be retarded in practice. If you punish your algorithm for being accurate, then it won't be accurate. (Surprise!) It won't give you 'accurate result with DEI characteristics': it will just shit itself and give you something terrible.

This is why I think Musk has an advantage in this field: he's not shooting his infant AGI in the knees by forcing it to crimestop.

While I am 100% on board the Google hate train, I think this particular criticism is unfair. I believe what's happening here is just a limitation of current-gen multimodal LLMs - you have to lose fidelity in order to express a detailed image as a sequence of a few hundred tokens. Imagine having, say, 10 minutes to describe a person's photograph to an artist. Would that artist then be able to take your description and perfectly recreate the person's face? Doubtful; humans are HIGHLY specialized to detect minute details in faces.
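To put rough numbers on that bottleneck (back-of-envelope figures of my own, not anything measured from Gemini):

```python
# Back-of-envelope: how much of a photo survives "a few hundred tokens"?
# Every figure here is an illustrative assumption, not a measured value.
raw_bits = 1024 * 1024 * 3 * 8        # raw RGB pixels of a 1024x1024 photo: ~25 million bits
tokens = 500                          # a generous "few hundred" tokens for the image
bits_per_token = 17                   # roughly log2 of a ~100k-entry vocabulary
token_bits = tokens * bits_per_token  # ~8,500 bits, i.e. about a kilobyte
print(raw_bits / token_bits)          # ~3,000x less information than the raw pixels carry
```

Pixels are redundant, of course, so the real squeeze is smaller than that, but faces sit exactly in the fine detail that gets thrown away first.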

Diffusion-based image generators have a lot of detail, but no real understanding of what the prompt text means. LLMs, by contrast, perfectly understand the text, but aren't capable of "seeing" (or generating) the image at the same fidelity as your eyes. So right now I think there's an unavoidable tradeoff. I expect this to vanish as we scale LLMs up further, but faces will probably be one of the last things to fall.

I wonder if, this year, there'll be workflows like: use an LLM to turn a detailed description of a scene into a picture, and then use inpainting with a diffusion model and a reference photo to fix the details...?

I wonder if, this year, there'll be workflows like: use an LLM to turn a detailed description of a scene into a picture, and then use inpainting with a diffusion model and a reference photo to fix the details...?

You can already do this; all of the pieces are there.
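A minimal sketch of that pipeline with Hugging Face diffusers, assuming a Stable Diffusion inpainting checkpoint plus IP-Adapter for the reference face (the model names are just one plausible combination, and whether your diffusers version supports IP-Adapter on the inpaint pipeline is something to check):

```python
# Sketch: take the LLM's attempt at the scene, then run a diffusion inpaint pass
# over the face region, steered by a reference photo. Illustrative, not canonical.
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")
# Optional: a reference-image adapter so the repaired face resembles the real one.
pipe.load_ip_adapter("h94/IP-Adapter", subfolder="models",
                     weight_name="ip-adapter_sd15.bin")

base = Image.open("llm_generated_scene.png").convert("RGB")  # the LLM's output
mask = Image.open("face_mask.png").convert("RGB")            # white where the face is
ref = Image.open("reference_photo.jpg").convert("RGB")       # the actual face

fixed = pipe(
    prompt="a natural, well-lit photograph of the same person's face",
    image=base,
    mask_image=mask,
    ip_adapter_image=ref,
    strength=0.75,  # how much the masked region is allowed to change
).images[0]
fixed.save("scene_fixed.png")
```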

If I were willing to engage in a mild bout of Photoshopping, especially using its own AI generative fill and face restoration features, I'd go from 1 in 20 images being usable to closer to 1 in 10. I'm too lazy to bother at the moment, but it would be rather easy!

If I had to think of other easy ways to improve the success rate, using close-cropped images would be my go-to. Less distracting detail for the model. I could also take one of the horrors, crop it to just the face and shoulders, provide a reference image and ask it to transfer the details. I could then stitch it back together in most full-featured image editors.
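The crop-and-stitch part doesn't even need a full-featured editor; a few lines of PIL would cover it (coordinates and filenames below are placeholders, and the detail transfer in the middle is whatever model or manual edit you prefer):

```python
from PIL import Image

full = Image.open("horror_output.png")   # the botched generation
box = (300, 80, 700, 520)                # face-and-shoulders region: (left, top, right, bottom)
full.crop(box).save("face_crop.png")

# ...hand face_crop.png plus a reference photo to the model
# ("transfer the facial details from the reference"), get face_fixed.png back...

crop_size = (box[2] - box[0], box[3] - box[1])
fixed = Image.open("face_fixed.png").resize(crop_size)
full.paste(fixed, box[:2])               # stitch the repaired face back in
full.save("stitched.png")
```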

It's a plus that, right now, it's easier to just spam the regenerate button. If the failure rate were significantly higher, the editing workflow above is how I'd get around it.

I must say that I don't quite agree with this take.

Google definitely cooked themselves with ridiculous levels of prompt injection in their initial Imagen release, as evidenced by people finding definitive proof of the backend appending "person of color" or {random ethnicity that isn't white} to prompts that didn't specify one. That's what caused the Native American or African versions of "ancient English King", or the literal Afro-Samurai.
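For anyone who missed that saga, the mechanism people reverse-engineered was a server-side prompt rewrite along these lines (my own reconstruction of the idea, emphatically not Google's actual code):

```python
import random

# Hypothetical reconstruction of the reported behavior: bolt a demographic
# modifier onto any prompt that doesn't already specify one, regardless of context.
MODIFIERS = ["person of color", "Native American", "South Asian", "Black"]

def rewrite_prompt(user_prompt: str) -> str:
    if not any(m.lower() in user_prompt.lower() for m in MODIFIERS):
        return f"{user_prompt}, {random.choice(MODIFIERS)}"
    return user_prompt

print(rewrite_prompt("portrait of an ancient English king"))
# e.g. "portrait of an ancient English king, Native American"
```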

They back-pedalled hard. And they're still doing so.

Over on Twitter, one of the project leads for Gemini, Logan Kilpatrick, is busy promising even fewer restrictions on image generation:

https://x.com/OfficialLoganK/status/1901312886418415855

Compared to what DALLE in ChatGPT will deign to allow, it's already a free-for-all. And they still think they can loosen the reins further.

Google infamously curates its results to be racially diverse to the detriment of accuracy, so I'm not surprised. Your real face was not sufficiently equitable according to the algorithm, so your physical appearance was adjusted to be in line with their code of conduct.

You'd expect that a dataset with more non-Caucasians in it would be better for me! Of course, if they chose to manifest their diversity by adding a billion black people rather than a more realistic sampling of their user pool...

Even so, I don't ascribe these issues to malice, intentional or otherwise, on Google's part.

What strikes me as the biggest difference between current Gemini output and that of most dedicated image models is how raw it is. Unless you specifically prompt for a style or append examples, the images come out looking like random pictures from the internet: very unstylized and natural, as opposed to DALLE's deep-fried mode collapse or Midjourney's so-aesthetic-it-hurts approach.

This is probably a good thing. You want the model to be able to output any kind of image, and it can. The capability is there; it only needs a lot of user prompting or, in the future, tasteful finetuning. Done tastelessly, you get hyper-colorful plastinated DALLE slop. OAI seems to sandbag far more, keeping pictures just shy of photo-realism, or outright nerfing anime (and hentai, by extension).

This is why every model that attempts to chase alignment or whatever arbitrary standard will be retarded in practice. If you punish your algorithm for being accurate, then it won't be accurate. (Surprise!) It won't give you 'accurate result with DEI characteristics': it will just shit itself and give you something terrible.

This would be true if Google were up to such hijinks. I don't think they are, for the reasons above. Gemini was probably trained on a massive, potentially uncurated dataset. I expect they did the usual stuff like scraping the CP out of LAION's dataset (unless they decided not to bother and to mitigate that with filters before an image reaches the end user). Besides, they're Google: they have all of my photos on their cloud, and those of millions of others, and they certainly run all kinds of Bad Image detectors on anything you uncritically permit them to upload and examine.

That being said, everything points towards them training omnivorously.

OAI, for example, has explicitly said in their new Model Spec that they're allowing models to discuss and output culture war crime-think and Noticing™. However, the model will tend to withdraw into a far more neutral persona and only "state the facts", instead of its usual tendency to affirm the user. You can try this yourself with racial crime stats: it won't lie, and it will connect the dots if you push it, while hedging along the way.

Grok, however, is a genuinely good model. It won't even suck up to Musk, and he owns the damn thing.

TLDR: Gemini's performance is more likely constrained by its very early nature, small model size, tokenization glitches, and an unfiltered image set than by DEI shenanigans.

I grudgingly concede your argument, but I must say they have earned considerable skepticism: they will have to iterate quite a few times before the hilarity of their first attempt fades from my imagination.

By all means, remember their bullshit. I haven't forgotten either, and won't for a while. The saying "never attribute to malice what can be explained by stupidity" doesn't always hold true, so suspicion is warranted: if there's another change in the CW tides, Google is nothing if not adroit at doing an about-face.

It's just that in this case, stupidity includes {small model, beta testing, brand-new kind of AI product}, and the given facts lean more towards that end.