@Aransentin comments on "Culture War Roundup for the week of August 12, 2024"

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

Shaming.
Attempting to 'build consensus' or enforce ideological conformity.
Making sweeping generalizations to vilify a group you dislike.
Recruiting for a cause.
Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.
Be as precise and charitable as you can. Don't paraphrase unflatteringly.
Don't imply that someone said something they did not say, even if you think it follows from what they said.
Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

Aransentin p ≥ 0.05 zombie 3mo ago · Edited 3mo ago

It should be noted that the Grok image generation is just a wrapper calling the open "Flux" model behind the scenes: https://x.com/bfl_ml/status/1823614223622062151

Anything Grok can generate for you, you can generate yourself manually on your own computer (given a sufficiently beefy GPU) with zero guardrails since you can give it any text you want.

Chrisprattalpharaptr Ave Imperaptor Aransentin 3mo ago

How beefy? I was looking into running AlphaFold or some of the other structural bio models a year or two ago, and we were talking like 20k. Is it easier to run inference on the image generation models, or was I just stupid?

erwgv3g34 Chrisprattalpharaptr 3mo ago

Both local FLUX models require 24 GB of VRAM uncompressed. You can buy a used 3090 with that much for $750 or a new one for $1,200. Or just rent time from an online GPU company like RunPod. And that's before you start getting into the quantized models; FLUX is really affordable.

ThisIsSin The boob tube emits harmful XXX-rays erwgv3g34 3mo ago

Honestly I’m amazed that the 4070, being a 3090 with half the RAM, sold as well as it has. Though I stop being amazed once I talk to people that by all rights should know better.

It’s worth noting that even with a 3090 it’s still running in low-RAM mode anyway since the encoder has to fit in there too; but you can get it running better if you care to optimize it (or just use the 3090 as a secondary GPU) or just use the 8-bit quantization. It takes around 40 seconds per image at 1024px square.

gattsuru ThisIsSin 3mo ago

You're still looking at ~850 USD for a new 3090 today, compared to around 600 USD for an Nvidia 4070, plus the increased size and power/heat. Not a great deal if you need the VRAM, but it's not clear you do need it yet. I'd err in favor of futureproofing, but if you've got One Game You're Gonna Play seriously for the next couple of years, I could see the argument.

Though the various suffixes are more clearly dumb. The 4070 Ti Super ~~Omega Super Saiyan Blue~~ is at the same price point, nearly the same size, and has a similar power profile, so it looks like you're trading slightly better DLSS for less VRAM?

ThisIsSin The boob tube emits harmful XXX-rays gattsuru 3mo ago

I do beg to differ on the VRAM, specifically because other GPUs in the same price class all carry 24GB (and any next-gen console is going to have unified memory or at least 16GB of VRAM, I suspect), and after that it’s just a matter of wanting it now, not after optimization. And while I do get that you’re still not fitting a 70B LLM in 24GB, it’s still going to have an impact as far as needing to swap with main memory goes, or at least that’s my impression.

It’s also worth noting that a 3090 is no less capable in AI tasks than a 4090, 5090, or 6090 will be, simply because a card with more VRAM than 24GB won’t be released for a long, long time due to market segmentation.

The market seems to have caught onto that fact, unfortunately.
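
A rough back-of-envelope sketch of the "70B LLM in 24GB" point above. It counts weights only (no KV cache, activations, or quantization overhead), and the helper names `weights_gb` and `fits` are made up for illustration:

```python
# Weights-only estimate: parameters (in billions) times bytes per weight
# gives gigabytes directly, since 1e9 params * 1 byte = 1 GB (10^9 bytes).
def weights_gb(params_billion: float, bits_per_weight: int) -> float:
    return params_billion * bits_per_weight / 8

def fits(params_billion: float, bits_per_weight: int, vram_gb: float) -> bool:
    # Assumption: ignores KV cache, activations and runtime overhead,
    # so real-world headroom is smaller than this suggests.
    return weights_gb(params_billion, bits_per_weight) <= vram_gb

VRAM_GB = 24  # a 3090-class card

for bits in (16, 8, 4):
    size = weights_gb(70, bits)
    print(f"70B at {bits}-bit: ~{size:.0f} GB, fits in {VRAM_GB} GB: {fits(70, bits, VRAM_GB)}")

# Largest weights-only model that squeezes into 24 GB at 4 bits per weight.
max_params_4bit = VRAM_GB * 8 / 4
print(f"Upper bound at 4-bit: ~{max_params_4bit:.0f}B parameters")
```

Even at 4 bits per weight the 70B model comes out around 35 GB, so some of it has to spill to main memory on a 24GB card, which is where the swapping cost comes from.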

ActuallyATleilaxuGhola Axolotl Tank Class of '21 ThisIsSin 3mo ago

a card with more VRAM than 24GB won’t be released for a long, long time due to market segmentation.

What does this mean?

urquan ActuallyATleilaxuGhola 3mo ago

They want to be able to charge enterprises and server operators massively more than ordinary consumers, even for very similar silicon, because they can afford it, and the product is more valuable to enterprises doing work on GPUs than to consumers playing Hogwarts Legacy or dicking around with open source AI models.

The goal is to charge each customer as close to their maximum price as possible. That's much higher for enterprises, so the graphics companies load features that are especially valuable to them on specialty cards that cost much, much more than consumer cards without them. Much of this strategy has collapsed lately, with features like GPU VM passthrough coming to consumer cards, so they've reoriented toward promoting clusters of cards to get max VRAM for enterprises, which consumer platforms cannot accommodate.

CPU manufacturers do it too, which is why it's very hard to use ECC RAM on a computer unless you shell out for a workstation/server grade platform.

VoxelVexillologist Multidimensional Radical Centrist urquan 3mo ago

than to consumers playing Hogwarts Legacy or dicking around with open source AI models.

Somewhere in here is an idea to prompt AAA game studios to develop games that require huge amounts of VRAM so that GPU manufacturers are elbowed into offering consumer cards that can do this. But that will take time, and for all I know looks like some sort of time-persistent AI shading model ("game rendered in the style of Van Gogh").

ThisIsSin The boob tube emits harmful XXX-rays ActuallyATleilaxuGhola 3mo ago

It is in nVidia’s interest that cheap GPUs don’t get very much VRAM, because the limiting factor for how smart and performant an LLM/image generation model can be is how fast it can access the matrices. If you could get 80GB of VRAM, which is eminently reasonable, on a card for 2000 dollars then no business would buy the overpriced purpose-built cards.

gattsuru Chrisprattalpharaptr 3mo ago

Most image generators are a good deal more manageable: FLUX.1 dev is 24GB, but people have got it running on mobile GPUs with less than 4GB VRAM. It's slower -- a couple minutes per generation, as opposed to the <20 sec for running on an Nvidia 3060 or equivalent -- but it's usable.

A good part of that is just that image generation models quantize better than LLMs, without becoming as 'dumb', so you can run down to fp8 (8-bit) with relatively little loss of information, and nf4 is good enough for a lot of uses even if notably different. But imagegen has also had a lot more software work done on partial staging and some CPU offloading, in the casual sphere.
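
For a sense of scale, here is a minimal sketch of what those quantization levels do to the weight footprint. It assumes FLUX.1 dev has roughly 12 billion parameters (a figure not given in this thread, though it lines up with the 24GB uncompressed number) and counts only the main model's weights, not the text encoder, VAE, or activations:

```python
# Assumption: ~12 billion parameters for FLUX.1 dev (not stated in the thread).
PARAMS = 12e9

# Bits stored per weight at each precision. nf4 is treated as a flat 4 bits,
# ignoring the small extra cost of per-block quantization constants.
BITS_PER_WEIGHT = {"bf16": 16, "fp8": 8, "nf4": 4}

def weight_gb(params: float, bits: int) -> float:
    """Approximate weight storage in GB (10^9 bytes), overhead ignored."""
    return params * bits / 8 / 1e9

footprints = {name: weight_gb(PARAMS, bits) for name, bits in BITS_PER_WEIGHT.items()}

for name, gb in footprints.items():
    print(f"{name}: ~{gb:.0f} GB")
# bf16: ~24 GB
# fp8: ~12 GB
# nf4: ~6 GB
```

Getting under 4GB takes more than nf4 alone on these numbers, which is presumably where the partial staging and CPU offloading come in.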

07mk Chrisprattalpharaptr 3mo ago

I don't know about Grok's image gen specifically, but having used Stable Diffusion for almost 2 years now, I can tell you that the cutting edge image generation stuff can be run reasonably fast (about 30-60s per batch of four 512x512 images) on a 5 year old gaming PC with an Nvidia 1070 that I was already using as my home computer, without any upgrades. I did upgrade to a more modern gaming PC with a 4090 last year, which can do a batch of four 512x512 images in a few seconds. The entire new PC I bought, primarily for gaming, was around $4,500, with the largest chunk of that coming from the GPU, which you could probably cheap out on with a 4080 or a 3090 and still get plenty good performance.

SkoomaDentist 07mk 3mo ago

I’ve been using Stable Diffusion on a 5 year old second hand laptop where the GPU was basically a “well, might as well get it since the extra cost is just 50e” type of thing. Combine that with preconfigured uncensored cloud rental services and unrestricted image generation is ridiculously affordable if you care at all.

2D3D Aransentin 3mo ago

Honestly, I see a lot of handwringing about the potential of AI and how we are crippling AI's true end state through woke/chud moral tainting of these computer models, but you can go to pixiv to see the absolute tsunami of unrestrained AI-generated content there, and the Chinese are even more degenerate. Unrestrained AI exists; it's just too useless at this moment to do anything other than churn out degenerate pornography.

Blueberry Aransentin 3mo ago

Their pro version (used by X?) is closed source though:

https://blog.fal.ai/flux-the-largest-open-sourced-text2img-model-now-available-on-fal/

FLUX.1 [pro]: A closed-source version only available through API. fal Playground here.

Aransentin p ≥ 0.05 zombie Blueberry 3mo ago

Yeah, the question is if Grok is really using that, and if so how much better it is compared to Dev. It could even plausibly be worse if it's simply a bigger variant of Schnell.
