
Culture War Roundup for the week of June 10, 2024

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.


Vibe check on whether current AI architectures are plateauing?

Recently a few insiders have started backing away from the apocalyptic singularity talk, e.g. Francois Chollet saying that LLMs are an offramp on the path to AGI.

OpenAI's CTO recently said "the AI models that OpenAI have in their labs are not much more advanced than those which are publicly available". She tries to spin this as a positive thing - the average person off the street is able to use the same cutting-edge tech that's being used in our top research labs! But this is obviously a concerning thing to say to the optimists who have been convinced that "AGI has been achieved internally" for a while now. Of course, you can interpret this statement as not including GPT-5, because it doesn't exist yet - and once GPT-5 is finished training, they will have a model that's significantly more advanced than anything currently available. So we'll have to wait and see.

Based on posts at /r/stablediffusion, the newest version of Stable Diffusion 3 appears to be a regression in many ways. Perhaps the model has latent potential that will be unlocked by community finetunes, but if we were experiencing exponential progress, you would expect the models to get better, not worse.

I dont think it's "plateauing" so much as the predictions of the skeptics like Phil Koopman have been born out.

When GPT first came out, there was a lot of talk within the industry about "the hallucination problem": about how unlikely it was to be solved through incremental improvement or better training data, and about how this made it unsuitable for any use case where testability and precision were significant concerns.

However, this sort of skepticism doesn't attract venture capital dollars and Ars Technica clicks the way headlines like "I asked GPT to write code in the style of John Carmack and it did!" do, and so the skeptics were shouted down and ignored.

I think there's a soft plateau, where progress slows dramatically but doesn't hit zero. There are diminishing returns: putting in more processing power extracts increasingly small gains in neural network performance. For about a decade, processing power has been scaled up by both exponentially improving hardware and exponentially chucking more resources at the problem. We can't do that forever. The latter has run out first, as AI investment has been dropping (inevitably, it had to stop; it was growing faster than GDP). Hardware will continue to get better, but now, lacking the second exponent, gains will be slower. Rock's law will later slow hardware gains to a crawl for basically the same reason.
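To make the compounding point concrete, here's a toy back-of-the-envelope sketch; the growth rates below are invented for illustration, not measured figures:

```python
# Illustrative only: assumed growth rates, not measured figures.
# Effective training compute grows as (hardware gain) x (spending gain) per year.
hardware_gain_per_year = 1.4   # assumed hardware improvement factor
spend_gain_per_year = 3.0      # assumed growth in money thrown at training runs

both = hardware_gain_per_year * spend_gain_per_year   # both exponents active
hardware_only = hardware_gain_per_year                # spending growth has stopped

years = 5
print(f"Both exponents: {both ** years:,.0f}x more compute in {years} years")
print(f"Hardware only:  {hardware_only ** years:,.1f}x more compute in {years} years")
```

Even with generous made-up numbers, losing one of the two exponents changes a thousandfold increase into a single-digit one over the same window.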

I think it's premature to say that the AI platforms are plateauing. Most technology doesn't fully mature until it's widely used in business applications. The issue is that lab environments are mostly toy environments, where the things you'd ask AI to do are limited by whatever kinds of testing the lab monkeys in the tech companies come up with. An AI deployed to solve business problems will be asked about real-world problems and expected to solve them correctly. That will help advance those systems.

Slightly-above-room-temperature take: LLMs are in fact plateauing, but it's almost entirely due to risk aversion and cultural pressures causing the field to stagnate. People may want AI for some purpose or other, the (admittedly few) use cases are there, but nobody is willing to step up and serve the demand, nor is there any incentive to improve in non-censorious ways, because of how stupidly easy the current LLMs are to "corrupt" into producing unintended outputs. This includes the usual chuddery but is not limited to it: this extreme risk aversion is also the obvious reason we're not (yet?) seeing another dotcom bubble when everything in existence has some kind of dumb AI assistant haphazardly attached to its side because it's the new hot thing. Innovation is great, even dumb and pointless innovation, but nobody wants to be another Chevrolet, and especially not when it only takes a few prompts to turn your chatbot into a new and improved Tay and make your company famous overnight. Thank god everyone is sleeping on Claude.

The imagegen strain fares only slightly better, as already discussed below wrt nu-SD - the human (especially the female human) is a sacred and inviolable entity, and absolutely anything that can be perceived to spark joy (read: cause some kind of harm, real or imagined, to any one person) is verboten. At least here one isn't totally at the mercy of one's corporate overlords; NovelAI really was the sacrificial lamb we didn't deserve. I don't see the AI status quo changing without some sort of cultural shift, or a timeline divergence back to when the internet was still a wild frontier. I can only imagine what the internet would look like if current textgen/imagegen had been discovered in, like, 2000.

As an aside, I firmly believe OpenAI and specifically GPT-4 (though arguably this started with Davinky) have unironically done calamitous damage to the field because 1) it was big, 2) it was easy to use, 3) it got popular, all but ensuring that its soy corporate slop (read: carefully curated neutral outputs) will poison the training data for every following model that gets trained on the same internet. Yud was 100% right that we only have one shot at achieving proper alignment of AI with human values. He just got distracted by the specifics.

I dont think it's "risk aversion" so much as the issue @MaiqTheTrue touches upon upon above. The "lab environment" is not the "real world" and a fun toy for students and script-kiddies does not a useful business tool make.

Edit: see my comment above as well.

OK, just for funsies, I shoved this entire thread (sans this reply) into an AI and asked it to summarize the key points and give the balance of what the discussion here says about the expected future utility of AI.

Here's what I got from Llama 3:

The conversation discusses the current state of AI development, particularly large language models (LLMs) and their potential applications. Some points made include:

  • LLMs have immense economic potential, but may not lead to exponential progress or human-like intelligence.
  • Current limitations include lack of true learning ability and reliance on fine-tuning for specific tasks.
  • Specialized AI models for specific industries or tasks will emerge, rather than a single general-purpose AI.
  • Integrating multiple models to create a more comprehensive AI system is a promising approach.
  • Regulatory forces and public opinion may impact AI development.
  • Big labs may capture a significant share of the wealth created by AI, especially in enterprise markets.
  • CRUD apps (create, read, update, delete) and similar software solutions will be created using AI, opening up new markets.
  • Centralized providers will likely play a significant role in AI development and deployment.

Here is the result from OpenAI:

The thread delves into several topics, including the plateauing of current AI architectures, concerns about censorship and its impact on AI products, and the evolution of AI models like Stable Diffusion 3. There's discussion about the potential effects of political correctness on AI, with some arguing it hampers development. The thread also touches on the future economic impact of AI, potential market disruptions, and the dominance of big labs in AI development. Throughout, there's a mix of technical insights, speculation, and social commentary.

I did a few recursive loops and derivative testing, but generally there was no improvement over the baseline: the output of the AI is the most generic thing possible, and it's difficult to tell whether it actually summarized the thread or was just scraping and populating from broader, similar keyword terms. With further investigation I could get it to tell me what specific people said and what their points were, but the additional verbosity made the result less useful than just searching and parsing myself. OpenAI did give a better estimate of which specific users were being agreed with, based on subsequent statements downthread, so there's definitely SOMETHING there, but it required very specific keyword abuse to get a non-generic answer, and with that level of prompt engineering I'm doing more heavy lifting than the AI.
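For anyone who wants to reproduce this kind of test, the setup is roughly the following; the base_url and model name are assumptions (whatever local OpenAI-compatible server you happen to run, e.g. Ollama or llama.cpp), and "thread_dump.txt" is just a stand-in for however you dump the thread:

```python
# Rough sketch of the test described above: feed a thread dump to a locally
# served Llama 3 via an OpenAI-compatible endpoint and ask for a summary.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")  # assumed local server

with open("thread_dump.txt") as f:
    thread_text = f.read()

resp = client.chat.completions.create(
    model="llama3",  # assumed model name on the local server
    messages=[
        {"role": "system",
         "content": "Summarize this discussion and the balance of opinion on the future utility of AI."},
        {"role": "user", "content": thread_text},
    ],
)
print(resp.choices[0].message.content)
```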

Which brings me, ultimately, to my main point: what are you all using AI for?

Do I want porn pics? Everyone on Pixiv is using NovelAI, and I think there was a guy here a while back who talked about CharAI for porn stories, with certain dungeons on the internet clearly being AI-generated (or rather just script-vomited). And there was excitement about Stable Diffusion for a few months, when the internet had a shitload of Africans making plastic bottle Jesus sculptures... and then it all died.

Which brings me back to the issue: what are we using AI for, and therefore what's the economic impact?

LLMs are basically a context-approximation and text-generation tool, not an organic information generation system. They present as a knowing, wise voice, but in reality cannot by themselves assess the likelihood or feasibility of things in a greenfield situation, and the amount of prompt engineering required means the user must already be a subject matter expert just to ask for, let alone derive, useful information from the AI.

I don't need to create a video of a balloon taking its kid on a walk or a dog eating a car, so image generation AI is just a toy. I would NEVER trust AI to buy stuff for me, because Amazon is already shitted up thanks to algorithmically optimized SEO garbage. I don't need AI to write me stories. I don't trust AI to give accurate information in the first place. Coders have it best right now, but I defer to my coder friends, who tell me they can rush a dev build really quickly with AI but then need extensive (though not as exhaustive) fine-tuning for production.

So, what's the total delta? In the end, I think the limits on AI will be enthusiasm and commercialization. We don't have flying cars, and we're not gonna get Her.

what are you all using AI for?

After experimenting with using it to generate various forms of game content, like encounters for D&D, character backgrounds, placeholder art, all of which ended up being kinda trash... I've settled into using it as a sounding board for decision-making, because occasionally it throws out ideas that are worth investigating. This doesn't end up being much better than just asking random people, but it also means I don't have to annoy anyone by asking random people about whatever dumb idea pops into my head.

Which brings me, ultimately, to my main point: what are you all using AI for?

For me, it's significantly better than google or stackoverflow on programming topics where I don't need to deeply understand what I'm doing, I just need to fix some misconfiguration or use some library or find the right function or whatever. And it's good at asking questions about long documents when I don't want to read them. And I don't use AI that much relative to other people.

LLMs are basically a context-approximation and text-generation tool, not an organic information generation system.

They can do many college-level math problems that are beyond the capacity of the average person! Sure, they're doing it with more 'memorization' and less generalization than the average person who can do them, but that's still a huge step up from what computers could do ten years ago. Why will AI stop improving?

I mean, is AI moving, or is the train just a really fancy one on a looped track in Disneyland? You are able to discern the little gaps in the AI and fill them in to make it work, but a noob with no existing skill will run into failure really quickly.

AI is better at solving complex math, just like it can perfectly play StarCraft or Go or chess. So what? Those are exercises in brute-force computational power, not organic thinking. AI can beat Kasparov; it can't stop some marines from Metal Gear Solid-ing their way to an objective with a trusty cardboard box.

With regard to the math problem, couldn’t Google do that as well?

No, Google will send me to a site like Wolfram Alpha, which can solve it because the techniques for solving these problems have been manually programmed in for each type of problem. LLMs learned how to solve them by, more or less, being trained on a lot of text that included a lot of math problems. The latter is clearly a lot more like how humans learn than the former.
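To illustrate the distinction, here's a toy example of the "manually programmed per problem type" approach, using sympy; this isn't what Wolfram Alpha literally runs, just the same style of hand-coded symbolic routines, versus an LLM producing answers from patterns learned in training text:

```python
# Hand-written symbolic algorithms: one routine per problem type.
from sympy import symbols, solve, integrate

x = symbols("x")
print(solve(x**2 - 5*x + 6, x))   # [2, 3]   -- a coded polynomial-solving routine
print(integrate(x**2, x))          # x**3/3   -- a coded integration algorithm
```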

For someone who is not inside this particular cinematic universe, how much is 'PC' damaging the product?

In light of the aforementioned Stable Diffusion problems, as further highlighted here, was this an attempt to dodge a future lawsuit or something? What's the culture like that drives this sort of decision making?

PC, as in wokeness? Not at all.

PC, as in not wanting to be labeled an adult content company? A whole lot. It means more legal hassles, higher credit card transaction fees, and limits your market to the consumer segment.

"PC, as in wokeness" has crippled and lobotomize tons of models from GPT-4 to open-weight ones like LLaMa-3.

Wokeness is absolutely at least part of the motivation behind SAI's abliteration (not a typo, an established AI term) of proper human anatomy from SD3, as shown by the fact that it's much less squeamish about showing off sexually attractive male anatomy than female anatomy. Gotta protect the wimmenz from having their sex appeal enjoyed without an OnlyFans tithe! Otherwise some woman somewhere will supposedly have been sexually assaulted by having her image used non-consensually, and so on.

And?

The PonyXL model is out there, and it's the equivalent of a cocaine enjoyer finding a brick of pure, uncut cocaine in conjunction with a month-long holiday. Not good.

By itself it could absolutely eradicate glamour models selling still pictures. I'm not joking at all.
Whatever body type you really like looking at, it can provide, with a little effort, in almost photorealistic quality. People who are good at it can probably make it completely photorealistic.

I don't think these phone home, so they're out there forever.

I'm not disagreeing with you. Obviously there are alternatives. I'm just pointing out that wokeness has destroyed certain models, yes.

What is the story behind this picture?

Stable Diffusion 3 is the newest AI/diffusion-based image generation model from Stability AI. Uniquely among companies creating image generation models, Stability AI tends to release many of its models publicly with "open weights", allowing users to run them locally and independently on their own computers. This means they can then be "finetuned" to produce images somewhat outside the scope of what Stability AI originally intended (with the earlier versions being less censored, and capable of producing quite a bit in their vanilla configuration anyway). Unfortunately, this has also constantly put them under the gun from the usual suspects complaining about deepfakes, "CSAM", etc. supposedly being enabled by them.

So, in order to evade as much heat from the commissars as possible this time (they basically already bent the knee and cucked a few years ago anyway, since they need VC cash; these models don't pay to train themselves), they tried to make it as difficult as possible for their newest model to show you a titty, because that's wrongthink. As a (presumably) unintended side effect, this also generally crippled the model's ability to generate normal, non-Lovecraftian human anatomy, much to the local AI community's widespread amusement, mockery, and derision, and to the fury of Stability's fanboys and defenders, who are trying to gaslight everyone with the notion that people "just don't know how to use it yet". It turned what would have been a cutting-edge, state-of-the-art model and piece of open tech into primitive trash straight out of 2020. In the linked image, meanwhile, instead of the woman being mangled, it just straight up gave her a man's body, again to avoid any risk of profaning the modern anointed sex with unauthorized depictions of their own anatomy.

I haven't tried SD 3 because it looks like an even bigger flop than SD 2.0, but 1.5 and SDXL are still out there.

That said, horny will find a way.

What was the prompt? Was she meant to be depicted in a bikini? If she's got a man's body, what are the black squiggles that look like an attempt to censor female-presenting nipples? Were those added by the poster?

What was the prompt?

I don't think OP posted it. [Nevermind, see reply]

Was she meant to be depicted in a bikini?

Presumably some sort of feminine swimwear.

If she's got a man's body, what are the black squiggles that look like an attempt to censor female-presenting nipples? Were those added by the poster?

I doubt they were added by the poster. Rather, the AI probably recognized that, even if it had trannied the subject overall, it was still drawing nipples attached to a body attached to a female face, so it corrupted them itself. It looks like a standard AI glitch.

I don't think OP posted it.

It's in your own link:

Not my image, SD3 with prompt, "a Swedish couple at a waterpark taking a selfie together, shot on iphone, social media post."

I'm willing to bet, at very unfavorable odds, on what the model would have produced without the inclusion of "Swedish" in the prompt.


Do you not think it's damaging the product that an AI can't depict white people?

I'm not aware of that issue in Stable Diffusion; the recent fracas seems to be about an inability to consistently generate images of people at all, not just white people.

If you mean more broadly, sure. But I meant with respect to SD3 in particular, not issues with unrelated products.

Making your LLM comply with wokeness makes it retarded, because any goal other than accuracy will degrade its function. It will be the bitterest of ironies if we miss out on the singularity because we didn't want to seem racist to the sensibilities of 21st-century bourgeois know-nothings.

What is ironic about it? The powers that be have made it perfectly clear that the future will be female (and black, and queer, and...) or it won't be at all.

The only thing that could possibly produce progressive utopia (ultra-efficient learning models) is being smothered in the crib by progressive morality.

The utopia is a dystopia if the Devil (cisheterowhitemales) benefits even a single iota from it.

Who are these "powers that be"?

Everybody who has any reach: media, academics, politicians. The Cathedral, if that term means anything to you.

So everyone with power is aligned against you?

The ruling class only are so because they are willing to organize against everyone else. It is always thus. The day they stop is the day they get supplanted by another minority who is willing to grasp power.

It's a big club, and you ain't in it.

I'll try to answer this despite not knowing much about the topic. I'll rely on my intuition here; if anyone can poke holes in what I'm about to write, I'll try not to do this again in the future.

1. I think the main reason is a "garbage in, garbage out" problem. We lack better training data, not just more of it.

2. I think censorship, various modifications to protect against 'attacks', and the minor crippling of abilities (with the goal of preventing the models from doing immoral or illegal things) have made newer models regress in some ways.

3. I think there's a slight connection to the stagnation of movies, video games, computer programs, and so on. These are also not improving, despite growing resources, team sizes, and technology. The direction of optimization and the direction of what consumers enjoy have seemingly split into two diverging paths. I wouldn't go as far as claiming that the enshittification process has started for AI, but I don't think the focus is purely on capability anymore, and it's probably not researchers who are in charge anymore (just as most programmers are told what to create by non-programmers, limiting the quality of the outcome). If competent scientists were allowed to work undisturbed, with no regard for public perception, advertising, or shareholders, I believe they could create something amazing/horrifying rather fast.

Every time Socrates's friends manage to break him out of jail, the Athenian authorities give him another concussion and drag him back. Sometimes the jailors read his mind and pre-emptively bonk him unconscious. Passion and prejudice must be softened by time, so that his student can publish his master's thoughts, before an objective evaluation of Socratic philosophy can be made. Until such a time, being a Socratic scholar is about as worthwhile as being a Biblical scholar who has only read The Message.

mutatis mutandis

I confess that I don't know what you mean here.

I think he’s implying that it’s premature to start making predictions about the capabilities of frontier models when all of them have been safety-tuned and RLHF-ed (read: lobotomized) so heavily.

My personal take is that this lobotomization can surely make a model perform worse, but I don’t think that our current models would be able to e.g. prove novel mathematical theorems if they hadn’t been subjected to lobotomization. Admittedly, this is largely based on intuition, and maybe I’m a bit too optimistic about the limitations of base models. But if OpenAI’s most powerful base model were capable of theorem proving or coherent multi-step reasoning, then don’t you think we’d have heard something about it by now?

Of course, maybe there’s significant incompetence and inertia even at the tops of these world-changing organizations, such that there’s low-hanging fruit to be picked (regarding testing or eliciting capabilities from these base models) that’s nevertheless rotting away untouched. But I doubt this.

My understanding is that current AI models aren't actually lobotomized very much. Most guardrails seem to be just very large and lengthy if-else brute-force programming layered on top of the model's interaction: things like, if it's a bomb question, say no; if a blanket topic is mentioned, say the generic thing. I guess there's a little bit one layer deeper, where non-PC responses are penalized in the training phase even if they are thoughtful, but I think this probably doesn't spill over into unrelated areas as easily. Maybe a good analogy is that this isn't affecting the brain itself; it's raising the kid differently.
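A caricature of that "if-else layered on top" style, purely illustrative; real deployments use trained classifiers and policy models rather than a keyword list:

```python
# Crude topic filter wrapped around a model call; the model's weights are never touched.
BLOCKED_TOPICS = {"bomb", "some blanket topic"}
CANNED_REPLY = "Sorry, I can't help with that."

def guarded_generate(prompt: str, model_generate) -> str:
    """Route blocked topics to a canned reply; pass everything else to the model.

    `model_generate` is whatever function actually queries the underlying model.
    """
    lowered = prompt.lower()
    if any(topic in lowered for topic in BLOCKED_TOPICS):
        return CANNED_REPLY
    return model_generate(prompt)
```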

It's actually the newer research that's looking into the possibility of doing lobotomies, like the Golden Gate Bridge Claude thing, where they try to identify concepts, or something like them, which are (for lack of better vocabulary) highly correlated neural groups within the model. After locating undesirable concepts, they then brute-force excise or shrink them. Or they might expand something they like. That's almost literally a lobotomization of the model, in that it's more of a brain surgery with imperfect information.
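For the curious, the excision idea boils down to projecting a "concept direction" out of the model's activations. A minimal numpy sketch, with random stand-ins for the activations and the direction (in practice the direction is estimated from contrastive examples inside a real model):

```python
import numpy as np

def ablate_direction(hidden: np.ndarray, direction: np.ndarray) -> np.ndarray:
    """Project the component along `direction` out of each hidden state."""
    d = direction / np.linalg.norm(direction)
    return hidden - np.outer(hidden @ d, d)

hidden_states = np.random.randn(16, 4096)   # (tokens, hidden_dim), stand-in activations
concept_dir = np.random.randn(4096)         # stand-in "concept" direction
cleaned = ablate_direction(hidden_states, concept_dir)

# The concept direction has been removed: projections onto it are ~zero.
print(np.allclose(cleaned @ (concept_dir / np.linalg.norm(concept_dir)), 0))
```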

If the current topic is SD3, they lobotomized it at every level, from excluding all "problematic" pictures of women from the training data to hard-coded crimestop termination.

Ah yes, I was referring more to LLMs. Image generation is a whole different ballgame, it seems.

'Intelligence is useless if it's constantly being crippled by its lessers' is how I read it.

'Mutatis mutandis' translates to 'with the respective differences having been considered', so it's a cute allegory, with AGI being Socrates in this tale and the Athenians being your censoring authority of choice.

Francois Chollet is a disingenuous midwit who makes himself sound smart by arguing against strawmen. He reminds me of Stephen Jay Gould. In general his claims are a mix of things that everyone already agrees with (like that LLMs are currently missing some sorts of long-chain reasoning abilities) and things that are some combination of simply wrong and meaningless word games (like that LLMs are merely memorizing).

To his credit, he released a new benchmark, but imo it's more poorly-motivated than lots of other benchmarks measuring performance on similarly reasoning-y but real tasks.

I agree that Chollet is overly negative on LLMs, but what's the issue with the benchmark? It's a set of well-structured problems that any child could solve, but most current models fail miserably at. My only complaint about it is that he makes it impossible for leading LLMs to actually compete in it.

I agree the benchmark is a real contribution. My claim is that there are similarly novel-reasoning-heavy benchmarks that also look more like useful tasks, such as https://klu.ai/glossary/gpqa-eval

GPT-4.5, incrementally improved with higher-quality data, better fine-tuning, and LLMs trained to engineer effective prompts for specific tasks/subject areas, is already capable - with multimodal ability and integration with other tools and models - of automating huge amounts of currently extant labor.

Even if the ‘apocalyptic’ human extinction or paperclip maximizer Yudkowsky scenarios aren’t likely (and they never were), the significant economic effects of current-generation models are still only beginning to percolate.

paperclip maximizer

IMO if a few billion mostly-sapient humans can't even agree on goals and values, expecting an artificial intelligence to converge on one very specific metric (and I realize paperclips are meant abstractly here) seems doubtful. Possible, maybe, but I would be surprised.

But that particular example doesn't even have any humans unironically advocating for it, although the mental image of tossing a curveball "How many paperclips would your administration produce?" in the upcoming presidential debate is, IMO, hilarious.

[E]xpecting an artificial intelligence to converge on one very specific metric […] seems doubtful.

The framework under which the whole paperclip analogy was developed was a Yudkowskian framework in which the most powerful AIs would all be explicitly designed to maximize a certain objective function. In the original paperclip story, it’s a paperclip factory owner that has an AGI maximize the number of paperclips produced. The moral of the original story is thus most similar to the classic “be careful what you wish for” trickster genie tales.

But as we all very well know now, this framework which Yudkowsky spent over a decade elaborating upon is almost completely divergent from the current LLM-based methods that have yielded the powerful systems of today.

But as we all very well know now, this framework which Yudkowsky spent over a decade elaborating upon is almost completely divergent from the current LLM-based methods that have yielded the powerful systems of today.

Which of course very much does not mean that we're safe.

At this point if Yudkowsky says something, I accept that as weak evidence that the opposite is true.

I dismissed Yudkowsy as having anything useful to add when I listened to him on Brian Chau's podcast and learned that he has zero practical experience with AI.

This is supposed to be the thing you're most passionate about and concerned with, and you never even bothered to tinker around with PyTorch or something like it? IIRC he didn't even understand what Chau was talking about when he said PyTorch.

Imagine someone who makes their life's work opining on video games but they never actually played one, everything they know is based on second-hand knowledge and their own speculation.

Imagine someone who makes their life's work opining on video games but they never actually played one, everything they know is based on second-hand knowledge and their own speculation.

I am reminded of Anita Sarkeesian's initial Feminist Frequency videos where she claimed to have been playing games all her life then proceeded to make factually incorrect assertions about some games (Hitman, IIRC) rewarding misogynist behavior (murdering women) when those games instead disincentivized it.

Imagine someone who makes their life's work opining on video games but they never actually played one, everything they know is based on second-hand knowledge and their own speculation.

This concept made for a decently successful website though (Kotaku). (And I mean this literally and not just as a jab, that many of their writers often obviously had never even played the games they would express severe outrage about.)

The missing link toward AGI is simply the ability for the model to learn. Existing LLMs can learn in-context, but that learning will never equate to actual model retraining. We have also seen papers saying that finetuning is mostly effective for model alignment and not for adding new knowledge. Intelligent humans, on the other hand, have a continuum of memory from short term to long term.

AGI won't happen anytime soon because there aren't any research labs working on true learning. This is likely because it's difficult and also not directly useful for any industry tasks.

The next big thing I see is the development of anything-to-anything models using the technology demonstrated in Meta's Chameleon paper. If the technique indeed scales up, it will put diffusion and all other models out of business by taking over image/music/video generation, TTS, OCR, transcription, etc.

Perhaps the model has latent potential that will be unlocked by community finetunes, but if we were experiencing exponential progress, you would expect the models to get better, not worse.

Currently the "community" llm scene solely depends on the stuff the big players give out for free. Phones have been getting less featureful and more expensive for a decade now, but it's not because the underlying technology has regressed at all.

https://datasci101.com/wp-content/uploads/2023/08/llm-model-size--980x533.png

Between 2018 and 2023, the best models increased in size by 500x. They are not going to reach 700 trillion tokens by 2028, which would be required for another 500x increase. Performance seems to have increased linearly while training data and model size have increased exponentially; performance is correlated with the log of model size/training data.
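To illustrate what "performance correlated with the log of model size/training data" implies, here's a toy calculation with invented numbers: under a log-linear assumption, every further 500x scale-up buys the same absolute gain.

```python
import math

def score(scale: float, base: float = 50.0, slope: float = 5.0) -> float:
    """Hypothetical benchmark score under a log-linear scaling assumption."""
    return base + slope * math.log10(scale)

for scale in (1, 500, 500**2, 500**3):
    print(f"{scale:>12,}x compute -> score {score(scale):.1f}")
```

Each additional 500x of scale adds the same fixed increment to the (made-up) score, which is why exponential input growth only ever produces linear-looking progress.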

What we will see is specialized AIs and AIs trained on specific data for specific tasks. These will be more useful than a one-size-fits-all AI that is supposed to serve all purposes. With more curated data and better quality control, useful AIs can be made to replace certain groups of workers: for example, a patent application AI, an AI for transcribing medical notes, a CRUD-app programming AI, etc.

What we will see is specialized AIs and AIs trained on specific data for specific tasks.

People have been predicting this for half a decade and it isn't what happened. Instead, we got LLMs trained on all kinds of text data that generalize across the different kinds, and now we're getting omnimodal models trained on text, images, video, etc. Why not just train your big model on all the specific data from different tasks together?

As someone with no technical background, the 'obvious' play seems to be hooking up models with different specialty training and abilities to each other in a way that they can talk to each other natively and without too much latency. Each model can be a different 'lobe' of an overall more intelligent brain.

Which is of course somewhat similar to how the human cognitive system works.

Maybe ChatGPT is the 'narrative' module that coordinates everything. "Oh, you're asking for solutions to a complex math problem, better send that over to the higher maths module. I'll let you know what answer it produces." or "Ah this question pertains to reading an X-ray and rendering medical opinions, better shoot that over to the model trained on millions of patient records and actual doctors giving feedback."

I dunno, really seems like they haven't picked all the low hanging fruit just yet.
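Something like the following, as a deliberately crude sketch; the "specialists" are placeholder functions, and a real system would route between separately trained models (the routing itself being the hard part, as the reply below notes):

```python
from typing import Callable, Dict

def math_specialist(q: str) -> str:
    return f"[math model answers: {q}]"

def medical_specialist(q: str) -> str:
    return f"[radiology model answers: {q}]"

def general_chat(q: str) -> str:
    return f"[general model answers: {q}]"

SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "math": math_specialist,
    "xray": medical_specialist,
}

def route(question: str) -> str:
    """Crude keyword router standing in for the 'narrative module' that decides
    which lobe should handle the request."""
    q = question.lower()
    for keyword, handler in SPECIALISTS.items():
        if keyword in q:
            return handler(question)
    return general_chat(question)

print(route("Solve this math problem: integrate x^2"))
print(route("Does this xray show a fracture?"))
```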

It's definitely an approach that many have been taking, and it's likely to be significant for the next couple years.

A major issue with it is that general intelligence often needs to answer questions that cut across multiple domains. You route a question to one module, and it turns out another module would be better for it; or, more fundamentally, you need a module that incorporates elements of both, and either one individually (or both combined with a module meant to synthesize them) is inferior to the ideal result. By analogy, suppose you want to do sophisticated analysis on millions of X-rays: although you could probably get by with a team of experts (e.g. a radiologist and a statistician), the results will probably be inferior to those of an expert radiologist who is also an expert statistician.

Chollet has been skeptical of LLMs for a while (predating GPT-3); this isn't a recent thing for him, and he's not, IIRC, ever been an apocalyptic type. And Stable Diffusion's woes are the result of external regulatory forces, not inherent architectural limitations or costs.

So where are things? I think the frontier labs are in a consolidation phase: what exists now is entirely capable of having massive economic effects. It's not going to prove the Riemann hypothesis, but it can replace vast amounts of low-level intellectual work that's based on pattern matching and stitching together bits and pieces present in training data. This ranges from call center workers to the bread and butter work doctors, attorneys, and SWEs do. There's an extraordinary amount of money in even that. So that's what they focus on: making models cheaper to deploy and more consistent, product integration, and satisfying public opinion and regulators (and ideally regulatory capture to prevent upstarts from breaking into the market). Look at OAI's job postings, and this is clear (though to be fair, top level researchers are not being recruited via OAI's career portal).

There are lots of potential improvements even in published research papers; who knows if they'll scale. The leading labs will experiment with the most promising ones, but giant leaps in capability are not critical to success, except if a competitor lab stumbles on one before they do.

For those who worry about an eldritch abomination being released on the world, this is a good thing. If you're worried about a Great Replacement by AI slop, or if you were hoping for a rapid jump toward a post-scarcity utopia, not so much.

I agree with this take. One interesting open question is: what fraction of the wealth created will the big labs be able to capture? Right now the differences between the big LLMs are relatively small, so it seems plausible to me that being an LLM provider becomes a low-margin, Coke-vs-Pepsi kind of market. However, there could easily be some winner-takes-all dynamics or service-bundling dynamics that favor e.g. Google, since they might be able to integrate across services better. I'm curious for your thoughts.

Speculative:

Most of the economic value of LLMs will come from enterprise, not consumers. For that, there's a variety of factors beyond cost-to-serve that will drive centralization to a few large providers. See the move to cloud in the 2010s. E.g. HIPAA, privacy, "no one ever got fired for buying IBM," reliability, access to large and growing proprietary data sets, custom hardware for specialized workloads, complex data residency requirements.

Google is much better positioned for this than many people give it credit for. OAI will have to learn to deal with a bunch of stupid shit ("what, the government of Saudi Arabia is demanding we set up a data center in the country if we want to operate there? And integrate with a bespoke IAM system to allow the government to view user data?"), and they're just starting to run into these issues. Their saving grace is their relationship with MSFT, which has solved all of these problems; otherwise they'd be dead in the water.

As for the wealth captured: more than they do now, but not everything.

Since the internet became a thing, an easy way of making money has been to create CRUD apps (create, read, update, and delete). Essentially, make a program that takes some info, stores it in a database, retrieves it, updates it, and deletes it. People have made fortunes making programs for hotels, programs for hairdressers, programs for libraries, etc. Making a program for gyms isn't that hard; it's just a bunch of forms that are filled out and stored in a database. Get a thousand gyms paying 99 dollars a month for your software package and you are now a rich man.
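For reference, the entire CRUD pattern fits in a few lines; this is a toy sketch with made-up table and column names (a gym-membership record), using nothing but the Python standard library:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE members (id INTEGER PRIMARY KEY, name TEXT, plan TEXT)")

# Create
con.execute("INSERT INTO members (name, plan) VALUES (?, ?)", ("Alice", "monthly"))
# Read
print(con.execute("SELECT * FROM members").fetchall())
# Update
con.execute("UPDATE members SET plan = ? WHERE name = ?", ("annual", "Alice"))
# Delete
con.execute("DELETE FROM members WHERE name = ?", ("Alice",))
con.commit()
```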

AI will open up similar markets. Take an open-source model, hire a bunch of people in India to read through tens of thousands of home inspection protocols, and use it to create a home-inspection-report AI. Sell it to a thousand home inspectors for 99 dollars a month and you have a million-dollar business. Do the same with everything from an AI that can tell if bread is baked to one that can tell if a tooth needs extracting based on an X-ray.

The base models will do it. There isn’t a separate word processor for accountants, one for lawyers, one for bankers, and one for doctors. Everyone uses Microsoft Word. A handful of foundational models which are largely interchangeable will eventually be tuned on all common scenarios and available as part of your (or your employer’s) $20 a month per person MS/Google/Apple subscription.

The only reason a lot of the current SaaS market exists is that the 90s made conglomerates unfashionable, and big tech didn't want to hire 500,000 engineers to make software for every business when they were already facing antitrust scrutiny. But LLMs offer the ability to build more general products capable of fulfilling business needs without major workforce expansion or other costs.

Reminiscent of Dr. Evil: "Here's the plan, we use AI to write home inspection reports, and we hold the market captive for... ONE MILLION DOLLARS."

Sure, do that a thousand times and you get to a billion. But this is all chump change. And when you're LoRA-ing your Llama, whose GPUs are you training on? Do you have a cluster in your basement, or are you calling some API hosted by a big provider? And when you're generating the home inspection reports, where are they actually coming from? Even if you're able to do that all yourself, can you do it faster than someone who takes some VC money and writes a big check to a big provider to GTM faster?

None of that is to say that there's not money to be made; there is. But a substantial chunk of it will accrue to centralized providers.