@DaseindustriesLtd comments on "Culture War Roundup for the week of February 17, 2025"

Culture War Roundup for the week of February 17, 2025

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

Shaming.
Attempting to 'build consensus' or enforce ideological conformity.
Making sweeping generalizations to vilify a group you dislike.
Recruiting for a cause.
Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.
Be as precise and charitable as you can. Don't paraphrase unflatteringly.
Don't imply that someone said something they did not say, even if you think it follows from what they said.
Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.


DaseindustriesLtd late version of a small language model 1mo ago

Once again I notice that I am usually right in being rude to people, as their responses demonstrate precisely the character flaws inferred. This is a low-content post in defense of wounded ego, with snappy Marvel-esque one-liners («Won a Nobel prize») and dunks optimized for the audience but next to no interest in engagement on the object level. Yes, ML != LLMs, so what? Are you not talking about Altman and Elon who both clearly put their chips on LLMs? «That was a joke», yeah I get your jokes. Here's one you missed:

Data engineering is needed and important! But it's not revolutionary. Data engineering is the same as it's always been.

It's not the same, though; that's the thing. Returning to the point that upset you –

Fetishizing algorithmic design is, I think, a sign of mediocre understanding of ML, being enthralled by cleverness. Data engineering carves more interesting structure into weights.

– I meant concretely that this is why leading companies now prioritize creation of training signal sources, that is: datasets themselves (filtered web corpora, enriched and paraphrased data, purely synthetic data, even entirely non-linguistic data with properties that induce interesting behaviors), curricula of datasets, model merging and distillation methods, training environments and reward shaping – over basic architecture research, in terms of non-compute spend and researcher hours; under the (rational, I believe) assumption that this has higher ROI for the ultimate goal of reaching "AGI", and that its fruit will be readily applicable to whatever future algorithmic progress may yield. This goes far beyond ScaleAI's efforts in harnessing Pinoy Intelligence to annotate samples for RLHF, and you have not even bothered to address any of this. If you think the names of Old Titans are a valid argument, I present Hutter as someone who Gets It: at our stage, what you have a sufficiently general architecture approximate is more interesting, in terms of eventual structure, than how you achieve that potential generality.

This older paper is a neat illustration too. Sohl-Dickstein and Metz have done a little bit of work in non-LLM algo design if you recall, maybe you'll recognize at least them as half-decent scientists.

Now, as regards poor taste in intellectual disagreements, let's revisit this:

Regardless of whether transformers are a dead-end or not, the current approach isn't doing new science or algo design. It's throwing more and more compute at the problem and then doing the Deepseek approach of finetuning the assembly-level GPU instructions to exploit the compute even better so you can throw more compute at it. I doubt Hinton, Goodfellow, LeCun, Schmidhuber et al. have any desire to do that. Maybe if xAI did something revolutionary like leave the LLM space or introduce a non-MoE-Transformer model for AGI, then talent of that caliber might want to work there. Currently they exist so Elon can piss all over Altman.

My rudeness was not unprovoked; it was informed by the bolded parts. I saw it as a hubristic, elitist, oblivious, tone-deaf insult towards people – scientists – actually moving the field forward today, rather than 8 or 28 years ago, and I do not care that it's slightly obfuscated or that you lack self-awareness to recognize the douche in the mirror but are eager to chimp out at it as you currently do.

Likewise my entire point, before you jumped in to insult me, is that the Big Names in ML/AI are "fetishy algo freaks". They shockingly don't want to do non "mediocre algo butt sniffing" work. And Data Engineering isn't new, it isn't revolutionary, it's great, it works well, but it doesn't require some 1% ML researcher to pull it off. It requires a solid engineering team, some technical know-how, and a willingness to get your hands dirty. But no one is going to get famous doing it. It's an engineering task, not a research task. And since research tasks are what people pay the ludicrously big bucks for at tech companies, the engineers at xAI aren't being paid some massive king-sized salary...

Yes, thanks for the clarification; that's exactly how I understood you.

I claim that to the extent that «talent of that caliber» shares your conceit that design of clever new algorithmic primitives for ANNs is «exciting new science» whereas data work remains and will remain ScaleAI-tier «mere data engineering, same as always», this talent is behind the times, too set in its ways, and is resting on its laurels; indeed this is the same high-level philosophical error of prizing manual structure design over simplicity, generality and scalability that keeps repeating in every revolution in AI, and that Sutton has famously exposed. They are free to work on whatever excites them, publish cute papers for fellow aficionados where they beat untuned mainstream baselines, or just leave the frontlines altogether, and even loudly assert that they have superior taste if they so choose, which in my view is just irrational fetishism plus inflamed ego; I think taste is to be calibrated to actual promise of directions. But indeed, what do I know. You are free to share their presumptions. New scientific talent will figure it out.

Seems like you are asking us to praise ignorance over discovery?

To me it seems like the opposite, we just disagree on what qualifies as discovery or science at all, due to differences in taste.

As an exercise, can you tell me THE engineer at Deepseek who proposed or wrote their Parallel Thread Execution(PTX) code with a citation?

Egoists gonna be egoists.

Zhean Xu, probably. But I think everyone on the (Chenggang Zhao and Shangyan Zhou and Liyue Zhang and Chengqi Deng and Zhean Xu and Yuxuan Liu and Kuai Yu and Jiashi Li and Liang Zhao) list could ask for megabuck total comp at a frontier lab now and expect an affirmative response.


YoungAchamian DaseindustriesLtd 29d ago · Edited 28d ago

(Edit) After some thought, I decided to tone down my dismissive vitriol and maybe offer a more constructive response.

Despite what you might think, I don't have unlimited free time/brain power to engage in high-effort debate with random people online; I'm a shape-rotator, not a word-cell. Particularly since debating people online rarely leads to any information exchange or substantive opinion change. As such, I apply a heuristic when having a discussion online to decide whether my interlocutor is worth it. Needless antagonism, unfounded arrogance, pithy insults and pettiness are the typical markers that it's not. People who don't engage charitably and treat discussion as some sort of mal-social debate team competition, where anything goes, doubly so.

Dase, you tripped all of the above. To my chagrin, I snapped back, which was unbefitting of my expectations for myself. If you want people to engage with you substantively, with high-information-density conversation, you have to give them a reason to put the effort in. If you write only for extreme heat with a disproportionately small amount of light, then no one reasonable is going to engage with you. Maybe that is to your taste; who am I to judge pigs that want to roll in the mud. Regardless, I have better uses of my time than getting into the sty with you.

Food for thought: ML != LLMs, if your comment here:

was changed to this:

Fetishizing algorithmic design is, I think, a sign of mediocre understanding of LLMs, being enthralled by cleverness. Data engineering carves more interesting structure into weights.

Then it is far more applicable to the evidence you have provided and, honestly, I think it is the topic you actually care about. I might even agree; however, the original doesn't align with the reality of ML as a field across ALL domains. But who knows, maybe my attempt at being charitable here will go nowhere, you'll double down on being an ass, and I'll update my weights with finality on the pointlessness of engaging with you in the future.

Have a good one.

DaseindustriesLtd late version of a small language model YoungAchamian 27d ago

That's fine, I don't feel entitled to your time at all. I also can't predict what might trigger you, just like you cannot predict what would trigger me, nor does it seem like you would care.

The discussion was originally about labs overwhelmingly focused on LLMs and competing for top talent across the whole ML industry, so that was partly just me speaking loosely.

I do in fact agree with heads of those labs and most star researchers they've got that LLMs strikingly similar to what was found in 2017 will suffice for the shortest, even if not the globally optimal route to “AGI” (it's an economic concept now anyway, apparently). But it is fair that in terms of basic research there are bigger, greener pastures of intellectual inquiry, and who knows - maybe we will even find something more general and scalable than a well-designed Transformer there. Then again, my view is that taste is to be calibrated to the best current estimate of the promise of available directions, and in conjunction with the above this leads me to a strong opinion on people who dismiss work around Transformers, chiefly work on training signal sources that I've covered above, as “not new science”. Fuck it, it is science, even if a bit of a different discipline. You don't own the concept, what is this infuriatingly infantile dick-measuring?

It's not so much that I hold non-LLM, non-Transformer-centric algo design work in contempt as that I am irritated by its practitioners' smug, egocentric condescension towards what I see as the royal road. Contrarianism, especially defensive contrarianism, is often obnoxious.
