Culture War Roundup for the week of March 31, 2025

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.


What do you mean by "reference NIST"? I think I've already mentioned that despite its internal chain of thought claiming to reference NIST or "look up" sources, it's not actually doing that; it had no access to the internet. I suspect that's an artifact of the way it was trained, and regardless, the CoT, while useful, isn't a perfect rendition of inner cognition. When challenged, it apologizes for misleading the user and says that it was a loose way of saying it was racking its brains, trying to find the answer in the enormous amount of latent knowledge it possesses.

I also find it very interesting that the model that couldn't use code to run its calculations got a very similar answer. It did an enormous amount of algebra and arithmetic, and there was every opportunity for hallucinations or errors to kick in.

For the first calculation dump at least, it comes up with a value of 6.63 × 10⁸ s⁻¹, then compares it to the expected value from the NIST Atomic Spectra Database, 1.6725 × 10⁸ s⁻¹, then spends half the page trying to reconcile the difference, before giving up and proceeding with the ASD value.
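For what it's worth, the size of the mismatch is easy to check by hand. A quick sketch (the two figures are the ones quoted above; nothing else is assumed):

```python
# Compare the model's first-principles estimate with the value it
# recalled from the NIST Atomic Spectra Database (both in s^-1).
first_principles = 6.63e8    # value the model derived from its own algebra
nist_asd = 1.6725e8          # value it attributed to the NIST ASD

ratio = first_principles / nist_asd
print(f"ratio = {ratio:.3f}")  # ≈ 3.964, i.e. off by roughly a factor of 4
```

So the two values disagree by roughly a factor of four, not by orders of magnitude, which makes it a bit more understandable that the model spent half a page trying to reconcile them before deferring to the ASD figure.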

Hmm. I think that's likely because my prompt heavily encouraged it to reason and calculate from first principles. It's a good thing that it noticed those attempts didn't align with pre-existing knowledge and accurately recalled the relevant values, which must make up a vanishingly small fraction of its training data.

At the end of the day, what matters is whether the model outputs the correct answer. It doesn't particularly matter to the end user whether it came up with everything de novo, remembered the correct answer, or looked it up. I'm not saying this can't matter at all, but if you asked me or 99.999% of the population to answer this problem from memory, we'd be rather screwed.

Thanks for the suggestion and for looking through the answer. I've personally run up against the limits of my own competence, and there are few things I can ask an LLM to do that I can't do myself, while still verifying the answer.

At the end of the day, that's not really what matters, because nobody needs to solve a physics problem with a known solution. A good portion of the tests I had as an undergraduate and in graduate school were open book, because simply knowing a formula or being able to look up a particular value wasn't sufficient to answer the problem. If I want a value from NIST, I can look it up. The important part is being able to correctly engage in the type of problem solving needed to answer questions that haven't ever been answered before.

I've had some thoughts about what it actually means to be able to do "research level" physics, which I'm still convinced no LLM can actually do. I've thought about posing a question as a top-level post, but I'm not really an active enough user of this forum to do that and don't want to become one.

Finally, I want to say that for the past 18 months, I've continually been getting solicitations on LinkedIn to solve physics problems to train LLMs. The rate they offer isn't close to enough to make it worth it for me, even if I had any interest, but it would probably seem great to a grad student. I wouldn't be surprised if these models have been trained on more specific problems than we realize.