This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.
Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.
We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:
- Shaming.
- Attempting to 'build consensus' or enforce ideological conformity.
- Making sweeping generalizations to vilify a group you dislike.
- Recruiting for a cause.
- Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.
In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:
- Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.
- Be as precise and charitable as you can. Don't paraphrase unflatteringly.
- Don't imply that someone said something they did not say, even if you think it follows from what they said.
- Write like everyone is reading and you want them to be included in the discussion.
On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.
From 5 hours ago: A complex problem that took microbiologists a decade to get to the bottom of has been solved in just two days by a new artificial intelligence (AI) tool.
Slowly it's becoming clear that ASI is already with us. Imagine if you handed someone from 100 years ago a smartphone or modern networking technology. Even after explaining how it worked, it would take them some time to figure out what to do with it. It took a long time after we invented wheels to figure out what to do with them, for example.
The technology to automate 80-90% of white-collar labor already exists, for example, with current-generation LLMs. It's just about interfaces and layers and regulation and safeguards now. All very important, of course, but it's not a fundamental technical challenge.
I'm skeptical this is actually how it went down. Why would it take 2 days to come up with the hypotheses? I'm not aware of any LLM that thinks that long, which to me implies the scientist was working with it throughout and probably asking leading questions.
It looks like co-scientist is one of the new "tree searching agent" models: you give it a problem and it will spin off other LLMs to look into different aspects of the problem, then prune different decision trees and go further with subsequent spinoff LLMs based on what those initial agents report back, recursing until the original problem is solved. This is the strategy that was used by OpenAI in their "high-compute o3" model to rank #175 vs humans on Codeforces (competitive coding problems), pass the GPQA (Google-proof Graduate-level Q&A), and score 88% on ARC-AGI (vs. human STEM graduates' 100%). The recursive thought process is expensive: the previous link cites a compute cost of $1,000 to $2,000 per problem for high-compute o3, so these are systems that compute on each problem for much longer than the 35 seconds available to consumer ($20/month) users of o1.
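For intuition, here's a minimal sketch of what that kind of branch-score-prune-recurse loop might look like. This is purely illustrative: `ask_llm` and all the parameters are hypothetical stand-ins, not anything from Google's or OpenAI's actual systems.

```python
# Hypothetical sketch of a "tree-searching agent" loop.
# `ask_llm` is a stand-in for whatever model API the real system uses;
# none of these names or parameters come from the article.

def ask_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model call here")

def solve(problem: str, depth: int = 0, max_depth: int = 3, branch: int = 4) -> str:
    # Ask the model to propose several distinct lines of attack.
    proposals = [
        ask_llm(f"Propose hypothesis #{i + 1} for: {problem}")
        for i in range(branch)
    ]
    # Score each proposal with a separate call, then prune to the most promising half.
    scored = sorted(
        proposals,
        key=lambda p: float(ask_llm(f"Rate 0-10 how promising this is: {p}")),
        reverse=True,
    )
    best = scored[: max(1, branch // 2)]
    if depth >= max_depth:
        # Out of budget: synthesize an answer from the surviving branches.
        return ask_llm(f"Given these leads, answer '{problem}': {best}")
    # Recurse on each surviving branch, then combine the partial answers.
    results = [
        solve(f"{problem}\nFocus on: {b}", depth + 1, max_depth, branch)
        for b in best
    ]
    return ask_llm(f"Combine these partial answers to '{problem}': {results}")
```

The cost quoted above follows directly from this structure: the number of model calls grows roughly as branch^depth, so a deep search over a hard problem can easily burn thousands of dollars of compute.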
Thanks, that's good information. Still, I don't believe it would actually take two days straight to work through the problem, which indicates follow-up questions etc.
Doesn't sound like it.
It's possible, if it's running on Google servers, that the request was somehow queued up or prepared in advance, at least during the testing phase, which they appear to have been invited to.
The thought occurred to me as well. Between 'no one else has our data!' and 'we didn't realize someone else might have our data,' I tend to default to the latter.
That's not relevant to what 2rafa said. Her point is that co-scientist may have taken two days to serve earlier queries and only then gotten to this query.
I came of age right as the Internet was taking off. But I've started watching classic movies and TV, and I think the "information at my fingertips" effect is something that has happened so gradually that we don't really appreciate its impact fully, even pre-LLM. One recent TV episode from the '90s had one character tell another to travel to the state capital to find and photocopy dead-tree legal references, a task expected to take a day. My world today is radically different in a number of ways:
State laws are pretty easily accessible via the internet. I'm not sure how the minutiae of laws were well known back then. Are our laws themselves different (or enforced differently) because the lay public can be expected to review, say, health code requirements for a restaurant?
Computerized text is much more readily searchable. If I have a very specific question, I can find key words with ctrl-f rather than depending on a precompiled index. The amount of information I need to keep in my brain is no longer things like exact quotes, just enough to find the important bits back quickly. The computer already put a bunch of white-collar workers out of jobs, just gradually: nobody needs an army of accountants with calculators to crunch quarterly reports. Or humans employed to manually compute solutions to math problems.
The Internet is now readily accessible on the go. Pre-iPhone (or maybe Blackberry), accessing Internet resources required (I remember this) finding a computer. So the Internet couldn't easily settle arguments in real conversation. The vibe is different, and at least in my circles, it seems like the expectation of precision in claims is much higher. IRL political arguments didn't go straight to citing specific claims in quite the same way.
I sometimes feel overwhelmed trying to grasp the scope of the changes even within my own lifetime, and I find myself wondering things like what my grandfather did day-to-day as an engineer. These days it's mostly a desk job for me, but I don't even know what I'd be expected to do if you took away my computer: it'd be such a different world.
Maybe I'm misunderstanding the question, but laws are organized into books that are indexed. You look up the relevant statute, search for the right section, and then read a few paragraphs describing the law. If you need to know the details of case law, you consult a lawyer. Lawyers go to law school and read relevant cases to know how judges are likely to rule on similar future cases.
You still need lawyers to do this because ctrl-f doesn't return a list of all the relevant legal principles from all the relevant cases.
There also has been a massive explosion in the number and complexity of laws since the word processor was invented.
This is, I think, the answer I was looking for. Ctrl-F doesn't find everything (I've had to search non-indexed dead-tree books before), but it's a huge force multiplier.
To me, this is impressive, but not that impressive: sure, it answered the question, but it didn't pose the question. In the same way, LLMs are decent at writing code, but have ~no ability to decide what to write. You can't just point them at your codebase and a bunch of email threads from PMs and hope they write the right thing.
I don't know how many plausible hypotheses there are for the question it solved, or how hard it is to generate them, but it's surely much easier than looking at the state of the field as a whole and coming up with a new idea for which to generate hypotheses.
AI is a junior engineer.
It's actually far worse than that. LLMs are a junior engineer who cannot learn. The reason that we put up with junior engineers and invest effort into training them is because they will learn, stop making junior mistakes, and someday be productive enough that they pay off the effort of training them. But you can't do that with an LLM. Even after 2-3 years of development, they still suck at writing code. They still hallucinate things because they have no understanding of what they are doing. They still can't take on your feedback and improve for the next time you ask a question.
If LLMs were as capable as a junior engineer, that wouldn't be all bad. But they're actually less capable. Of course people aren't impressed.
I agree completely.
I see you've met my coworker.
Hoo boy. Speaking as a programmer who uses LLMs regularly to help with his work, you're very, VERY wrong about that. Maybe you should go tell Google that the 20% of their new code that is written by AI is all garbage. The code modern LLMs generate is typically well-commented, well-reasoned, and well-tested, because LLMs don't take the same lazy shortcuts that humans do. It's not perfect, of course, and not quite as elegant as an experienced programmer can manage, but that's not the standard we're measuring by. You should see the code that "junior engineers" often get away with...
I use AI a lot at work. There is a huge difference between writing short bits of code that you can test or read over and see how it works, and completing a task with a moderate level of complexity or one where you need to give it more than a few rounds of feedback and corrections. I cannot get an AI to do a whole project for me. I can get it to do a small, easy task where I can check its work. This is great when it's something like a very simple algorithm that I can explain in detail but it's in a language I don't know very well. It's also useful for explaining simple ideas that I'm not familiar with and would otherwise have to look up and spend a lot of time finding good sources for. It is unusable for anything much more difficult than that.
The main problem is that it is really bad at developing accurate complex abstract models for things. It's like it has memorized a million heuristics, which works great for common or simple problems, but it means it has no understanding of something abstract, with a moderate level of complexity, that is not similar to something it has seen many times before.
The other thing it is really bad at is trudging along and trying and trying to get something right that it cannot initially do. I can assign a task to a low-level employee even if he doesn't know the answer and he has a good chance of figuring it out after some time. If an AI can't get something right away, it is almost always incapable of recognizing that it's doing something wrong and employing problem solving skills to figure out a solution. It will just get stuck and start blindly trying things that are obviously dead-ends. It also needs to be continuously pointed in the right direction and if the conversation goes on too long, it keeps forgetting things that were already explained to it. If more than a few rounds of this go on, all hope of it figuring out the right solution is lost.
Thanks, it's clear that (unlike the previous poster, who seems stuck in 2023) you have actual experience. I agree with most of this. I think there are people working on giving LLMs some sort of short-term memory for abstract thought, and also on making them more agentic so they can work on a long-form task without going off the rails. But the tools I have access to definitely aren't there yet.
So, yeah, I admit it's a bit of an exaggeration to say that you can swap a junior employee's role out with an LLM. o3 (or Claude-3.5 Sonnet, which I haven't tried, but which does quite well on the objective SWE-bench metric) is almost certainly better at writing small bits of good working code - people just don't understand how horrifically bad most humans are at programming, even CS graduates - but is lacking the introspection of a human to prevent it from doing dangerously stupid things sometimes. And neither is going to be able to manage a decently-sized project on their own.
I'm a programmer too, and I'm perfectly willing to tell Google that their 20% code is garbage. Honestly you shouldn't put them on a pedestal in this day and age, we are long past the point where they are nothing but top tier engineers doing groundbreaking work. They are just another tech company at this point, and they sometimes do stupid things just like every other tech company does.
If you are willing to accept use of a tool which gives you code that doesn't even work 10% of the time, let alone solve the problem, that's your prerogative. I say that such a tool sucks at writing code, and we can simply agree to disagree on that value judgement.
The vast majority of that "code being written by AI" at Google is painfully trivial stuff. We're not talking writing a new Paxos implementation or even a new service from scratch. It's more, autocomplete the rest of the line "for (int i = 0"
This is exactly @jeroboam's point - you say "AI is a junior engineer" as if that's some sort of insult, rather than unbelievably friggin' miraculous. In 2020, predicting "in 2025, AI will be able to code as well as a junior engineer" would have singled you out as a ridiculous sci-fi AI optimist. If we could only attach generators to the AI goalposts as they zoom into the distance, it would help pay for some of the training power costs... :)
It's weird and surprising that current AI functions differently enough from us that it's gone superhuman in some ways while remaining subhuman in others. We'd all thought that AGI would be unmistakable when it arrived, but the reality seems to be much fuzzier than that. Still, we're living in amazing times.
"We" should have read Wittgenstein to predict this. LLMs can speak, but we can't understand them.
People really need to get over anthropomorphism until we actually understand how humans work.
The question was "why are some bacteria resistant to antibiotics", i.e. one of the most important questions in medicine.
On the one hand, wow, that's very, very impressive.
On the other hand, skepticism and my prior of "nothing ever happens and especially not with LLMs" makes me ask: was that literally the question? Do you have a source? I am very much not a biologist, but that is surprisingly/impressively broad.