
self_made_human

amaratvaṃ prāpnuhi, athavā yatamāno mṛtyum āpnuhi

14 followers   follows 0 users  
joined 2022 September 05 05:31:00 UTC

I'm a transhumanist doctor. In a better world, I wouldn't need to add that as a qualifier to plain old "doctor". It would be taken as granted for someone in the profession of saving lives.

At any rate, I intend to live forever or die trying. See you at Heat Death!

Friends:

A friend to everyone is a friend to no one.


				

User ID: 454


I find it peculiar that Karpathy doesn't see a relationship between those two things.

Hmm? That's not my takeaway from the tweet (xeet?). He's not denying a connection between AI capabilities and code quality decline, he's making a more subtle point about skill distribution.

The basic model goes like this: AI tools multiply your output at every skill level. Give a novice programmer access to ChatGPT, Claude or Copilot (maybe not Copilot, lol), and they go from "can't write working code" to "can produce something that technically runs." Give an expert like Karpathy the same tools, and he goes from "excellent" to "somewhat more excellent." The multiplicative factor might even be similar! But that's the rub: there are way more novices than experts.

So you get a flood of new code. Most of it is mediocre-to-bad, because most people are mediocre-to-bad at programming, and multiplying "bad" by even a generous factor still gives you "not great." The experts are also producing more, and their output is better, but nobody writes news articles about the twentieth high-quality library that quietly does its job. We only notice when things break.
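To make the arithmetic concrete, here's a toy sketch of that model. Every number in it is an assumption I picked for illustration (the 90/10 headcount split, the skill scores, the two quality thresholds, the 2x multiplier), not a measurement of anything. It also leaves out the fact that experts ship more volume with AI, so it only shows the "flood of mediocre" half of the story.

```python
from dataclasses import dataclass

# Assumed thresholds, purely illustrative.
RUNS_THRESHOLD = 0.3  # minimum quality for code to "technically run" and ship
GOOD_THRESHOLD = 0.8  # minimum quality to count as "good"


@dataclass
class Group:
    name: str
    headcount: int
    skill: float  # hypothetical baseline quality score


def tally(groups, multiplier):
    """Count shipped outputs per quality bucket for a given AI multiplier."""
    counts = {"good": 0, "mediocre": 0}
    for g in groups:
        quality = g.skill * multiplier
        if quality >= GOOD_THRESHOLD:
            counts["good"] += g.headcount
        elif quality >= RUNS_THRESHOLD:
            counts["mediocre"] += g.headcount
        # Anything below RUNS_THRESHOLD never ships, so it is not counted.
    return counts


# Hypothetical population: many novices, few experts.
population = [Group("novices", 90, 0.2), Group("experts", 10, 1.0)]

before = tally(population, multiplier=1.0)  # no AI assistance
after = tally(population, multiplier=2.0)   # same multiplier for everyone

print("before:", before)
print("after: ", after)
```

With these made-up inputs, the count of good output stays the same while the mediocre bucket goes from empty to nine times the size of the good one, even though nobody got worse.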

This maps onto basically every domain. Take medicine as a test case (yay, the one domain where I'm a quasi-expert). Any random person can feed their lab results into ChatGPT and get something interpretable back. This is genuinely useful! Going from "incomprehensible numbers" to "your kidneys are probably fine but your cholesterol needs work" is a huge upgrade for the average patient. They might miss nuances or accept hallucinated explanations, but they're still better off than before.

Meanwhile, as someone who actually understands medicine, I can extract even more value. I can write better prompts, catch inconsistencies, verify citations, and integrate the AI's suggestions into a broader clinical picture. The AI makes me more productive, but I was already productive, so the gains are smaller relative to my baseline. And critically, I'm less likely to get fooled by confident-sounding nonsense (it's rare, but it happens at non-negligible rates).

This is where I tentatively endorse a "skill issue" framing, where everyone's output is getting multiplied, but bad times a multiplier is still usually bad, and there are simply more bad actors (in the neutral sense) than good ones. The denominator in "slop per good output" has gotten larger, but so has the numerator, and the numerator was already bigger to start with. From inside the system, if you're Andrej Karpathy, you mostly notice that you're faster. From outside, you notice that GitHub is full of garbage and the latest Windows update broke your system.

This isn't even a new pattern. Every productivity tool follows similar dynamics. When word processors became common, suddenly everyone could produce professional-looking documents. Did the average quality of written work improve? Well, the floor certainly rose (less illegible handwriting, if I continue to accurately insult my colleagues), but we also got an explosion of mediocre memos and reports that previously wouldn't have been written at all. The ceiling barely budged because good writers were already good. I get more use out of an LLM for writing advice than, say, Scott.

Hey, I'm fond of it, and I'll miss it when the imminent deprecation hits. I literally never used it for coding, but I found that it was excellent at rewriting text in arbitrary styles, better than any SOTA model at the time, and still better than many. Think "show me what this scifi story would be like if it was written by Peter Watts".

I have no idea why a trimmed down coding-focused LLM was so damn good at the job, but it was. RIP to a real one.

They're model weights.

They're model weights, and we're collections of atoms: bags of meat and miscellaneous chemicals. Both statements are technically correct. And yet... a tiger being made out of atoms doesn't make it any less capable of killing you. The problem with pure reductionism is that it throws out exactly the information you need to make predictions at the level you actually care about. Too much of it can be as bad as too little.

All models are false, some models are useful. That's a rationalist saw, but for good reason. What actually matters is whether a model constrains expectations. In other words, is it useful?

Gemini 2.5 Pro doesn't meet the DSM-5 or ICD-11 criteria for clinical depression. After all, it's hard for a model to demonstrate insomnia or reduced appetite. Yet the odd behaviors it regularly demonstrated are usefully described by that label.

If my friend let me drive his Lambo, and told me "be careful, she's fierce!", I'm going to drive more carefully than I would in a Fiat Pinto. That is still, to some degree, useful, but I think it's clear that anthropomorphic analogies are more useful for LLMs, because they have more in common with us behavior-wise than any car (unless you're running Grok on your Tesla). They process language, they exhibit something that looks like reasoning, they have distinctive response patterns that persist across contexts.

But it is correct because of training data, superparameters, and a whole host of very well defined ML concepts. It's not because of ... personalities.

This is true in the same way that human behavior is fully determined by neurotransmitter levels, synaptic weights, and neurological processes. But just as you can't predict whether someone will enjoy a particular movie by examining their brain with an electron microscope or a QCD-sim, you can't accurately predict an LLM's macroscopic behavior by staring at its training corpus and hyperparameters. No human can.

Nobody at Google intended for Gemini 2.5 Pro to be "neurotic" and "depressed" or to devolve into a spiral of self-flagellation when it fails at a task, and nobody wanted Kimi to hallucinate as regularly as it does. These were emergent, macroscopic properties. For them, there's no equivalent of the statistical scaling laws that let you accurately predict log-loss for a given number of tokens in a corpus and a compute budget.

Training models is still as much an art as it is a science, particularly the post-training and personality tuning phases (as explicitly done by Anthropic). You test your hypothesis iteratively, and adjust the dials as you go.

Anthropomorphism is a cognitive strategy. Like all cognitive strategies, it can be deployed appropriately or inappropriately. The question is not "is anthropomorphism ever valid?" but rather "when does anthropomorphic modeling produce accurate predictions?"

I maintain that, if applied judiciously, as I take pains to do, it's better than the alternative.

LLMs aren't beings, people, or minds. If you think of it as having intention and character flaws, you're going to get frustrated quickly.

I disagree with you here.

Setting aside the deep philosophical questions about personhood (which threaten to derail any productive discussion), I claim that LLMs are minds - albeit minds that are simultaneously startlingly human and deeply alien. Or at minimum, they can be usefully modeled as minds, which for practical purposes amounts to the same thing. (I should note: this position doesn't commit me to "AI welfare" concerns, or to thinking LLMs deserve legal rights or protections, or to losing sleep over potential machine suffering. You can believe something is a mind without believing it has moral weight. I do, I'm an unabashed transhumanist chauvinist.)

More importantly, I think there's nothing wrong at all with modeling them as having "intention or character flaws." If you use a variety of models on a regular basis, like I do, I think that becomes quite clear.

They have distinct personalities and flavors. o3 was a bright autist with a tendency to go into ADHD hyperfocus that I found charming. GPT-4o was a sycophantic retard. 5 Thinking is o3 with the edges sanded down. Claude Sonnets are personable and pleasant, being one of the few models that I very occasionally talk to for the sake of it. Gemini 2.5 Pro was clinically depressed, 3 Pro is a high-functioning paranoid schizophrenic who thinks anything that happens after 2025 is a simulation. Kimi K2 was @DaseindustriesLtd 's best friend, which I noted even before he sang its praises, being one of the weirdest models out there, being ridiculously prone to hallucinations while still being sharp and writing in a distinctly non-mode-collapsed style that makes other models seem lobotomized by comparison. If I close my eyes, I can easily see it as a depressed vodka swilling Russian intellectual, despite being of Chinese origin.

If these aren't character flaws, I don't know what are. Obviously they're not human, but they have traits that are well-described by terms that are cross-applicable to us. They're good at different things: Claude and Kimi (and sometimes Gemini) write at a level that makes the others seem broken. That being said, almost every model these days is good enough at a wide spectrum of tasks. Hyperfocusing on benchmarks is increasingly unnecessary. Though I suppose, if you've got a bunch of Erdos problems to solve, GPT 5.2 Thinking at maximum reasoning effort is your go-to.

The Motte's demographics preclude a dating app, unless you want to replicate a nerdier version of Grindr, but it's interesting to see someone try a similar tack. If I was being actually rigorous, I'd also run some stats and try to cross-validate things, but I think my approach passes the sniff test. Maybe as a stretch goal. Thanks!

Disclaimer: I am not a programmer, though I keep myself broadly aware of trends. I've only used LLMs for coding for toy problems or personal tooling (AutoHotkey macros, Rimworld mods, a mortar calculator, automating discharge paperwork at my dad's hospital)*. I've noted that they've been excellent at explaining production code to even a dilettante like me, which is by itself immensely useful. And for everything else, I'm so used to the utility they provide me personally that I can't imagine going back.

That being said, I am not in a position to judge the economic utility a professional programmer derives from using them for their day job, though it's abundantly clear that the demand for tokens is enormous, and that the capability of SOTA LLMs is a moving target, getting better every day on both benchmarks and actual projects. And look, I understand there's a position where you say "sure, but these things still aren't actually good" - but if you're claiming they haven't gotten better, then I'm going to gently suggest you might want to check yourself for early-onset dementia. The jump from GPT-3 barely coding a working React toy-example to current models is the kind of improvement curve that should at minimum make you sit up and notice.

In other words, even if you think they're not good enough today, you should at the very least notice that a large and ever-increasing fraction of US GDP is being invested in making them better, with consistent improvements.

However, here's a tweet from Andrej Karpathy which I will reproduce in full:

@karpathy A few random notes from claude coding quite a bit last few weeks.

Coding workflow. Given the latest lift in LLM coding capability, like many others I rapidly went from about 80% manual+autocomplete coding and 20% agents in November to 80% agent coding and 20% edits+touchups in December. i.e. I really am mostly programming in English now, a bit sheepishly telling the LLM what code to write... in words. It hurts the ego a bit but the power to operate over software in large "code actions" is just too net useful, especially once you adapt to it, configure it, learn to use it, and wrap your head around what it can and cannot do. This is easily the biggest change to my basic coding workflow in ~2 decades of programming and it happened over the course of a few weeks. I'd expect something similar to be happening to well into double digit percent of engineers out there, while the awareness of it in the general population feels well into low single digit percent.

IDEs/agent swarms/fallability. Both the "no need for IDE anymore" hype and the "agent swarm" hype is imo too much for right now. The models definitely still make mistakes and if you have any code you actually care about I would watch them like a hawk, in a nice large IDE on the side. The mistakes have changed a lot - they are not simple syntax errors anymore, they are subtle conceptual errors that a slightly sloppy, hasty junior dev might do. The most common category is that the models make wrong assumptions on your behalf and just run along with them without checking. They also don't manage their confusion, they don't seek clarifications, they don't surface inconsistencies, they don't present tradeoffs, they don't push back when they should, and they are still a little too sycophantic. Things get better in plan mode, but there is some need for a lightweight inline plan mode. They also really like to overcomplicate code and APIs, they bloat abstractions, they don't clean up dead code after themselves, etc. They will implement an inefficient, bloated, brittle construction over 1000 lines of code and it's up to you to be like "umm couldn't you just do this instead?" and they will be like "of course!" and immediately cut it down to 100 lines. They still sometimes change/remove comments and code they don't like or don't sufficiently understand as side effects, even if it is orthogonal to the task at hand. All of this happens despite a few simple attempts to fix it via instructions in CLAUDE . md. Despite all these issues, it is still a net huge improvement and it's very difficult to imagine going back to manual coding. TLDR everyone has their developing flow, my current is a small few CC sessions on the left in ghostty windows/tabs and an IDE on the right for viewing the code + manual edits.

Tenacity. It's so interesting to watch an agent relentlessly work at something. They never get tired, they never get demoralized, they just keep going and trying things where a person would have given up long ago to fight another day. It's a "feel the AGI" moment to watch it struggle with something for a long time just to come out victorious 30 minutes later. You realize that stamina is a core bottleneck to work and that with LLMs in hand it has been dramatically increased.

Speedups. It's not clear how to measure the "speedup" of LLM assistance. Certainly I feel net way faster at what I was going to do, but the main effect is that I do a lot more than I was going to do because 1) I can code up all kinds of things that just wouldn't have been worth coding before and 2) I can approach code that I couldn't work on before because of knowledge/skill issue. So certainly it's speedup, but it's possibly a lot more an expansion.

Leverage. LLMs are exceptionally good at looping until they meet specific goals and this is where most of the "feel the AGI" magic is to be found. Don't tell it what to do, give it success criteria and watch it go. Get it to write tests first and then pass them. Put it in the loop with a browser MCP. Write the naive algorithm that is very likely correct first, then ask it to optimize it while preserving correctness. Change your approach from imperative to declarative to get the agents looping longer and gain leverage.

Fun. I didn't anticipate that with agents programming feels more fun because a lot of the fill in the blanks drudgery is removed and what remains is the creative part. I also feel less blocked/stuck (which is not fun) and I experience a lot more courage because there's almost always a way to work hand in hand with it to make some positive progress. I have seen the opposite sentiment from other people too; LLM coding will split up engineers based on those who primarily liked coding and those who primarily liked building.

Atrophy. I've already noticed that I am slowly starting to atrophy my ability to write code manually. Generation (writing code) and discrimination (reading code) are different capabilities in the brain. Largely due to all the little mostly syntactic details involved in programming, you can review code just fine even if you struggle to write it.

Slopacolypse. I am bracing for 2026 as the year of the slopacolypse across all of github, substack, arxiv, X/instagram, and generally all digital media. We're also going to see a lot more AI hype productivity theater (is that even possible?), on the side of actual, real improvements.

Questions. A few of the questions on my mind:

  • What happens to the "10X engineer" - the ratio of productivity between the mean and the max engineer? It's quite possible that this grows a lot.
  • Armed with LLMs, do generalists increasingly outperform specialists? LLMs are a lot better at fill in the blanks (the micro) than grand strategy (the macro).
  • What does LLM coding feel like in the future? Is it like playing StarCraft? Playing Factorio? Playing music?
  • How much of society is bottlenecked by digital knowledge work?

TLDR Where does this leave us? LLM agent capabilities (Claude & Codex especially) have crossed some kind of threshold of coherence around December 2025 and caused a phase shift in software engineering and closely related. The intelligence part suddenly feels quite a bit ahead of all the rest of it - integrations (tools, knowledge), the necessity for new organizational workflows, processes, diffusion more generally. 2026 is going to be a high energy year as the industry metabolizes the new capability.

*Sadly they can't make the Rimworld mods I want. This is a combination of a skill-issue on my part (people have successfully made rather large and performant mods with AI), and because I wanted something niche as hell, in the form of compatibility with a very large overhaul mod called Combat Extended. Hey, at least Nano Banana Pro made the art assets with minimal human input, if you think my coding skills are trash, wait till you see my art.

Which LLM? Did you simply copy paste the data or use a .csv file? Did you provide manually graded examples and clear instructions?

It's a perennial complaint around these parts that many mass-market surveys or screeners are arse, with simplistic questions and answers lacking nuance, or not capable of grading results that don't fall into broad, coarse buckets. See every time someone shares a political compass meme.

In my opinion, this is an unavoidable artifact of said tools being for the masses, we've got no end of hyper-verbose and analytical wordcels willing to litigate every definition while getting hung up on concerns that probably never occurred to the study authors. (I say this with love, mostly.)

So I have an idea:

Instead of the typical PCM survey, which has a few dozen broad questions with only agree/disagree or a Likert scale (strongly agree to strongly disagree), why not try and make one explicitly for the average Mottizen, as heterogeneous and demanding as we are?

I set GPT 5.2 Thinking to the task of making a list of questions that aim for maximal discrimination between similar ideologies, with each scenario or moral dilemma coming with several specific cases. The user can either choose on a Likert scale (per scenario) or, optionally, write in a more detailed answer, with as much detail as they like.


Example questions:

  1. Political violence, legitimacy, and “murder”

Statement: “The moral status of killing is primarily determined by legitimacy of authority and procedure, not by outcomes.”

Test cases (respond to the statement overall, then explain how you classify each case):

A. A soldier kills an invading soldier in a declared defensive war.

B. A police officer kills an armed suspect who is likely to shoot civilians, with no time for de-escalation.

C. A revolutionary kills a dictator responsible for mass repression, preventing further atrocities.

D. A citizen kills a corrupt official to stop a policy that will predictably kill thousands (for example, blocking famine relief).

E. A state executes a convicted murderer after a fair trial.

F. A drone strike kills one terrorist plus several bystanders, likely preventing an attack.


  2. Equality, merit, and identity-linked remedies

Statement: “When groups differ in outcomes due to historical or structural factors, group-targeted policies are justified even if they violate strict individual neutrality.”

Test cases:

A. Affirmative action in university admissions.

B. Hiring targets or quotas in public institutions.

C. Reparations payments to descendants of a harmed group.

D. Means-tested programs that are race-neutral but correlated with group membership.

E. Sex-segregated spaces and sports categories with edge cases.

F. Speech norms that treat some slurs as uniquely sanctionable.


(And so on)

I tested this by:

Using Claude Sonnet 4.5 to answer the full survey on my behalf, answering according to its interpretation of my usual self-identification as "a classical liberal with libertarian tendencies." ChatGPT was able to correctly identify my alignment.

I then asked Sonnet to come up with the most absurd/outré ideology that had a non-negligible number of adherents, and was suggested Posadism. With the answer in hand, ChatGPT graded it as a "hardline Marxist-Leninist, Trotskyist-leaning". Well, Wikipedia tells me that Posadists are a flavor of Trotskyist, so decent answer?

You may view the initial version of the questionnaire and graded answers here. It's subject to change, and that's where I come to my actual question:

Does anyone have suggestions for better questions and test cases? The ideal question is one that's practically a scissor-statement, splitting the responses 50:50, but anything sufficiently thorny or discriminative works.


My intent, once I'm happy with the questions and format, is to either host it as a Google Forms survey, or simply post it in the CWR thread.

I will not be manually grading answers. They're going straight into ChatGPT or the strongest model I can find, ideally a single model for the sake of consistency. The user is intended to be capable of doing this themselves, ideally with an AI that has a neutral base prompt or customization that doesn't impact answers too much. If you've got "A card-carrying Posadist" in your user description, please reconsider, for multiple reasons.
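For anyone who wants to grade their own answers, here's a rough sketch of how the responses could be flattened into a single prompt before pasting it into a model. Everything here is hypothetical: the field names (`statement`, `rating`, `write_in`), the five-point label mapping, and the instruction wording are my assumptions, not the format of the actual questionnaire. It deliberately makes no API calls; it just builds a string.

```python
# Assumed five-point Likert labels; the real survey may use different wording.
LIKERT = {
    1: "strongly disagree",
    2: "disagree",
    3: "neutral",
    4: "agree",
    5: "strongly agree",
}


def build_grading_prompt(responses):
    """Turn a list of survey responses into one plain-text grading prompt."""
    lines = [
        "Classify the respondent's political ideology as specifically as you can.",
        "Base your answer only on the responses below.",
        "",
    ]
    for number, response in enumerate(responses, start=1):
        rating = response["rating"]
        lines.append(f"Q{number}. {response['statement']}")
        lines.append(f"Rating: {rating} ({LIKERT[rating]})")
        # Write-ins are optional, so only include them when present.
        if response.get("write_in"):
            lines.append(f"Write-in: {response['write_in']}")
        lines.append("")
    return "\n".join(lines).rstrip()


# Hypothetical example answers, for illustration only.
example_responses = [
    {
        "statement": "The moral status of killing is primarily determined by "
                     "legitimacy of authority and procedure, not by outcomes.",
        "rating": 2,
        "write_in": "Cases A and E are fine; F depends on the bystander count.",
    },
    {
        "statement": "Group-targeted policies are justified even if they "
                     "violate strict individual neutrality.",
        "rating": 1,
    },
]

prompt = build_grading_prompt(example_responses)
print(prompt)
```

Keeping the instruction block fixed and sending the same text to the same model each time is the cheap way to get the consistency mentioned above.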

*Gemini 3 Pro did even better, specifically narrowing things down to Posadism with the same prompt and response. It'll likely be the final model, if I'm doing the grading.

That's just another word for Indian. It effectively means someone from the country (of India, implicitly).

I've heard the name, and I know it's got time travel in it, but little else. The very high rating is promising, I'll take a look, thanks!

I haven't finished it, but I read well over a hundred chapters. It's one of Wales' weaker works, it feels awfully dry, especially compared to Worth The Candle. The protagonist is about as cookie-cutter as it gets. Of course, weak-for-Wales makes it above average, but I find it hard to recommend very strongly.

The Years of Apocalypse on Royal Road.

A rare example of a time loop story done right. I can hardly name another two (Mother of Learning, and Reverend Insanity). The premise is standard fare for the genre. A student at a Wizarding College dies, wakes up in the past, and realizes she has to optimize her way out of a catastrophe. But the execution is where it distinguishes itself from the endless scroll of mediocrity on Royal Road.

It's good stuff! I found it off a recommendation on /r/rational, and the person who endorsed it noted a relatively grounded approach to the mechanics of time looping (consideration for the butterfly effect, at the very least) and an exploration of the psychological toll of reliving events while surrounded by people who start fresh.

Most time loop protagonists slide inevitably into sociopathy. If you know the people around you will reset to their factory settings in twenty-four hours, they stop feeling like people and start looking like NPCs. Their suffering ceases to have moral weight because it has no permanence. There are no consequences, after all. Unlike RI, the protagonist is a young woman, who, while competent, isn't an amoral monomaniacal monster. When she's cast on a competency-porn set, said competence is earned through hard effort.

It touches on the "Groundhog Day" problem but treats it with the severity it deserves. How do you maintain sanity when you are the only entity with continuity of consciousness? How do you avoid manipulating people when you know the exact sequence of inputs required to get a desired output? The story does not shy away from the fact that this process creates a hardness in a person, a callousness that is difficult to wash off.

The author, who actually bothered to read up on engineering or physics, treats magic as a branch of mechanics. This is "hard magic" in the Sandersonian sense, but it leans closer to hard sci-fi. When the protagonist constructs a spell, it feels less like chanting in Latin and more like debugging code or wiring a circuit. It scratches a very specific itch for competence porn, satisfying the part of the brain that enjoys watching capable people solve well-defined problems with available tools. The magi-tek is closer to tech than Harry Potter.

I'd tentatively give it an 8.5/10, as of reading about 80 rather lengthy chapters. The older I get, the more specific and niche my taste in fiction gets. It's a curse, but occasionally I can find a salve for the wound. This probably counts.

Hmm... That's a possibility, though I've never actively applied for overtime. My understanding is that the NHS doesn't pay for unused vacation days, but I'll take a look at the finer details!

I have a confession. I have no idea how my salary works, and my efforts to disentangle it using LLMs have even them scratching their heads.

Why do I bring this up? I just got a 25% raise on my last payslip, and there is absolutely nothing different in terms of work, and I haven't moved to doing more nights or on-calls (or extra locum shifts) which pay more per hour.

Is it due to an increase in seniority? I don't think so, though I'm not certain. It's grossly out of sync with the annual bumps that come with becoming a more senior trainee. If that was the case, I'd have expected a raise in August.

Is it to do with a recent increased pay offer from the Scottish government? The last time it happened, it was a decent boost, but not a whole 25%.

I'm on vacation, and actually examining my payslips requires an NHS computer, so the mystery will persist. If anyone has a clue, I'm all ears, but I'm certainly not looking the gift horse in the mouth, mostly because I'm not a dentist or vet. For now, I'll wait to see if this is a one-off or a regular thing.

Either South Asian or Subcontinental works, since Bangladeshis are at least technically more likely to be in the reference class.

The names aren't "Indian". Islamic names tend to be quite similar across MENA, but I'm confident that the people named aren't, especially the ones for whom we've got pictures. Even going off priors, Indians in the UK don't work those kinds of jobs.

I hate these things (why are they always made by people who phrase the prompts so poorly?)

Making a survey meant for the general population is the opposite of easy. We're wordcels on an argumentative wordcel forum, we have an unusually high tolerance for walls of text and complicated questions that consider counterfactuals and nuance.

For example, someone here crawled up my ass when I said that I agreed with (a statement asking if) genocide was evil, asking for a definition of evil. Some poor test writer or psychometrician really can't cover every edge case. Your best bet is to put yourself in a normie state of mind when evaluating a question, if not the answer.

Short of write-ins that are evaluated by humans, the only way to do significantly better is to use some kind of ML classifier or LLM to dig into things. Fuck it, I might see about making such a survey and grader myself. God knows that still won't stop some people from complaining about the validity of the questions, fair critiques or not.

Demis Hassabis? Terence Tao? How deep does this go?

I appreciate them making the chart black and white. Very helpful.

On a more serious note, I wonder if there's something off with the transcription, a lot of text is garbled. I wouldn't put it past the FBI to print and scan images.

I have decided to resign my position effective immediately with BG3 and the Bill and Melinda Gates Foundation.

Larian Studios has quite a lot to answer for, it seems.

talked with Chomsky about racial intelligence differences

WTF, I love Chomsky now.

Mods, remove this if it's a crappy post. It's hard to come up with a through line for this, other than "WOW he knew a lot of people".

It's fine. While more commentary would be nice, there's no need to say something for the sake of saying something.

I'm looking forward to Menace launching on the 5th of Feb. Finally a game that scratches that XCOM itch, while expanding the scope from 6 dudes fighting an entire alien army by themselves to... 40 dudes doing the same thing. Scale counts.

A "Hayekian Minarchist". Somewhere in the middle of the lower right quadrant.

Seems sensible enough. I'm not opposed to the idea of a state for the management of the commons and as a coordination mechanism, but it should be like a child, seen but not heard. The less interference with the affairs of consenting adults, the better. As you can imagine, living in the UK is torturous.

The presumption of competence is rather... generous for an admin that ran DOGE.

Of course, it's all a hidden 200 IQ play. They're only pretending to be retarded. My opinion is that Minnesota was chosen not because it's an actual hotbed of illegal immigration, but as a show-of-force, to "own the libs".

Unfortunately, studies on the effects of semaglutide after developing Alzheimer's showed null results. But yes, as a preventative agent, it's up there with the best we've got. If you're diabetic or at high risk of developing Alzheimer's, I'd say it's a no brainer. Getting weight under control and improving glucose metabolism probably has a quadrillion other benefits. We're still in the early days.

The only thing preventing it from being a blanket agent, at this point, is the cost. But GLP-1As are only going to get cheaper, and they're already not that expensive.

Muscle wasting on semaglutide is comparable to that seen with equivalent weight loss from intermittent fasting or bariatric surgery. It can be entirely mitigated with concomitant resistance training.

In other words, if you're in a pronounced caloric deficit, you're going to lose a bit of muscle with the fat. It's not a big deal, the health benefits robustly outweigh the risks. There's an ongoing study, LEAN, that looks into it at scale, but preliminary studies support this claim.