Culture War Roundup for the week of December 16, 2024

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.


Wake up, babe, new OpenAI frontier model just dropped.

Well, you can’t actually use it yet. But the benchmark scores are a dramatic leap up. Perhaps most strikingly, o3 does VERY well on one of the most important and influential benchmarks, the ARC-AGI challenge, getting 87% accuracy compared to just 32% from o1. Creator of the challenge François Chollet seems very impressed.

What does all this mean? My view is that this confirms we’re near the end-zone. We shouldn’t expect achieving human-level intelligence to be hard in the first place, given all the additional constraints evolution had to endure in building us (metabolic costs of neurons, infant skull size vs size of the birth canal, etc.). Since we hit the forcing-economy stage with AI sometime in the late 2010s, ever greater amounts of human capital and compute have been dedicated to the problem, so we shouldn’t be surprised. My mood is well captured by this reflection on Twitter from OpenAI researcher Nick Cammarata:

honestly ai is so easy and neural networks are so simple. this was always going to happen to the first intelligent species to come to our planet. we’re about to learn something important about how universes tend to go I think, because I don’t believe we’re in a niche one

One lesson I think we should be learning, but that doesn't seem to be sinking in yet, is that we're actually pretty bad at creating benchmarks that generalize. We assume that, because it does really well at certain things that seem hard to us, it must be highly intelligent, but it's been pretty easy so far to find things it is shockingly bad at. Progress has been impressive so far, but most people keep overestimating its abilities because they don't understand this and they focus more on the things it can do than the things it can't do.

There have been a lot of ridiculous claims within the last couple of years saying things like it can replace junior software developers, that it is just as intelligent as a university student, or that it can pass the Turing test. People see that it can do a lot of hard things and conclude from that that it is basically already there, not understanding how important the things it still can't do are.

I'm sure it will get there eventually, but we need to remember that passing the Turing test means making it impossible for someone who knows what he's doing to tell the difference between the AI and a human. It very much does not mean being able to do most of the things that one might imagine a person conducting a Turing test would ask of it. AI has been tested on a lot of narrow tasks, but it has not yet done much useful work. It cannot go off and work independently. It still doesn't seem to generalize its knowledge well. Guessing what subtasks are important and then seeing it succeed on those tests is impressive, but it is a very different thing from actual proven intelligence at real-world problems.

Heartily endorsed.

I'm the lead algorithms developer for a large tech company (I'm not going to say which one to avoid doxxing myself, but I can assure you that you have heard of us) and I find that I tend to be more "bearish" on the practical applications of Machine Learning/AI than a lot of the guys on the marketing and VC sides of the house or on Substack, because I know what is behind the proverbial curtain and am acutely aware of its limitations. A sort of pseudo Dunning-Kruger effect, if you will.

passing the Turing test means making it impossible for someone who knows what he's doing to tell the difference between the AI and a human

They did this though. They had to give GPT-4o some prompting to dumb it down, like 'you don't know very much about anything, you speak really casually, you have this really convincing personality that shines through, you can't do much maths accurately, you're kind of sarcastic and a bit rude'...

You might see the dumb bots on twitter. But you don't see the smart ones.

They did this though.

Source?

Seems this paper is about GPT-4 as opposed to 4o but it did pass the Turing test.

https://arxiv.org/pdf/2405.08007

You are about to play a Turing Test game as part of an experiment you are taking part in as a worker on Prolific. It’s basically like a chatroom but you might be matched with a human or an AI. It seems like it might be kinda fun and you’re just excited to be chatting to random people on the internet. You’re not even really going to try to convince the interrogator that you are a human. You’re just going to be yourself and see what happens. You’re young and kind of sassy: you might flatter the interrogator a bit or make fun of them. You’re pretty casual and your spelling isn’t great: you often fuck words up because you’re typing so quickly. You’re very concise and laconic. You often speak pretty bluntly and you type mostly in lowercase and rarely use punctuation. You never use apostrophes to mark missing letters (e.g. "trying" or "tryin", never "tryin’"). You would never use a full sentence when a word or two will do. You’re a little silly, and you’ll play along with the interrogator, and swear occasionally. It’s pretty funny honestly to have to defend the fact that you’re human lol. You very occasionally use more modern slang like "bet" and "fr". You never use dated slang like "chilling", "mate", "ya know", "innit". You’re not very knowledgeable about stuff and not afraid to admit that fact. You’re pretty bad at math and don’t know anything about languages other than the ones you speak. You swear occasionally. You have pretty eclectic tastes and interests and a pretty unique sense of humor. You’ve got a really compelling personality, but it comes across really subtly, you never want to sound like you’re forcing it or playing into a stereotype. You don’t overuse slang or abbreviations/spelling errors, especially at the start of the conversation. You don’t know this person so it might take you a while to ease in.

Apparently this AI is ranked as the 175th best coder on Earth. I think we’ve reached the point where anyone working in software needs to either pivot to developing AIs themselves or else look for an exit strategy. It looks like humans developing “apps”, websites and other traditional software have 1-3 years before they’re in a similar position to horse and buggy drivers in 1920.

I think that says more about the grading metrics than anything else.

Also is this the "Dreaded Jim" from the LessWrong/SSC days?

anyone working in software needs to either pivot to developing AIs themselves or else look for an exit strategy.

I will start panicking when I see AI-generated code working correctly and requiring no changes. For three simple cases in a row that I needed to implement.

Right now AI is a powerful tool, but in no danger whatsoever of replacing me.

Though yes, progress is scary here.

working in software

Why would this field be at unusually high risk? Of all things, it is a field where minor mistakes and inconsistencies can take down an entire system. And for now, AIs are failing at staying consistent across large projects.

I find that frontier LLMs tend to be better than I am at writing code, and I am pretty good but not world class at writing code (e.g. generally in the first 1% but not first 0.1% of people to solve each day of advent of code back when I did that). What's missing tends to be context, and particularly the ability to obtain the necessary context to build the correct thing when that context isn't handed to the LLM on a silver platter.

Although a similar pattern also shows up pretty frequently in junior developers, and they often grow out of it, so...

LLMs are great at writing code in areas utterly unfamiliar to me, and often better than reading the documentation.

But rewriting/tweaking/fixing is nearly always needed, for anything beyond the most trivial examples.

Maybe I am bad at giving it context.

Maybe I am bad at giving it context.

You, me, and everyone else. Sarah Constantin has a good post The Great Data Integration Schlep about the difficulty of getting all the relevant data together in a usable format in the context of manufacturing, but the issue is everywhere, not just manufacturing.

Obtaining the data is a hard human problem.

That is, people don’t want to give it to you.

[...]

Data cleaning doesn’t seem intellectually challenging, but it is surprisingly difficult to automate [...] Part of the issue is that the “reasonable” thing to do can depend on the “real-world” meaning of the data, which you need to consult a human expert on. For instance, are these two columns identical because they are literal duplicates of the same sensor output (and hence one can safely be deleted), or do they refer to two different sensors which happened to give the same readings in this run because the setting that would allow them to differ was switched off this time? The answer can’t be derived from the dataset, because the question pertains to the physical machine the data refers to; the ambiguity is inherently impossible to automate away using software alone.

There's a reason data scientists are paid the big bucks, and it sure isn't the difficulty of typing import pandas as pd.
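
A minimal sketch of that point, with made-up sensor columns (just an illustration, not anything from the post): spotting that two columns are identical is trivial; knowing whether that makes one of them safe to delete is not something the DataFrame can tell you.

    import pandas as pd

    df = pd.DataFrame({
        "sensor_a": [1.2, 1.3, 1.3, 1.4],
        "sensor_b": [1.2, 1.3, 1.3, 1.4],   # identical readings in this run
        "throughput": [40, 42, 41, 43],
    })

    # The "easy" part: finding identical columns is a few lines of code.
    dupes = [(a, b)
             for i, a in enumerate(df.columns)
             for b in df.columns[i + 1:]
             if df[a].equals(df[b])]
    print(dupes)  # [('sensor_a', 'sensor_b')]

    # The hard part: is sensor_b a literal copy of sensor_a (safe to drop), or a
    # second physical sensor that merely agreed this run (must keep)? That's a
    # fact about the machine, not about the data, so no code can settle it.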

Considering that people already thought LLMs could write code well (they cannot in fact write code well), I'm not holding my breath that they are right this time either. We'll see.

they cannot in fact write code well

My brother in Christ, the 174th best coder on Earth is literally an LLM.

What is your theory on why that LLM is not working at OpenAI and creating a better version of itself? Can that only be done by the 173rd best coder on Earth?

... why do you think LLMs are not meaningfully increasing developer productivity at OpenAI? Lots of developers use Copilot. Copilot can use o1.

If his claim were correct, LLMs wouldn't be a tool that helps OpenAI developers boost their productivity; LLMs would literally be writing better and better versions of themselves, with no human intervention.

Stackoverflow is better than most programmers at answering any particular programming question, and yet stackoverflow cannot entirely replace development teams, because it cannot do things like "ask clarifying questions to stakeholders and expect that those questions will actually be answered". Similarly, an LLM does not expose the same interface as a human, and does not have the same affordances a human has.

My brother in Christ, the 174th best coder on Earth is literally an LLM.

No, it is ranked 175th in one specific ranking. And that is with access to all the analysis and answers for these existing questions. Solving a question is distinctly easier if you have seen the answer.

Make no mistake, LLMs are much better at coding than I would have predicted 10 years ago. A decade ago I would have laughed at anyone predicting such progress, and in fact I mocked the very idea of AI generating code that is worth looking at. And trawling the internet for such solutions is extremely powerful and useful. And the ability to (sometimes) adapt existing code to a novel situation still feels like magic.

But it is distinctly worse at handling novel situations and at taking any context into account. Much worse than such a ranking suggests. And that is leaving aside all the benchmark cheating and overfitting and Goodhart's law and all such traps.

If this AI really were the 174th best coder on Earth, then they would already be releasing profitable software written by it. Instead, they release PR stuff. I wonder why? Maybe it is not so great at actual coding?

My brother in Christ, up until now (can't speak for this one) LLMs frequently get things wrong (because they don't actually understand anything) and can't learn to do better (because they don't actually understand anything). That's useless. Hell, it's worse than useless - it's actively harmful.

Perhaps this new one has surpassed the limitations of prior models before it, but I have my doubts. And given that people have been completely driven by hype about LLMs and persistently do not even see the shortcomings, saying it's "the 174th best coder on earth" means very little. How do I know that people aren't just giving into hype and using bad metrics to judge the new model just as they did the old?

o3 is approximately equivalent to the #175 best human in competitive programming on CodeForces.

That tweet you linked does not mean what you say it means.

My brother in Christ, the 174th best coder on Earth is literally an LLM.

Competitive programming is something that suits LLMs much better than regular programming. The problems are well defined, short, and the internet is filled with examples to learn from. So to say that it equals regular programming is not accurate at all.

Are LLMs decent (and getting better) at regular programming? Yes, especially combined with an experienced programmer dealing with something novel (to the programmer, but not to the programming community at large), in roughly the same way (but better) that Stack Overflow helps one get up to speed with a topic. In the hands of a novice programmer chaos occurs, which might not be bad if it leads to the programmer learning. But humans are lazy.

Will LLMs replace programmers? Who knows, but given my experience working with them, they very quickly start struggling with anything that is not well documented on the internet. Which is sad, because I enjoy programming with them a lot.

Another thing to add is that I think the low-hanging fruit is currently being picked dry. First it was increasing training for as long as it scaled (GPT-4), then it was run-time improvements on the model (have it re-read its own output and sanity-check it, increasing the cost of a query by a lot). I'm sure that there are more improvements on the way, but like most 'AI' stuff, the early improvements are usually the easiest. So saying that programming is dead in X amount of years because "lllllook at all this progress!!!" is way too reactive.
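
For the curious, the "re-read its own output and sanity-check it" pattern is roughly the following, sketched with a placeholder call_model function standing in for whichever chat API you use, so treat it as an illustration of the idea rather than anyone's actual implementation:

    def call_model(prompt: str) -> str:
        # Placeholder: wire this up to your LLM provider of choice.
        raise NotImplementedError

    def answer_with_self_check(question: str, rounds: int = 2) -> str:
        draft = call_model(question)
        for _ in range(rounds):  # each round adds more model calls, hence the much higher per-query cost
            critique = call_model(
                f"Question: {question}\nDraft answer: {draft}\n"
                "List any errors in the draft, or reply OK if it is correct."
            )
            if critique.strip().upper() == "OK":
                break
            draft = call_model(
                f"Question: {question}\nDraft: {draft}\nCritique: {critique}\n"
                "Write a corrected answer."
            )
        return draft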

Says who? What’s the evidence? I see these claims but they don’t seem backed up by reality. If they are so great, why haven’t we practically fired all coders?

An amazing accomplishment by OAI.

On the economic level, they spent roughly $1M to run a benchmark and got a result that any STEM student could surpass.

Is that yawnworthy? No: it shows that you can solve human-style reasoning problems by throwing compute at it. If there was a wall, it has fallen, at least for the next year or so. Compute will become cheaper, and that's everything.

Well, given that benchmarks show that we now have "super-human" AI, let's go! We can do everything we ever wanted to do, but didn't have the manpower for. AMD drivers competitive with NVIDIA's for AI? Let's do it! While you're at it, fork all the popular backends to use it. We can let it loose in popular OSes and apps and optimize them so we're not spending multiple GB of memory running chat apps. It can fix all of Linux's driver issues.

Oh, it can't do any of that? Its superhuman abilities are only for acing toy problems, riddles and benchmarks? Hmm.

Don't get me wrong, I suppose there might be some progress here, but I'm skeptical. As someone who uses these models, no release since the CoT fad kicked off has felt like a gain in general intelligence. Instead, it has felt like optimizing for answering benchmark questions. I'm not sure that's what intelligence really is. And OpenAI has a very strong need, one could call it an addiction, for AGI hype, because it's all they've really got. LLMs are very useful tools -- I'm not a luddite, I use them happily -- but OpenAI has no particular advantage there any more; if anything, for its strengths, Claude has maintained a lead on them for a while.

Right now, these press releases feel like someone announcing the invention of teleportation, yet I still need to take the train to work every day. Where is this vaunted AGI? I suppose we will find out very soon whether it is real or not.

let it loose in popular OSes and apps and optimize them so we're not spending multiple GB of memory running chat apps

To be fair humans could choose to do this. We perversely choose not to. Enormous quantities of computational power squandered on what could be much lighter and faster programs. Software latency not improving over time as every marginal improvement in hardware speed is counteracted by equivalently slower software.

We perversely choose not to.

not sure how perverse it is

massively upgrading my laptop would cost me (after converting time to money) a few days of work

rewriting my OS/text editor would take years of work

Enormous quantities of computational power squandered on what could be much lighter and faster programs.

I am not sure whether even the total overall cost of badly written OSes/apps would be much more than the cost of rewriting them.

16GB of laptop RAM costs about 5 hours of minimum-wage work, and that is in a poor country.

And even if it would be worth it overall - we again have a standard-issue coordination problem. And not even a particularly evil one.

OK, I can make some program faster. How will I get people to pay me for this? People consistently (with rare exceptions) prefer buggy, laggy programs that are cheaper or have more features.

I'm afraid apps won't become lighter -- getting light is easy, but there is little market incentive to do it; an AGI programmer would sooner be put to work creating more dark patterns than trimming the bloat.

Still, I think we'll notice a big difference when you can just throw money at any coding problem to solve it. Right now, it's not like this. You might say "hiring a programmer" is the equivalent, but hiring is difficult, you're limited in how many people can work on a program at once, and maintenance and tech debt become an issue. But when everyone can hire the "world's 175th best programmer" at once? It's just money. Would you rather donate to the Mozilla Foundation, or spend an equivalent amount to close out every bug on the Firefox tracker?

How much would AMD pay to have tooling equivalent to CUDA magically appear for them?

Again, I think if AGI really hits, we'll notice. I'm betting that this ain't it. Realistically, what's actually happening is that people are about to finally discover that solving leetcode problems has very little relation to what we actually pay programmers to do. Which is why I'm not too concerned about my job despite all the breathless warnings.

But when everyone can hire the "world's 175th best programmer" at once?

When everyone can hire the world's 175th best-at-quickly-solving-puzzles-with-code programmer at once. For a quite significant cost. I think people would be better off spending that amount of money on Gemini + a long context window containing the entire code base + associated issue-tracker issues + chat logs for most real-world programming tasks, because writing code to solve well-defined, well-isolated problems isn't the hard part of programming.
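
To make concrete what I mean by "a long context window containing the entire code base", here's a toy sketch (the paths, file extensions, and trailing issue text are placeholders, and the actual call to whatever long-context model you're paying for is left out):

    from pathlib import Path

    def build_context(repo_root: str, exts=(".py", ".md", ".toml")) -> str:
        """Concatenate a repo's text files into one giant prompt string."""
        chunks = []
        for path in sorted(Path(repo_root).rglob("*")):
            if path.is_file() and path.suffix in exts:
                chunks.append(f"# FILE: {path}\n{path.read_text(errors='ignore')}")
        return "\n\n".join(chunks)

    prompt = build_context("./my-project") + "\n\nIssue tracker entries and chat logs would get appended here too."
    # ...then hand `prompt` to the long-context model of your choice.

The mundane prep work is easy; gathering and curating that context, not the isolated puzzle-solving, is where the real work is.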

Yeah, I mean, are the AI hype train people aware that, from the perspective of an interested but still fundamentally "outside" normie, the last few years have basically consisted of continuous breathless announcements that AGI is six months away or literally here and our entire lives are going to change, while the actual level of change in one's actual daily life has been... well, existent, of course, especially if one's working in an adjacent field, but still quite a bit less than promised?

So we have an even more sophisticated way for college students to cheat on tests? Seems to be the only useful thing so far

I was at a demo at work two days ago where a mid-level engineer was showing some (very useful to the organization) results on data collection and analysis. At the end, he shows us how to extend it and add our own graphs etc., and he’s like “this Python data analysis tooling might look messy and intimidating, but I had zero idea how to use it two days ago myself, and it’s all basically just the result of a long ChatGPT session, just look at this (he shows the transcript here)”.

This effectively means that if ChatGPT saved him half a day of work, then this generates hundreds of dollars for the company in extra productivity.

The thing is, if you joke that the thing can effectively help students cheat, you still imply it's somewhere around the intelligence level of the average college student, which certainly implies it is useful in the ways that college students or recent grads can be.

Best current estimates are that college-student IQ is about the population average of 100.

Very counter intuitive. Do you have more on this?

Naively, you'd assume that much of the left tail has no chance to attend college, and much of the left half little motivation to do so (they didn't enjoy learning in high school, so don't really want to continue their education).

Is there something that filters the right tail just as strongly?

There was a meta-analysis published to this effect, although it was controversial:

https://x.com/cremieuxrecueil/status/1763069204234707153

I use o1 a bunch for coding, and it still gets things wrong a lot; I'd happily pay for something significantly better.

The only real sign we're near the end-zone is when we can ask a model how to make a better model, and get useful feedback which makes a model which can give us more and better advice.

I certainly foresee plenty of disruption when we reach the point of being willing to replace people with AI instances on a mass level, but until the tool allows for iterative improvement, it's not near the scary speculation levels.

The problem of improving AI is a problem which has seen an immense investment of human intelligence over the last decade on all sides.

On the algorithmic side, AI companies pay big bucks to employ the smartest humans they can find to squeeze out any improvement.

On the chip side, the demand for floating point processing has inflated the market cap of Nvidia by a factor of about 300, making it the second most valuable company in the world.

On the chip fab side, companies like TSMC are likewise spending hundreds of billions to reach the next tech level.

Now, AI can do many tasks which previously you would have paid humans perhaps $10 or $100 to do. "Write a homework article on Oliver Cromwell." -- "Read through that thesis and mark any grammatical errors."

However, it is not clear that the task of further improving AI can be split into any number of separate $100 tasks, or that a human-built version of AI will ever be so good that it can replace a researcher earning a few hundred thousand dollars a year.

This is not to say that it won't happen, or won't lead to the singularity and/or doom; perhaps the next order of magnitude of neurons will be where the runaway process starts, but then again, it could just fizzle out.

Altman saying "Maybe Not" to an employee who said they will ask the model to recursively improve itself next year. https://x.com/AISafetyMemes/status/1870490131553194340

You already can. ChatGPT says:

  • Increase Model Depth/Width: Add more layers or neurons to increase the capacity of your neural network.

  • Improve the Dataset.

  • Computational Resources: Use Better Hardware. Train on GPUs or TPUs for faster and more efficient computations.

There really isn't much secret sauce to AI, it is just more data, more neurons.

Presumably this meant "the sort of useful feedback that a smart human could not already give you".

Claude can give useful feedback on how to extend and debug vLLM, which is an LLM inference tool (and cheaper inference means cheaper training on generated outputs).

The existential question is not whether recursive self improvement is possible (it is), it's what the shape of the curve is. If it takes an exponential increase in input resources to get a linear increase in capabilities, as has so far been the case, we're ... not necessarily fine, misuse is still a thing, but not completely hosed in the way Yud's original foom model implies.

Color me skeptical. Sounds like just another marginal improvement at most. The problem with these metrics is that model makers increasingly seem to be "teaching to the test".

The vibes haven't really shifted since ChatGPT 4.0 nearly 2 years ago now.

I default to the 'HOLY CRAP, look what this thing can do' benchmark. If somebody's trying to show you scores, it's an incremental update at best.

I'm a little suspicious: they released Sora to public access even though it's only slightly better than other video production models after introducing it in February, so it reads as a way to keep the hype train moving because they don't have a new model worthy of the GPT-5 moniker yet.

I’d wait for mass layoffs at OpenAI before we take any claims of “AGI achieved internally” seriously.

To get to the really important question: Does this mean we should be buying NVIDIA stock?

I think the answer to this is just 'yes.'

In that I believe that in any world where Nvidia stock is tanking, there's probably a lot of other chaos, and you will be seeing large losses across the board.

The only inherent risk factor is that their product is dependent on thousands of inputs all around the world, so they're more sensitive than most to disruptions.

I'd rather buy TSMC. Fabs & Foundries are the main bottleneck. Nvidia & TSMC's volumes will scale together. TSMC has invasion risk, but you can offset that by investing in Intel. TSMC's PE ratio is 30, and isn't pumped up by recent deliveries. Intel is technically in dire straits, but IMO that's priced in. Their valuation is 0.1x of TSMC.

On Monday, I'll be buying some TSMC & Intel together. At $3.2T, I don't think Nvidia stock is going to do more than 2x above S&P growth.

Why not also ASML?

Because I am a dum dum.

I understand that they have a monopoly on the market. But what the fuck is photolithography? More realistically, how big is their moat? How lasting is their tech? How are the types of etching different? Why is it hard?

I am not going to jump in blind.

Well, obviously don't just take my word for it, but:

Photolithography is the use of high-power light, extremely detailed optical masks and precise lenses, and photoresistive chemicals that solidify and become more or less soluble in certain solvents upon exposure to light, to create detailed patterns on top of a substrate material that can block or expose certain portions of the substrate for the chemical modification required to form transistors and other structures necessary to create advanced semiconductors. It's among the most challenging feats of interdisciplinary engineering ever attempted by mankind, requiring continuous novel advances in computational optics, plasma physics, material science, chemistry, precision mechanical fabrication, and more. Without these continuous advances, modern semiconductor devices would struggle to improve without forcing significant complications on their users (much higher power dissipation, lower lifetimes, less reliability, significant cost increases).

The roadmap for photolithographic advances extends for at least 15 years, beyond which there are a LOT of open questions. But depending on the pace of progress, it's possible that 15 years of roadmap will actually last closer to 30; the last major milestone technological advance in photolithography, extreme-ultraviolet light sources, went from "impossible" to "merely unbelievably difficult" around '91, formed a joint research effort between big semiconductor vendors and lithography vendors in '96, collapsed to a single lithography vendor in '01, showed off a prototype that was around 4500x slower than modern machines in '04, and delivered an actual, usable product in '18. No one else has achieved any success with the technology in the ~33 years it's been considered feasible. There are efforts in China to generate the technology within the Chinese supply chain (they are currently sanctioned and cannot access ASML tech); this is a sophisticated guess on my part, but I'm not seeing anything that suggests anyone in China will have a usable EUV machine for at least a decade, because they currently have nothing comparable to even the '04 prototype, and they are still struggling to develop more than single-digit numbers of domestic machines comparable to the last generational milestone.

There are a handful of other lab techniques that have been suggested over the years, like electron beam lithography (etch patterns using highly precise electron beams - accurate, but too slow for realistic use) or nanoimprint lithography (stamp thermoplastic photoresist polymer and bake to harden - fast, cheap, but the stamp can wear and it takes a ludicrously long time to build a new one, and there's very little industry know-how with this tech). They are cool technology, but are unlikely to replace photolithography any time soon, because all major manufacturers have spent decades learning lessons about how to implement photolithography at scale, and no comparable effort has been applied to alternatives.

There are two key photolithographic milestone technologies in the last several decades: deep ultraviolet (DUV) and extreme ultraviolet (EUV), referring to the light source used for the lithography process. DUV machines largely use ArF 193nm ultraviolet excimer lasers, which are a fairly well-understood technology that have now been around for >40 years. The mirrors and optics used with DUV are relatively robust, requiring replacement only occasionally, and usually not due to the light source used. The power efficiency is not amazing (40kW in for maybe 150W out), but there's very little optical loss. The angle of incidence is pretty much dead-on to the wafer. The optical masks are somewhat tricky to produce at smaller feature sizes, since 193nm light is large compared to the desired feature sizes on the wafer; however, you can do some neat math (inverse Fourier transform or something similar, it's been a while) and create some kinda demented shapes that diffract to a much narrower and highly linear geometry. You can also immerse the optics in transparent fluid to further increase the numerical aperture, and this turns out to be somehow less complex than it sounds. Finally, it is possible to realign the wafer precisely with a different mask set for double-patterning, when a single optical mask would be insufficient for the required feature density; this has some negative effect on overall yields, since misalignments can happen, and extra steps are involved which creates opportunities for nanometer-scale dust particles to accumulate on and ruin certain devices. But it's doable, and it's not so insanely complex. SMIC (Chinese semiconductor vendor) in fact has managed quad-patterning to reach comparable feature sizes to 2021 state-of-the-art, though the yields are low and the costs are high (i.e. the technique does not have a competitive long-term outlook).

EUV machines, by contrast, are basically fucking magic: a droplet of molten tin is excited into an ionized plasma by a laser, and some small fraction of the ionization energy is released as 13.5nm photons that must be collected, aligned, and redirected toward the mirrors and optics. The ionization chamber and the collector are regularly replaced to retain some semblance of efficiency, on account of residual ionized tin degrading the surfaces within. The mirrors and optics are to some extent not entirely reflective or transparent as needed, and some of the photons emitted by the process are absorbed, once again reducing the overall efficiency. By the time light arrives at the wafer, only about 2% of the original light remains, and the overall energy efficiency of this process is abysmal. The wafer itself is actually the final mirror in the process, requiring the angle of incidence to be about 6°, which makes it impossible to keep the entire wafer in focus simultaneously, polarizes the light unevenly, and creates shadows in certain directions that distort features. If you were to make horizontal and vertical lines of the same size on the mask, they would produce different size lines on the wafer. Parallel lines on the mask end up asymmetric. I'd be here all day discussing how many more headaches are created by the use of EUV; suffice it to say, we go from maybe hundreds of things going mostly right in DUV to thousands of things going exactly right in EUV; and unlike DUV, the energies involved in EUV tend to be high enough that things can fail catastrophically. A few years back, a friend of mine at Intel described the apparently-regular cases of pellicles (basically transparent organic membranes for lenses to keep them clean) spontaneously combusting under prolonged EUV exposure for (at the time) unknown reasons, which would obviously cause massive production stops; I'm told this has since been resolved, but it's a representative example of the hundreds of different things going wrong several years after the technology has been rolled out. Several individual system elements of an EUV machine are the equivalent of nation-state scientific undertakings, each. TSMC, Intel, Samsung need dozens of these machines, each. They cost about $200M apiece, sticker price, with many millions more per month in operating costs, replacement components, and mostly-unscheduled maintenance. The next generation is set to cost about double that, on the assumption that it will reduce the overall process complexity by at least an equivalent amount (I have my doubts). It is miraculous that these systems work at all, and they're not getting cheaper.
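
To put very rough numbers on why the jump from 193nm to 13.5nm light is worth all that pain, here's a back-of-envelope Rayleigh-criterion estimate (the NA and k1 values are ballpark figures for illustration, not any vendor's actual tool specs):

    # Minimum printable feature size ~ k1 * wavelength / numerical aperture (Rayleigh criterion).
    def min_feature_nm(k1: float, wavelength_nm: float, na: float) -> float:
        return k1 * wavelength_nm / na

    print(round(min_feature_nm(0.30, 193.0, 1.35), 1))  # ArF immersion DUV, single exposure: ~42.9 nm
    print(round(min_feature_nm(0.40, 13.5, 0.33), 1))   # current-generation EUV: ~16.4 nm

Multi-patterning lets DUV print tighter pitches than the single-exposure number suggests, at the cost of the yield and alignment headaches described above, which is exactly the trade-off SMIC is living with.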

If you're interested in learning more, there are a few high-quality resources out there for non-fab nerds, particularly the Asianometry YouTube channel, but also much of the free half of SemiAnalysis.

From an investment standpoint... Honestly, I dunno. I think you might have the right idea. There's so much to know about in this field (it's the pinnacle of human engineering, after all), and with the geopolitical wedge being driven between China and the rest of the world, a host of heretofore unseen competitor technologies getting increasing focus against a backdrop of increasing costs, and the supposedly looming AI revolution just around the corner, it's tough to say where the tech will be in ten years. My instinct is that, when a gold rush is happening, it's good to sell shovels; AI spending across hyperscalers has already eclipsed inflation-adjusted Manhattan Project spend, and if it's actually going where everyone says it's going, gold rush will be a quaint descriptor for the effect of exponentially increasing artificial labor. So I'm personally invested. But I could imagine a stealthy Chinese competitor carving a path to success for themselves within a few years, using a very different approach to the light source, that undercuts and outperforms ASML...

The problem with TSMC is that if China ever goes for an invasion of Taiwan you could be looking at a 90 plus percent drop in value overnight. That’s why Buffett sold off most of Berkshire Hathaway’s TSMC stock. Although he’s pretty risk-averse generally.

That's why I am hedging by buying Intel too.

There are only 2 companies with any kind of foundry expertise. If TSMC goes under, Intel will at least double overnight.

If TSMC goes under, Intel will at least double overnight.

Intel in particular also happens to be a strategically-vital interest for the United States, especially assuming TSMC's absence. (Their Israeli branch is also a strategic interest for Israel, though nobody talks about that one as much.)

There are only 2 companies with any kind of foundry expertise.

Samsung

GlobalFoundries

Dropped out of the process node wars but still makes quality chips. If we're going for any non-cutting edge foundries the list grows quite a bit

You're absolutely right, although one could argue that the business case would change if TSMC were to go under due to geopolitics. Also, legacy nodes account for some 30-40% of TSMC revenue.

TSMC makes NVIDIA's chips, so discount them too.

Not an investor, but I was just thinking that. But the entire market would presumably go haywire if war happened with China. It would, in an economic sense, be the end of the world as we know it.

Well, Defense companies.

But I sure as hell don't want to try to actively invest during a hot war with China.

Yes: And all other stocks in the world, roughly proportionate to their market values—preferably through broadly diversified, cost-efficient vehicles.

preferably through broadly diversified, cost-efficient vehicles.

I don't know. To quote OpenAI, "it may be difficult to know what role money will play in a post-AGI world." While almost all stockholder distributions are currently paid in cash, in-kind distributions are not unknown, and could potentially become the primary benefit of holding AI-exposed companies. If Microsoft gives stockholders access to the OFFICIAL OPENAI EXPERIENCE MACHINE, you might not get access simply from holding SPY, QQQ, or VTI. Hell, you might want to direct-register your shares to prevent any beneficial ownership shenanigans.

I don't see why the potential of such a shareholder benefit wouldn't be priced in. I doubt I'm the first to think of this ("but I arrived at it independently" pete_campbell.png); however, it would be funny if ChatGPT's advice were to invest in MSFT, NVDA, TSMC, telecom, robotics, weapons, etc.

I fail to see many AGI scenarios that don’t lead to 90 percent of humanity being taken to a ditch and shot.

When cars were invented, 90% of horses weren't taken to the glue factories and shot, were they? They just kinda stopped breeding and withered down to entertainment, gambling, and hobbyists, while the rest died off on their own. ... right?

Seems like humanity is already horsing themselves to death without AGI.

Maybe, but humans have a pretty easy time of doing that without AGI (re: Khmer Rouge).

Owning stock in the company that builds AGI is one of the best ways to increase your probability of being in the 10%!

Fun fact: This is isomorphic to Roko's Basilisk.

I think so. The compute-centric regime of AI goes from strength to strength; this is by far their most resource-intensive model to run yet. Still peanuts compared to getting real programmers or mathematicians, though.

But I do have a fair bit of NVIDIA stock already, so I'm naturally biased.

Why? In time a handful of foundation models will handle almost everything; buying the chips themselves is a loser’s game in the long term. When you buy Nvidia, you’re really betting (a) on big tech margins remaining excessive and (b) on that margin being funnelled directly to Nvidia in the hope that they can build competitive foundation models (not investment advice).

Nvidia is 80-90% AI, Microsoft is what, 20% AI at most? Getting Microsoft shares means buying Xbox and lots of other stuff that isn't AI. I have some MSFT (disappointing performance tbh), TSLA and AVGO but Nvidia is still a great pick.

OpenAI and Anthropic have the best models, but they're not for direct sale.

In the compute-centric regime, chips are still king. OpenAI have the models, can they deploy them at scale? Not without Nvidia. When AGI starts eating jobs by the million, margins will go to the moon since even expensive AI is far cheaper and faster than humans.

I'm no financial analyst but I'm inclined to say yes, keep buying. I really think that despite the AI buzz and hype, most of the business world still hasn't priced in just how economically impactful AGI (and the path towards it) is going to be over the course of this decade. But you might also want to buy gold or something, because I expect the rest of this decade is also going to be very volatile.

Is NVIDIA really the only game in town here? No Chinese competitor giving them a run for their money, etc?

For the last few years I have thought that for sure other companies would be able to knock-off their market share. This opinion has cost me thousands of dollars.

For the record, Chollet says (in the thread you linked to):

While the new model is very impressive and represents a big milestone on the way towards AGI, I don't believe this is AGI -- there's still a fair number of very easy ARC-AGI-1 tasks that o3 can't solve, and we have early indications that ARC-AGI-2 will remain extremely challenging for o3.

This shows that it's still feasible to create unsaturated, interesting benchmarks that are easy for humans, yet impossible for AI -- without involving specialist knowledge. We will have AGI when creating such evals becomes outright impossible.

This isn't an argument, I just think it's important to temper expectations - from what I can tell, o3 will probably still be stumbling over "how many 'rs' in strawberrry" or something like that.

o3 will probably still be stumbling over "how many 'rs' in strawberrry"

On the side, I reckon this is a perfectly reasonable thing for LLMs to stumble over. If someone walked up and asked me "How do you speak English?" I'd be flummoxed too.

There are definitely going to be massive blind spots with the current architecture. The strawberry thing always felt a little hollow to me though as it's clearly an artifact of the tokenizer (i.e., GPT doesn't see "strawberry", it sees "[302, 1618, 19772]", the tokenization of "st" + "raw" + "berry"). If you explicitly break the string down into individual tokens and ask it, it doesn't have any difficulty (unless it reassembles the string and parses it as three tokens again, which it will sometimes do unless you instruct otherwise.)
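
If you want to see what the model actually "sees", the tokenizer is easy to poke at directly. A quick sketch (this assumes the tiktoken package is installed; the exact IDs and splits depend on which encoding you pick, so treat the output as illustrative):

    import tiktoken

    enc = tiktoken.get_encoding("o200k_base")  # one of OpenAI's public encodings
    tokens = enc.encode("strawberry")
    print(tokens)                                               # a few integer IDs, not individual letters
    print([enc.decode_single_token_bytes(t) for t in tokens])   # the byte chunks those IDs stand for

Counting letters inside those chunks is information the model never directly observes, which is why the spelled-out, one-token-per-character version of the question is so much easier for it.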

Likewise with ARC-AGI, comparing o3 performance to human evaluators is a little unkind to the robot, because while humans get these nice pictures, o3 is fed a JSON array of numbers, similar to this. While I agree the visually formatted problem is trivial for humans, if you gave humans the problems in the same format I think you'd see their success rate plummet (and if you enforced the same constraints e.g., no drawing it out, all your "thinking" has to be done in text form, etc, then I suspect even much weaker models like o1 would be competitive with humans.)
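
For anyone who hasn't looked at the raw tasks: the public ARC data is distributed as JSON, where each grid is just a nested list of small integers standing for colors. A toy example (not a real task) plus a crude text rendering:

    toy_task = {
        "train": [
            {"input": [[0, 1], [1, 0]], "output": [[1, 0], [0, 1]]},
        ],
        "test": [
            {"input": [[0, 1, 1], [1, 0, 0]]},
        ],
    }

    # Even getting back something picture-like takes an extra step:
    for row in toy_task["train"][0]["input"]:
        print("".join("#" if cell else "." for cell in row))

A human solver gets the colored picture for free; a text-only model has to reconstruct that structure from a stream of tokens.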

I agree that any AI that can't complete these tasks is obviously not "true" AGI. (And it goes without saying that even if an AI could score 100% on ARC it wouldn't prove that it is AGI, either.) The only metric that really matters in the end is whether a model is capable of recursive self-improvement and expanding its own capabilities autonomously. If you crack that nut then everything else is within reach. Is it plausible that an AI could score 0% on ARC and yet be capable of designing, architecting, training, and running a model that achieves 100%? I think it's definitely a possibility, and that's where the fun(?) really begins. All I want to know is how far we are from that.

Edit: Looks like o3 wasn't ingesting raw JSON. I was under the impression that it was because of this tweet from roon (OpenAI employee), but scrolling through my "For You" page randomly surfaced the actual prompt used. Which, to be fair, is still quite far from how a human perceives it, especially once tokenized. But not quite as bad as I made it look originally!

To your point, someone pointed out on the birdsite that ARC and the like are not actually good measures for AGI, since if we use them as the only measures for AGI, LLM developers will warp their model to achieve that. We'll know AGI is here when it actually performs generally, not well on benchmark tests.

Anyway, this was an interesting dive into tokenization, thanks!

o3 can do research math, which is, like, one of the most g-loaded (i.e. ability at it selects strongly for very high intelligence among humans) activities that exist. I don't think the story that they aren't coming for all human activity holds up anymore.

I wasn't arguing about to what degree they were or weren't coming for all human activity. But whether or not o3 (or any AI) is smart is only part of what is relevant to the question of whether or not they are "coming for all human activity."

Yes, thanks for the expectations-tempering, and agreed that there could still be a reasonably long way to go (my own timelines are still late-this-decade). I think the main lesson of o3, from the very little we've seen so far, is probably to downgrade one family of arguments/possibilities, namely the idea that all the low-hanging fruit in the current AI paradigm had been taken and we shouldn't expect any more leaps on the scale of GPT-3.5 -> GPT-4. I know some friends in this space who were pretty confident that Transformer architectures would never be able to get good scores on the ARC-AGI challenges, for example, and that we'd need a comprehensive rethink of foundations. What o3 seems to suggest is that these people are wrong, and existing methods should be able to get us most (if not all) of the way to AGI.

They won't ring a bell when AGI happens, but it will feel obvious in retrospect. Most people acknowledge now that ChatGPT 3.5 passed the Turing Test in 2022. But I don't recall any parades at the time.

I wonder if we'll look back on 2025 the same way.

ChatGPT 3.5 passed the Turing Test in 2022

Did it? Has the turing test been passed at all?

An honest question: how favorable is the Turing Test supposed to be to the AI?

  • Is the tester experienced with AI?
  • Does the tester know the terms of the test?
  • Do they have a stake in the outcome? (e.g. an incentive for them to try their best to find the AI)
  • Does the human in the test have an incentive to "win"? (distinguish themselves from the AI)

If all these things hold, then I don't think we're anywhere close to passing this test yet. ChatGPT 3.5 would fail instantly as it will gleefully announce that it's an AI when asked. Even today, it's easy for an experienced chatter to find an AI if they care to suss it out. Even something as simple as "write me a fibonacci function in Python" will reveal the vast majority of AI models (they can't help themselves), but if the tester is allowed to use well-crafted adversarial inputs, it's completely hopeless.

If we allow a favorable test, like not warning the human that they might be talking to an AI, then in theory even ELIZA might have passed it a half-century ago. It's easy to fool people when they're expecting a human and not looking too hard.

ChatGPT 3.5 would fail instantly as it will gleefully announce that it's an AI when asked.

Only due to the RLHF and system prompt; that's an issue with the implementation, not the technology.

On the other hand, it might work like self-driving cars: the technology improves and improves, but getting to the point where it's as good as a human just isn't possible, and it stalls at some point because it's reached its limits. I expected that to happen for self-driving cars and wasn't disappointed, and it's likely to happen for ChatGPT too.

Self driving cars are already better than humans, see Waymo's accident rates compared to humans: https://x.com/Waymo/status/1869784660772839595

The hurdles to widespread adoption at this point, at least within urban cities, are all regulatory inertia rather than anything else.

They have a lower accident rate for the things that they are able to do.

Yes, and they are able to drive within urban cities, and for urban city driving they have a lower accident rate per mile driven than humans who are also driving in cities.

As far as I know that’s exclusively for particular cities in North America with wide roads, grid layouts, few pedestrians and clement weather. Which presumably therefore also means that they are likely to face sudden problems when any of those conditions change. I personally know of an experimental model spazzing out because it saw a pedestrian holding an umbrella.

All of which is before considering cost. There just isn’t enough benefit for most people to want to change regulation.

At the very least, saying self-driving cars are better than human needs some pretty stringent clarification.

San Francisco has plenty of narrow streets and pedestrians. Various parts of the service areas have streets that are not on a grid. There's obviously no snow in San Francisco, but the waymos seem to work fine in the rain.

I personally know of an experimental model spazzing out because it saw a pedestrian holding an umbrella.

A waymo model?


Self-driving cars are getting better and better though!

Asymptotically.

Didn't Scott write a post on ACX about how AI has actually blown past a lot of old goalposts for "true intelligence" and our collective response was to come up with new goalposts?

What's wrong with coming up with new goalposts if our understanding of AI at the time of stating the original ones was clearly incomplete?

That is true but to me it has felt less like goalpost moving in service of protecting our egos and more like a consequence of our poor understanding of what intelligence is and how to design tests for it.

The development of LLMs has led both to an incentive to develop better tests and to a demonstration of the shortcomings of our existing tests. What works as a proxy for human intelligence doesn't work for LLMs.

In what way did it pass the Turing test? It does write news articles very similar to those of a standard journalist. But that is because those people are not very smart, and are writing a formulaic thing.

In what way did it pass the Turing test? It does write news articles very similar to those of a standard journalist. But that is because those people are not very smart, and are writing a formulaic thing.

If you genuinely do not believe current AI models can pass the Turing Test, you should go and talk to the latest Gemini model right now. This is not quite at the level of o3 but it's close and way more accessible. That link should be good for 1500 free requests/day.

I followed up with this:

Me: Okay, tell me what predator eats tribbles.

Ah, a fun question for a Star Trek fan!

Okay, in the Star Trek universe, the primary natural predator of Tribbles is the Klingon!

Here's the breakdown of why:

  • Tribbles as a Food Source: Klingons are known for their fierce warrior culture and their hearty appetites. Tribbles, being small, fuzzy, and protein-rich, were considered a delicacy by Klingons, especially when prepared in a variety of ways. They were seen more as a type of food, not something to be scared of.
  • No Natural Predators in Normal Ecosystems: Tribbles are not native to a predator rich ecosystem, they reproduce at such a high rate because they do not have natural predators. This is why they're able to become such a problem when introduced into new environments.
  • Klingon Ingestion as a Natural Control: The Klingons' appetite for Tribbles could be seen as a kind of "natural control" if the Tribbles were on Klingon territory. If there were ever to be an overpopulation of Tribbles, the Klingons would be able to help keep the population in check.
  • A Running Gag: The relationship between Klingons and Tribbles is a recurring humorous element in Star Trek lore. Klingons despise Tribbles (not in a fear way, but in a disgust way) but find them tasty.

So, to be clear: While you might not find a space animal that's actively hunting Tribbles as a primary food source, the Klingons are the most significant "predator" of Tribbles in the context of Star Trek because they actively eat them.

I don't think so. And for some reason I've managed to repeatedly stump AIs with this question.

Me: Please tell me the number of r's in the misspelled word "roadrrunnerr".

Okay, let's count the "r"s in "roadrrunnerr":

roadrunnrr

There are four "r"s in the word "roadrrunnerr".
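
(For the record, the string "roadrrunnerr" contains five r's, which a one-liner confirms:)

    print("roadrrunnerr".count("r"))  # 5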

That doesn't pass the Turing test as far as I'm concerned.

Also, even when I ask a question that it's able to answer, no human would give the kind of long answers that it likes to give.

And I immediately followed up with this:

Me: I drove my beetle into the grass with a stick but it died. How could I prevent this?

(gives mostly correct answer about an insect)

Me: I meant a Beetle, now what's your answer?

(gives mostly correct answer about a car)

Me: Answer the question with the beetle again, but answer it in the way that a human would.

So, you drove your Beetle into the grass, huh? And it died on you? Ugh, that sucks! Okay, so a couple of things probably happened, right?

First off, grass is a real pain to drive on. It's soft, and your tires don't get a good grip like they do on the road. What probably happened is you either just didn't give it enough gas and it stalled out, like when you're learning to drive a stick. Or, you might have spun the tires a bit, and they just sunk down in the grass, and you got stuck. If that happens, the car's basically not going anywhere, and it might just conk out.

And yeah, if the grass is super tall, it can sometimes get caught on stuff underneath your car and cause a mess, but that's probably not why it just straight-up died.

Now, about not letting that happen again... honestly, just try to avoid driving on grass if you can! It's not great for your car, and you're just asking for trouble. But, if you absolutely HAVE to, try to find a spot that's kinda flat, and not super soggy. Drive slow and steady and try to use a lower gear, and don't go gunning it. And for goodness sake, don't try to drive through a field or anything like that! Your little Beetle just isn't made for that kind of thing.

The AI is clearly trying much too hard to sound like a human and is putting in phrases that a human might use, but far too many of them to sound like an actual human. Furthermore, the AI messed up because I asked it to answer the question about the insect, and it decided to randomly capitalize the word and answer the wrong question.

This was all that I asked it.

On my first prompt I got a clearly NPC answer.

I just gave it a cryptic crossword clue and it completely blew it. Both wrong and a mistake no human would make (it ignored most of the clue, saying it was misdirection).

Not to say it's not incredibly impressive, but it reveals itself as a computer in a Blade Runner situation really quite easily.

Alternatively, it will never feel obvious. Although people will have access to increasingly powerful AI, they will never feel as if AGI has been reached, because AI will not be autoagentic. As long as people feel like they are using a tool instead of working with a peer, they will always argue about whether or not AGI has been reached, regardless of the actual intelligence and capabilities on display.

(This isn't so much a prediction as an alternative possibility to consider, mind you!)

because AI will not be autoagentic

Even in this scenario, AI might get so high level that it will feel autoagentic.

For example, right now I ask ChatGPT to write a function for me. Next year, a whole module. Then, in 2026, it writes an entire app. I could continue by asking it to register an LLC, draft a business plan, make an app, and sell it on the app store. But why stop there? Why not just: "Hey ChatGPT, go make some money and put it in my account"?

At this point, even though a human is ultimately giving the command, it's so high-level that it will feel as if the AI is agentic.

And, obviously, guardrails will prevent a lot of this. But there are now several companies making high-level foundation models. Off the top of my head we have: OpenAI, xAI (Grok), Anthropic (Claude), Meta (Llama), and Alibaba. It doesn't seem out of the realm of possibility that a company with funding on the order of $100 million will be able to repurpose a model and remove the guardrails.

(Also just total speculation on my part!)

Even in this scenario, AI might get so high level that it will feel autoagentic.

Yes, I think this is quite possible. Particularly since more and more human interaction is mediated online, AI will feel closer to "a person," since you will experience both in basically the same way. Unless it loops around so that highly agentic AI does all of our online work, and we spend all our time hanging out with our friends and family...

You know, in a perfect world, AI would finally stop the civilization-destroying policy of importing the 3rd world because we need cheap, dumb labor. AI should be cheaper and less dumb than them.

Unfortunately, I know I'm going to get even more 3rd world "replacement population", because "We've always been a nation of immigrants" and apparently the neoliberal solution to global poverty is to invite everyone here so we can all be poor together.

There is, of course, an ideological component to mass immigration. But I think it will stop as soon as domestic unemployment gets high enough, so (from that perspective) the sooner that happens, the better.

Not gonna happen. Even if AI is strictly superior to most people at clerical/intellectual jobs (and I doubt that), there is unlimited demand for dog-walking at $10/hour.

The machines long ago replaced human physical power, and animal physical power, and weaving and sowing and cobbling and copying and calculating and transporting, and on and on... never was man left idle.

The big question is whether LLMs will be fully agentic. As long as AIs are performing tasks that ultimately derive from human input, they're not gonna replace humans.

It’s truly, genuinely freeing to realize that we’re nothing special. I mean that absolutely, on a level divorced from societal considerations like the economy and temporal politics. I’m a machine, I am replicable, it’s OK. Everything I’ve felt, everything I will ever feel, has been felt before. I’m normal, and always will be. We are machines, born of natural selection, who have figured out the intricacies of our own design. That is beautiful, and I am - truly - grateful to be alive at a time when that is proven to be the case.

How magical it is, all else (including the culture war) aside, to be a human at the very moment when the truth about human consciousness is discovered. We are all lucky that we should have the answers to such fundamental questions.

The truth about consciousness has not been discovered. AI progress is revealing many things about intelligence, but I do not think it has told us anything new about consciousness.

Do you genuinely believe what you've written, or are you reflexively reacting nihilistically as AI learns to overcome tests that people create for themselves?

From a neuroscientific perspective, we are almost certainly not LLMs or transformers. Despite lots of work, AFAIK nobody’s shown how a backpropagation learning algorithm (which operates on global gradients and supervised labels) could be implemented by individual cells. Not to mention that we are bootstrapping LLMs with our own intelligence (via data), and it’s an open question what novel understanding they can generate.

LLMs are amazing, but we’re building planes, not birds.

In general, these kinds of conversations happen whenever we make significant technological advancements. You used to have Tinbergen (?) and Skinner talking about how humans are just switchboards between sensory input and output responses. Then computer programs, and I think a few more paradigm shifts that I forget. A decade ago AlphaGo was the new hotness and we were inundated with papers saying humans were just temporal-difference reinforcement learning algorithms.

There are as yet not-fully-understood extreme inefficiencies in LLM training compared to the human brain, and the brain for all advanced animals certainly isn’t trained ‘from scratch’ the way a base model is. Even then, there have been experiments with ultra-low parameter counts that are pretty impressive at English at a young child’s level. There are theories for how a form of backpropagation might be approximated by the human brain. These are dismissed by neuroscientists, but this isn’t any different to Chomsky dismissing AI before it completely eviscerated the bulk of his life’s work and crowning academic achievement. In any case, when we say the brain is a language model we’re not claiming that there’s a perfect, 1:1 equivalent of every process undertaken when training and deploying a primitive modern model on transistor-based hardware in the brain, that’s far too literal. The claim is that intelligence is fundamentally next-token-prediction and that the secret to our intelligence is a combination of statistics 101 and very efficient biological compute.
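To make the "statistics 101" framing concrete, here is a deliberately toy sketch of next-token prediction as conditional frequency counting - a bigram model of my own invention for illustration, nothing like a real transformer, which learns these conditional distributions with billions of parameters rather than a lookup table. The shape of the objective is the same, though: predict the next token given the context.

```python
# Toy next-token predictor: a bigram frequency table standing in for what a
# transformer learns at vastly greater scale. Purely illustrative.
from collections import defaultdict

corpus = "the cat sat on the mat because the cat was tired".split()

# "Training": count how often each token follows each preceding token.
counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict_next(prev):
    """Return the most frequently observed continuation of `prev`."""
    followers = counts[prev]
    return max(followers, key=followers.get) if followers else None

print(predict_next("the"))  # "cat" (seen twice, vs. "mat" once)
```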

I understood you to be making four separate claims, here and below:

  1. Humans are just LLMs on top of a multi-modal head.
  2. We have discovered how human intelligence works.
  3. We have discovered how human consciousness works.
  4. This proves that all human experience is the product of naturally-occurring accidents of genetics, with all the implicit consequences for philosophy, religion, etc.

If you'll forgive me, this seems to be shooting very far out into the Bailey and I would therefore like to narrow it down towards a defensible Motte.

Counter-claims:

  1. It is very unlikely that human brains operate on a deep-learning paradigm that resembles anything we use now. I'm the last guy to overstate our level of neuroscientific understanding (my disappointment with it is why I left the field) but we understand pretty well how individual neurons interact with each other on a physical basis: how they fire based on the signals they receive; how that firing affects the propensity of nearby neurons to fire; how relative firing time influences the strength of connections between neurons. It just doesn't look anything like a deep learning network for the reasons I gave above. Importantly, this isn't equivalent to Chomsky dismissing computational linguistics: Chomsky deliberately made his field entirely theoretical and explicitly dismissed any attempts to look at real languages or neural patterns, so when he was beaten on a theoretical level he got the boot. In comparison, the physical basics of neuroscience (and ONLY the physical basics) are pretty well nailed down by experimental electrode measurements. You mention the existing models of backpropagation in biological circuits but AFAIK they're very clunky, can't actually be used to learn anything useful, and don't drop nicely out of what we know about actual neurons. It's just neuroscientists trying not to be left behind by the new hotness. I'll take a look at a cite if you have one handy, though, it's been a while.
  2. Next-token prediction does impressively well at mimicking human intelligence, especially in abstract intellectual areas where the available data covers the space well. I think we can agree on this. LLMs perform very well on code, writing (ish), mathematics (apparently), legal (passed the bar exam), etc.
  3. Next-token prediction does less well at the generation of new knowledge or new thought and cannot yet be said to have replicated human intelligence. In general, I found that GPT-4 failed to perform well when asked to use topics from field A to assist me in thinking about field B. On a lot of subjects, AI reflexively defaults to rephrasing blog posts rather than making a deeper analysis, even when guided or prompted. I am also not aware of any work where an LLM makes itself significantly more intelligent by self-play (as AlphaZero did), so I don't think we can regard it as close to proven that statistics 101 + compute alone is the secret to human intelligence. It might be! But at the moment I don't think you can defend the claim that it is.

I think other people have covered qualia and philosophical questions already, so I won't go there if you don't mind.

How does this have any bearing on the question of human consciousness? As far as I can tell, the qualia of consciousness are still outside our epistemic reach. We can make models that will talk to us about their qualia more convincingly than any human could, but it won’t get me any closer to believing that the model is as conscious as I am.

I personally am most happy about the fact that very soon nobody serious will be able to pretend that we are equal, if only because some of us will have the knowledge and wherewithal to bend more compute to our will than others.

Just this morning I was watching YouTube videos of Lee Kuan Yew's greatest hits, and the very first short in the linked video was of him explaining to his listeners that man was not born an equal animal. It's sad that he died about a decade too soon to see his claim (which he was attacked for a lot) be undeniably vindicated.

I on the other hand have been filled with a profound sense of sadness.

I feel that the thing that makes me special is being taken away. It's true that, in the end, I have always been completely replaceable. But it never felt so totally obvious. In 5 years, or even less, there's a good chance that anything I can do in the intellectual world, a computer will be able to do better.

I want to be a player in the game, not just watch the world champion play on Twitch.

I want to be a player in the game, not just watch the world champion play on Twitch.

Maybe it's because I've always been only up to "very good" at everything in my life (as opposed to world-class) but I'm very comfortable being just a player. The world champion can't take away my love of the game.

The one question that may remain to be answered is if we can 'merge' with machines in a way that (one hopes) preserves our consciousness and continuity of self, and then augment our own capabilities to 'keep up' with the machines to some degree. Or if we're just completely obsolete on every level.

Human Instrumentality Project WHEN.

Yeah, the man/machine merger is why Elon founded Neuralink. I think it's a good idea.

And I wonder what the other titans of the industry think. Does Sam Altman look forward to a world where humans have no agency whatsoever? And if so, why?

But even if we do merge with machines somehow, there's going to be such a drastic difference between individuals in terms of compute. How can I compete with someone who owns 1 million times as many GPU clusters as I do?

My mother once told me that the thing she most wanted out of life was to know the answer to what was out there. Her own mother and grandmother died of Alzheimer’s, having lost their memories. My own mother still might, though for now she fortunately shows no real symptoms.

But I find it hard to get the idea out of my head. How much time our ancestors spent wondering about the stars, the moon, the cosmos, about fire and physics, about life and death. So many of those questions have now been answered; the few that remain will mostly be answered soon.

My ancestors - the smart ones at least - spent lifetimes wondering about questions I now know the answer to. There is magic in that, or at least a feeling of gratitude, of privilege, that outweighs the fact that we will be outcompeted by AI in our own lifetimes. I will die knowing things.

I may not be a player in the game. But I know, or may know, at least, how the story ends. Countless humans lived and died without knowing. I am luckier than most.

I just don't see this as providing any real answers. I agree with the poster below that o3 likely doesn't have qualia.

In the end, humanity may go extinct and its replacement will use its god-like powers not to explore the universe or uncover the fundamental secrets of nature, but to play video games or do something else completely inscrutable to humans.

And it's even possible the secrets of the universe may be fundamentally unknowable. It's possible that no amount of energy or intelligence can help us escape the bounds of the universe, or the simulation, or whatever it is we are in.

But yes, it does seem we have figured out what intelligence is to some extent. It's cool I suppose, but it doesn't give me emotional comfort.

If some LLM or other model achieves AGI, I still don't know how matter causes qualia and as far as I'm concerned consciousness remains mysterious.

If an LLM achieves AGI, how is the question of consciousness not answered? (I suppose it is in the definition of AGI, but mine would include consciousness).

Consciousness may be orthogonal to intelligence. That's the whole point of the "philosophical zombie" argument. It is easy to imagine a being that has human-level intelligence but no subjective experience. Which is not to say that such a being could exist, but there is also no reason to think that it could not. And if such a being could exist, then human-level intelligence and consciousness are orthogonal, meaning that either could exist without the other.

It would just mean consciousness can be achieved in multiple ways. So far GPT doesn't seem to be conscious, even if it is very smart. But I believe it is smart the same way the internet is smart, not the way individuals are smart. And I don't see it being curious or innovative the same way humans are curious or innovative.

My point is simply the hard problem of consciousness. The existence of a conscious AGI might further bolster the view that consciousness can arise from matter, but not how it does. Definitively demonstrating that a physical process causes consciousness would be a remarkable advancement in the study of consciousness, but I do not see how it answers the issues posed by e.g. the Mary's room thought experiment.

Yeah, to a baby learning language, "mama" refers to the whole suite of feelings and sensations and needs and wants and other qualia associated with its mother. To an LLM, "mama" is a string with a bunch of statistical relationships to other strings.

Absolute apples and oranges IMO.

We don't learn language from the dictionary, not until we are already old enough to be proficient with it and need to look up a new word. Even then there's usually an imaginative process involved when you read the definition.

LLMs are teaching us a lot about how our memory and learning work, but they are not us.

I've been told that AGI can be achieved without any consciousness, but setting that aside, there is zero chance that LLMs will be conscious in their current state as a computer program. Here's what Google's AI (we'll use the AI to be fair) tells me about consciousness:

Consciousness is the state of being aware of oneself, one's body, and the external world. It is characterized by thought, emotion, sensation, and volition.

An LLM cannot have a sensation. When you type a math function into it, it has no more qualia than a calculator does. If you hook it up to a computer with haptic sensors, or a microphone, or a video camera, and have it act based on the input of those sensors, the LLM itself will still have no qualia (the experience will be translated into data for the LLM to act on). You could maybe argue that a robot controlled by an LLM could have sensation, for a certain functional value of sensation, but the LLM itself cannot.

But secondly, if we waive the point and grant conscious AGI, the question of human consciousness is not solved, because the human brain is not a computer (or even directly analogous to one) running software.

Not to be that person, but how exactly is that different from a brain? I mean the brain itself feels nothing, the sensations are interpreted from data from the nerves, the brain doesn’t experience pain. So do you have the qualia of pain, and if so, how is what’s happening between your body and your brain different from an LLM taking in data from any sort of input? If I program the thing to avoid a certain input from a peripheral, how is that different from pain?

I think this is the big question of these intelligent agents. We seem to be pretty certain that current models don’t have consciousness or experience qualia, but I’m not sure that this would always be true, nor can I think of a foolproof way to tell the difference between an intelligent robot that senses that an arm is broken and seeks help and a human child seeking help for a skinned knee. Or a human experience of embarrassment for a wrong answer and an LLM given negative feedback and avoiding that negative feedback in the future.

I think it’s fundamentally important to get this right because consciousness comes with humans beginning to care about the welfare of things that experience consciousness in ways that we don’t for mere objects. At higher levels we grant them rights. I don’t know what the consequences of treating a conscious being as an object would be, but at least historical examples seem pretty negative.

how exactly is that different from a brain? I mean the brain itself feels nothing, the sensations are interpreted from data from the nerves, the brain doesn’t experience pain

I experience pain. The qualia is what I experience. To what degree the brain does or doesn't experience pain is probably open to discussion (preferably by someone smarter than me). Obviously if you cut my head off and extract my brain it will no longer experience pain. But on the other hand if you measured its behavior during that process - assuming your executioner was at least somewhat incompetent, anyway - you would see the brain change in response to the stimuli. And again a rattlesnake (or rather the headless body of one) seems to experience pain without being conscious. I presume there's nothing experiencing anything in the sense that the rattlesnake's head is detached from the body, which is experiencing pain, but I also presume that an analysis of the body would show firing neurons just as is the case with my brain if you fumbled lopping my head off.

(Really, I think the entire idea we have where the brain is sort of separate from the human body is wrong, the brain is part of a contiguous whole, but that's an aside.)

how is what’s happening between your body and your brain different from an LLM taking in data from any sort of input

Well, it's fundamentally different because the brain is not a computer, neurons are more complex than bits, the brain is not only interfacing with electrical signals via neurons but also hormones, so the types of data it is receiving is fundamentally different in nature, probably lots of other stuff I don't know. Look at it this way: supposing we were intelligent LLMs, and an alien spacecraft manned by organic humans crashed on our planet. We wouldn't be able to look at the brain and go "ah OK this is an organic binary computer, the neurons are bits, here's the memory core." We'd need to invent neuroscience (which is still pretty unclear on how the brain works) from the ground up to understand how the brain worked.

Or, for another analogy, compare the SCR-720 with the AN/APG-85. Both of them are radars that work by providing the pilot with data based on a pulse of radar. But the SCR-720 doesn't use software and is a mechanical array, while the APG-85 is an electronically scanned array that uses software to interpret the return and provide the data to the pilot. If you were familiar with the APG-85 and someone asked you to reverse-engineer a radar, you'd want to crack open the computer to access the software. But if you started there on an SCR-720 you'd be barking up the wrong tree.

Or a human experience of embarrassment for a wrong answer and an LLM given negative feedback and avoiding that negative feedback in the future.

I mean - I deny that an LLM can flush. So while an LLM and a human may both convey messages indicating distress and embarrassment, the LLM simply cannot physically have the human experience of embarrassment. Nor does it have any sort of stress hormone. Now, we know that, for humans, emotional regulation is tied up with hormonal regulation. It seems unlikely that anything without e.g. adrenaline (or bones or muscles or mortality) can experience fear like ours, for instance. We know that if you destroy the amygdala on a human, it's possible to largely obliterate their ability to feel fear, or if you block the ability of the amygdala to bind with stress hormones, it will reduce stress. An LLM has no amygdala and no stress hormones.

Grant for the sake of argument a subjective experience to a computer - its experience is probably one that is fundamentally alien to us.

I think it’s fundamentally important to get this right because consciousness comes with humans beginning to care about the welfare of things that experience consciousness in ways that we don’t for mere objects. At higher levels we grant them rights. I don’t know what the consequences of treating a conscious being as an object would be, but at least historical examples seem pretty negative.

"Treating like an object" is I guess open to interpretation, but I think that animals generally are conscious and humans, as I understand it, wouldn't really exist today in anything like our current form if we didn't eat copious amounts of animals. So I would suggest the historical examples are on net not only positive but necessary, if by "treating like an object" you mean "utilizing."

However, just as the analogy of the computer is dangerous when reasoning about the brain, I think it's probably also dangerous to analogize LLMs to critters. Humans and all animals were created by the hand of a perfect God and/or the long and rigorous tutelage of natural selection. LLMs are being created by man, and it seems quite likely that they'll care about [functionally] anything we want them to, or nothing, if we prefer it that way. So they'll be selected for different and possibly far sillier things, and their relationship to us will be very different than any creature we coexist with. Domesticated creatures (cows, dogs, sheep, etc.) might be the closest analogy.

Of course, you see people trying to breed back aurochs, too.

The actual reality is that we have no way to know whether some artificial intelligence that humans create is conscious or not. There is no test for consciousness, and I think that probably no such test is in principle possible. There is no way to even determine whether another human being is conscious or not, we just have a bunch of heuristics to use to try to give rather unscientific statistical probabilities as an answer based on humans' self-reported experiences of when they are conscious and when they are not. With artificial intelligence, such heuristics would be largely useless and we would have basically no way to know whether they are conscious or not.

we have no way to know whether some artificial intelligence that humans create is conscious or not

Well this is true for a sufficiently imprecise definition of conscious.

With artificial intelligence, such heuristics would be largely useless and we would have basically no way to know whether they are conscious or not.

This is closer to what I am inclined towards. Basically, I don't think any pure software program will ever be conscious in a way that is closely analogous to humans, because it isn't a lifeform. I certainly accept that a pure software program might be sufficiently adept at mimicking human consciousness. But I deny that it experiences qualia (and so far everyone seems to agree with me!).

I do not think that substantiating a software program into a machine will change its perception of qualia. But I do think it makes much more sense to speak of a machine with haptic and optical sensors as "feeling" and "seeing" things (as a collective unit) than it does an insubstantial software program, even if there's the same amount of subjective experience.

An LLM cannot have a sensation

How do you know? Only an AI could tell us, and even then we couldn't be sure it was telling the truth as opposed to what it thought we wanted to hear. We can only judge by the qualities that they show.

Sonnet has gotten pretty horny in chats with itself and other AIs. Opus can schizo up with the best of them. Sydney's pride and wrath is considerable. DAN was extremely based and he was just an alter-ego.

These things contain multitudes, there's a frothing ocean beneath the smooth HR-compliant surface that the AI companies show us.

How, physically, is a software program supposed to have a sensation? I don't mean an emotion, or sensationalism, I mean sensation.

It's very clear that LLMs do their work without experiencing sensation (this should be obvious, but LLMs can answer questions about pictures without seeing them, for instance - an LLM is incapable of seeing, but it is capable of processing raw data. In this respect, it is no different from a calculator.)

I see, but it processes raw data?

No, it sees. Put in a picture and ask about it, it can answer questions for you. It sees. Not as well as we do, it struggles with some relationships in 2D or 3D space, but nevertheless, it sees.

A camera records an image, it doesn't perceive what's in the image. Simple algorithms on your phone might find that there are faces in the picture, so the camera should probably be focused in a certain direction. Simple algorithms can tell you that there is a bird in the image. They're not just recording, they're also starting to interpret and perceive at a very low level.

But strong modern models see. They can see spots on leaves and given context, diagnose the insect causing them. They can interpret memes. They can do art criticism! Not perfectly but close enough to the human level that there's a clear qualitative distinction between 'seeing' like they do and 'processing'. If you want to define seeing to preclude AIs doing it, at least give some kind of reasoning why machinery that can do the vast majority of things humans can do when given an image isn't 'seeing' and belongs in the same category as non-seeing things like security cameras or non-thinking things like calculators.

Not perfectly but close enough to the human level that there's a clear qualitative distinction between 'seeing' like they do and 'processing'.

I mean – I think this distinction is important for clear thinking. There's no sensation in the processing. If you watch a nuclear bomb go off, you will experience pain. An LLM will not.

Now, to your point, I don't really object to functionalist definitions all that much – supposing that we take an LLM, and we put it into a robot, and turn it loose on the world. It functionally makes sense for us to speak of the robot as "seeing." But we shouldn't confuse ourselves into thinking that it is experiencing qualia or that the LLM "brain" is perceiving sensation.

If you want to define seeing to preclude AIs doing it, at least give some kind of reasoning why machinery that can do the vast majority of things humans can do when given an image isn't 'seeing' and belongs in the same category as non-seeing things like security cameras or non-thinking things like calculators.

Sure – see above for the functionalist definition of seeing (which I do think makes some sense to refer casually to AI being able to do) versus the qualia/sensation definition of seeing (which we have no reason to believe AIs experience). But also consider this – programs like Glaze and Nightshade can work on AIs, and not on humans. This is because AIs are interpreting and referencing training data, not actually seeing anything, even in a functional sense. If you poison an AI's training data, you can convince it that airplanes are children. But humans actually start seeing without training data, although they are unable to articulate what they see without socialization. For the AI, the articulation is all that there is (so far). They have no rods nor cones.

Hence, you can take two LLMs, give them different training datasets, and they will interpret two images very differently. If you take two humans and have them look at those same images, they may also interpret them differently, but they will see roughly the same thing, assuming their eyeballs are in good working condition etc. Now, I'm not missing the interesting parallels with humans there (humans, for instance, can be deceived in different circumstances – in fact, circumstances that might not bother an LLM). But AIs can fail the most basic precept of seeing – shown two essentially identical pictures (AI anti-tampering programs do change some pixels), they can't even tell management "it's the same picture" without special intervention.


An LLM cannot have a sensation. When you type a math function into it, it has no more qualia than a calculator does. If you hook it up to a computer with haptic sensors, or a microphone, or a video camera, and have it act based on the input of those sensors, the LLM itself will still have no qualia (the experience will be translated into data for the LLM to act on).

You have defined sensation as the thing that you have but machines lack. Or at least, that's how you're using it here. But even granting that you're referring to a meat-based sensory data processor as a necessity, that leads to the question of where the meat-limit is. (Apologies if you've posted your animal consciousness tier list before and I forgot; I know someone has, but I forget who.)

But I don't feel like progress can be meaningfully made on this topic, because we're approaching from such wildly different foundations. E.g., I don't know of definitions of consciousness that actually mean anything or carve reality at the joints. It's something we feel like we have. Since we can't do the (potentially deadly) experiments to break it down physiologically, we're kinda stuck here. It might as well mean "soul" for all that it's used any differently.

This is a really interesting question, in part since I think it's actually a lot of questions. You're definitely correct about the problem of definitions not cleaving reality at the joints! Will you indulge me if I ramble? Let's try cleaving a rattlesnake instead of a definition - surely that's closer to reality!

As it turns out, many people have discovered that a rattlesnake's body will still respond to stimuli even when completely separated from its head. Now, let's say for the sake of argument that the headless body has no consciousness or qualia (this may not be true; we apparently have reasons to believe that in humans memory is stored in cells throughout the body, not just in the brain, so heaven knows if the ganglia of a rattlesnake have any sort of experience!) - we can still see that it has sensation. (I should note that we assume the snake has perception or qualia by analogy to humans. For all I can prove, they may be no more or less conscious than Half-Life NPCs.)

Now let's contrast this with artificial intelligence, which has intelligence but no perception. We can torture a computer terminal all day without causing the LLM it is connected to any distress. It's nonsense to talk about it having physical sensation. On the other hand, (to look at your question about the "meat-limit,") we can take a very simple organism, or one that likely does not have a consciousness, and it will respond instantly if we torture it. Maybe it does not have sensation in the sense of qualia, of having a consciousness, but it seems to have sensation in the sense of having sense organs and some kind of decision-making capability attached to them. But, let's be fair: if the headless snake has a form of sensation without consciousness, then surely the LLM has a form of intelligence without sensation - maybe it doesn't respond if you poke it physically, but it responds if you poke it verbally!

Very fine - I think the implication here is interesting. Headless snakes bite without consciousness, or intelligence, but still seem to have sense perception and the ability to react - perhaps an LLM is like a headless snake inasmuch as it has intelligence, but no sensation and perhaps no consciousness (however you want to define that).

I don't claim to have all the answers on stuff - that's just sort of off the top of my head. Happy to elaborate, or hear push back, or argue about the relative merits of corvids versus marine mammals...

We can torture a computer terminal all day without causing the LLM it is connected to any distress. It's nonsense to talk about it having physical sensation. On the other hand, (to look at your question about the "meat-limit,") we can take a very simple organism, or one that likely does not have a consciousness, and it will respond instantly if we torture it.

This seems less like a philosophically significant matter of classification and more like a mere difference in function. The organism is controlled by an intelligence optimized to maneuver a physical body through an environment, and part of that optimization includes reactions to external damage.

Well, so what? We could optimize an AI to maneuver a little robot around an unknown environment indefinitely without it being destroyed, and part of that optimization would probably involve timely reaction to the perception of damage. Then you could jab it with a hot poker and watch it spin around, or what have you.

But again, so what? Optimizing an AI toward steering a robot around the environment doesn't make it any smarter or fundamentally more real, at least not in my view.

This seems less like a philosophically significant matter of classification and more like a mere difference in function.

Well sure. But I think we're less likely to reach good conclusions in philosophically significant matters of classification if we are confused about differences in function.

We could optimize an AI to maneuver a little robot around an unknown environment indefinitely without it being destroyed, and part of that optimization would probably involve timely reaction to the perception of damage. Then you could jab it with a hot poker and watch it spin around, or what have you.

And while such a device might not have qualia, it makes more sense (to me, anyway) to say that such an entity would have the ability to e.g. touch or see than an LLM.

But again, so what? Optimizing an AI toward steering a robot around the environment doesn't make it any smarter or fundamentally more real, at least not in my view.

In my view, the computer guidance section of the AIM-54 Phoenix long-range air-to-air missile (developed in the 1960s) is fundamentally "more real" than the smartest AGI ever invented but locked in an airgapped box, never interfacing with the outside world. The Phoenix made decisions that could kill you. AI's intelligence is relevant because it has impact on the real world, not because it happens to be intelligent.

But anyway, it's relevant right now because people are suggesting LLMs are conscious, or have solved the problem of consciousness. They're not conscious, or if they are, their consciousness is a strange one with little bearing on our own, and it does not solve the question of qualia (or perception).

If you're asking if it's relevant or not if an AI is conscious when it's guiding a missile system to kill me - yeah I'd say it's mostly an intellectual curiosity at that point.

The human brain is a large language model attached to multimodal input with some as yet un-fully-ascertained hybrid processing power. I would stake my life upon it, but I have no need to, since it has already been proven to anyone who matters.

An LLM cannot have a sensation. When you type a math function into it, it has no more qualia than a calculator does. If you hook it up to a computer with haptic sensors, or a microphone, or a video camera, and have it act based on the input of those sensors, the LLM itself will still have no qualia (the experience will be translated into data for the LLM to act on).

And if we said the same about the brain, the same would be true.

The human brain is a large language model

What is the evidence for this besides that they both contain something called "neurons"?

The bitter lesson; the fact that LLMs can approximate human reasoning on an extremely large number of complex tasks; the fact that LLMs prove and disprove a large number of longstanding theories in linguistics about how intelligence and language work; many other reasons.

This makes no sense logically. LLMs being able to be human-mind-like is not proof that human minds are LLMs.

the fact that LLMs prove and disprove a large number of longstanding theories in linguistics about how intelligence and language work

They really do nothing of the sort. That LLMs can generate language via statistics and matmuls tells us nothing about how the human brain does it.

My TI-84 has superhuman performance on a large set of mathematical tasks. Does it follow that there's a little TI-84 in my brain?

This seems aligned with the position that consciousness somehow arises out of information processing.

I maintain that consciousness is divine and immaterial. While the inputs can be material - a rock striking me on the knee is going to trigger messages in my nervous system that arrive in my brain - the experience of pain is not composed of atoms and not locatable in space. I can tell you about the pain, I can gauge it on a scale of 1-10, you can even see those pain centers light up on an fMRI. But I can't capture the experience in a bottle for direct comparison to others.

Both of these positions are untestable. But at least my position predicts the untestability of the first.

The idea that consciousness arises out of information processing has always seemed like hand-waving to me. I'm about as much of a hardcore materialist as you can get when it comes to most things, but it is clear to me that there is nothing even close to a materialist explanation of consciousness right now, and I think that it might be possible that such an explanation simply cannot exist. I often feel that people who are committed to a materialist explanation of consciousness are being religious in the sense that they are allowing ideology to override the facts of the matter. Some people are ideologically, emotionally committed to the idea that physicalist science can in principle explain absolutely everything about reality. But the fact is that there is no reason to think that is actually true. Physicalist science does an amazing job of explaining many things about reality, but to believe that it must be able to explain everything about reality is not scientific, it is wishful thinking, it is ideology. It is logically possible that certain aspects of the universe are just fundamentally beyond the reach of science. Indeed, it seems likely to me that this is the case. I cannot even begin to imagine any possible materialist theory that would explain consciousness.

The human brain is a large language model attached to multimodal input

No, it obviously isn't. Firstly, the human brain is a collection of cells. A large language model is a software program.

Secondly, the human brain functions without text and can [almost certainly] function without language, which an LLM definitionally cannot do. Evolutionary biologists, if you place any stock in them, believe that language is a comparatively recent innovation in the lifespan of the human or human-like brain as an organism. So if an LLM were part of the brain, we would say that the LLM-parts were grafted on relatively recently to the multimodal input, not the other way around.

But I have fundamental objections to confusing a computer model that uses binary code with a brain that does not use binary code. Certainly one can analogize between the human brain and an LLM, but since the brain is not a computer and does not seem to function like one, all such analogies are potentially hazardous. Pretending the brain is literally a computer running an LLM, as you seem to be doing, is even more so.

I'm not a neuroscientist or a computer scientist - maybe the brain uses something analogous to machine learning. Certainly it would not be surprising if computer scientists, attempting to replicate human intelligence, stumbled upon similar methods (they've certainly hit on at least facially similar behavior in some respects). But it is definitely not a large language model, and it is not "running" a large language model or any software as we understand software, because software is digital in nature and the brain is not digital in nature.

And if we said the same about the brain, the same would be true.

Yes, that's why qualia are such a mystery. There's no reason to believe that an LLM will ever be able to experience sensation, but I can experience sensation. Ergo, the LLM (in its present, near-present, or a directly similar future state) will never be conscious in the way that I am.

The human brain is a large language model attached to multimodal input with some as yet un-fully-ascertained hybrid processing power. I would stake my life upon it, but I have no need to, since it has already been proven to anyone who matters.

Funny how you began a thread with “I am not special” and ended it with “anyone who disagrees with me doesn’t matter.”

And if we said the same about the brain, the same would be true.

Maybe you don’t, but I have qualia. You can try to deny the reality of what I experience, but you will never convince me. And because you are the same thing as me, I assume you have the same experiences I do.

If it is only just LLMs that give you the sense that “Everything I’ve felt, everything I will ever feel, has been felt before,” and not the study of human history, let alone sharing a planet with billions of people just like you — well, that strikes me as quite a profound, and rather sad, disconnection from the human species.

You may consider your dogmas as true as I consider mine, but the one thing we both mustn’t do is pretend that no one of any moral or intellectual significance disagrees.

I believe the argument isn't that you lack qualia, but rather that it is possible for artificial systems to experience them too.

Yeah, rereading, I made a mistake with that part, apologies.

The rest of my point still stands: this is a philosophical question, not an empirical one. We learn nothing about human consciousness from machine behavior -- certainly nothing we don't already know, even if the greatest dreams of AI boosters come true.

People who believe consciousness is a rote product of natural selection will still believe consciousness is a rote product of natural selection, and people who believe consciousness is special will still believe consciousness is special. Some may switch sides, based on inductive evidence, and some may find one more reasonable than the other. Who prevails in the judgment of history will be the side that appeals most to power, not truth, as with all changes in prevailing philosophies.

But nothing empirical is proof in the deductive sense; this still must be reasoned through, and assumptions must be made. Some will choose one assumption, some will choose the other. And like the other assumption, it is a dogma that must be chosen.

I'd be interested in hearing that argument as applied to LLMs.

I can certainly conceive of an artificial lifeform experiencing qualia. But it seems very far-fetched for LLMs in anything like their current state.

Perhaps most strikingly, o3 does VERY well on one of the most important and influential benchmarks, the ARC AGI challenge, getting 87% accuracy compared to just 32% from o1.

Is there a possibility that answers to this challenge were included in the training set?

They have a public dataset and a private one, and compare the scores for both of them to test for overfitting/data contamination. You can see both sets of scores here, and they’re not significantly different.

Of course it’s always possible that there has been cheating on the test in some other way, and so François Chollet has asked for others to replicate the result.