
SnapDragon

0 followers   follows 0 users   joined 2022 October 10 20:44:11 UTC

No bio...

User ID: 1550

Verified Email

I'm inclined towards your skeptical take - I think we as humans always fantasize that there are powerful people/beings out there who want to spend resources hurting us, when the real truth is that they simply don't care about us. Sure, the denizens of the future with access to your brainscan could simulate your mind for a billion subjective years without your consent. But why would they?

The problem is that there's always a risk that you're wrong, that there is some reason or motive in post-singularity society for people to irreversibly propagate your brainscan without your consent. And then you're at the mercy of Deep Time - you'd better hope that no being that will ever exist will enjoy, uh, "playing" with your mind. (From this perspective, you won't even have the benefit of anonymity - as one of the earliest existing minds, it's easy to imagine some beings would find you "interesting".)

Maybe the risk is low, because this is the real world we're dealing with and it's never as good or bad as our imaginations can conjure. But you're talking about taking a (small, you argue) gamble with an almost unlimited downside. Imagine you had a nice comfortable house that just happened to be 100m away from a hellmouth. It's inactive, and there are guard rails, so it's hard to imagine you'd ever fall in. But unlikely things sometimes happen, and if you ever did, you would infinitely regret it forever. I don't think I'd want to live in that house! I'd probably move...

Listen, I did not intentionally trap those Sims in their living room. The placement of the stove was an innocent mistake. That fire could have happened anywhere! A terrible tragedy.

You know, sometimes pools just accidentally lose their exit. Common engineering mishap. My sincere condolences to those affected.

There's also the concern of what kind of suffering a post-singularity society can theoretically enable; it might go far, far beyond what anyone on Earth has experienced so far (in the same way that a rocket flying to the moon goes farther than a human jumping). Is a Universe where 99.999% of beings live sublime experiences but the other 0.001% end up in Ultrahell one that morally should exist?

Remember, in the game of chess you can never let your adversary see your pieces.

Well, I don't think your analogy of the Turing Test to a test for general intelligence is a good one. The reason the Turing Test is so popular is that it's a nice, objective, pass-or-fail test. Which makes it easy to apply - even if it's understood that it isn't perfectly correlated with AGI. (If you take HAL and force it to output a modem sound after every sentence it speaks, it fails the Turing Test every time, but that has nothing to do with its intelligence.)

Unfortunately we just don't have any simple definition or test for "general intelligence". You can't just ask questions across a variety of fields and declare "not intelligent" as soon as it fails one (or else humans would fail as soon as you asked them to rotate an 8-dimensional object in their head). I do agree that a proper test requires that we dynamically change the questions (so you can't just fit the AI to the test). But I think that, unavoidably, the test is going to boil down to a wishy-washy preponderance-of-evidence kind of thing. Hence everyone has their own vague definition of what "AGI" means to them; honestly, I'm fine with saying we're not there yet, but I'm also fine arguing that ChatGPT already satisfies it.

There are plenty of dynamic, "general", never-before-seen questions you can ask where ChatGPT does just fine! I do it all the time. The cherrypicking I'm referring to is, for example, the "how many Rs in strawberry" question, which is easy for us and hard for LLMs because of how they see tokens (and, also, I think humans are better at subitizing than LLMs). The fact that LLMs often get this wrong is a mark against them, but it's not iron-clad "proof" that they're not generally intelligent. (The channel AI Explained has a "Simple Bench" that I also don't really consider a proper test of AGI, because it's full of questions that are easy if you have embodied experience as a human. LLMs obviously do not.)
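To make the token point concrete, here's a minimal sketch using the tiktoken library (assuming you have it installed; the exact splits vary by model and encoding, so treat the pieces below as illustrative):

```python
# Sketch: how a tokenizer chops up "strawberry" before the model ever sees it.
# Requires `pip install tiktoken`; token splits differ between encodings/models.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # encoding used by several OpenAI models
tokens = enc.encode("strawberry")
pieces = [enc.decode([t]) for t in tokens]

print(tokens)  # a short list of integer IDs, not 10 letters
print(pieces)  # multi-character chunks, e.g. something like ['str', 'awberry']
```

The model only ever receives those integer IDs, so counting Rs means reasoning about letters it never directly "sees" - which is why the question is trivial for us and oddly hard for it.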

In the movie Phenomenon, rapidly listing mammals from A-Z is considered a sign of extreme intelligence. I can't do it without serious thought. ChatGPT does it instantly. In Bizarro ChatGPT world, somebody could write a cherrypicked blog post about how I do not have general intelligence.

FWIW, I appreciate this reply, and I'm sorry for persistently dogpiling you. We disagree (and I wrongly thought you weren't arguing in good faith), but I definitely could have done a better job of keeping it friendly. Thank you for your perspective.

Most frustratingly, the things that I actually need help on, the ones where I don't really know anything about the topic and a workable AI assistant would actually save me a ton of time, are precisely the cases where it fails hard (as in my examples where stuff doesn't even work at all).

That does sound like a real Catch-22. My queries are typically in C++/Rust/Python, which the models know backwards, forwards, and sideways. I can believe that there's still a real limit to how much an LLM can "learn" a new language/schema/API just by dumping docs into the prompt. (And I don't know anything about OpenAI's custom models, but I suspect they're just manipulating the prompt, not using RL.) And when an LLM doesn't know how to do something, there's a risk it will fake it (hallucinate). We're agreed there.

Maybe using the best models would help. Or maybe, given the speed things are improving, just try again next year. :)

What the hell? You most definitely did NOT give any evidence then. Nor in our first argument. I'm not asking so I can nitpick. I would genuinely like to see a somewhat-compact example of a modern LLM failing at code in a way that we both, as programmers, can agree "sucks".

Isn't there a case to be made for an exception here? It's not some cheap "gotcha", there's an actual relevant point to be made when you fail to spot the AI paragraph without knowing you're being tested on it. The fact that @self_made_human did catch it is interesting data! To me, it's similar to when Scott would post "the the" (broken by line breaks) at random to see who could spot it.

Right, and I asked you for evidence last time too. Is that an unreasonable request? This isn't some ephemeral value judgement we're debating; your factual claims are in direct contradiction to my experience.

Please post an example of what you claim is a "routine" failure by a modern model (2.5 Pro, o3, Claude 3.7 Sonnet). This should be easy! I want to understand how you could possibly know how to program and still believe what you're writing (unless you're just a troll, sigh).

There are plenty of tasks (e.g. speaking multiple languages) where ChatGPT exceeds the top human, too. Given how much cherrypicking the "AI is overhyped" people do, it really seems like we've actually redefined AGI to "can exceed the top human at EVERY task", which is kind of ridiculous. There's a reasonable argument that even lowly ChatGPT 3.0 was our first encounter with "general" AI, after all. You can have "general" intelligence and still, you know, fail at things. See: humans.

I'm also not allowed to use the best models for my job, so take my advice (and, well, anyone else's) with a grain of salt. Any advice you get might be outdated in 6 months anyway; the field is evolving rapidly.

I think getting AI help with a large code base is still an open problem. Context windows keep growing, but (IMO) the model isn't going to get a deep understanding of a large project just from pasting it into the prompt. Keep to smaller components; give it the relevant source files, and also lots of English context (like the headers/docs you mentioned). You can ask it design questions (like "what data structure should I use here?"), or for code reviews, or have it implement new features. (I'm not sure about large refactors - that seems risky to me, because the model's temperature could make it randomly change code that it shouldn't. Stick to output at a scale that you can personally review.)

The most important thing to remember is that an LLM's superpower is comprehension: describe what you want in the same way you would to a fellow employee, and it will always understand. It's not some weird new IDE with cryptic key commands you have to memorize. It's a tool you can (and should) talk to normally.
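To make that less abstract, here's a rough sketch of the kind of request I mean, using the OpenAI Python client (the file names, the question, and the model name are all placeholders - swap in whatever you're actually working with and allowed to use):

```python
# Sketch: hand the model the relevant source files plus plain-English context,
# then ask a design question. Paths, question, and model are placeholders.
from pathlib import Path
from openai import OpenAI  # pip install openai

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

# Only the files relevant to the component at hand, not the whole repo.
relevant_files = ["cache/lru_cache.py", "cache/README.md"]
context = "\n\n".join(
    f"--- {name} ---\n{Path(name).read_text()}" for name in relevant_files
)

question = (
    "This cache is getting hammered by concurrent readers. "
    "What data structure or locking strategy would you use here, and why?"
)

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; use the best model you have access to
    messages=[
        {"role": "system", "content": "You are a careful senior code reviewer."},
        {"role": "user", "content": f"{context}\n\n{question}"},
    ],
)
print(response.choices[0].message.content)
```

Note that the prompt reads like something you'd send to a colleague: the relevant files, some context, and a plain question.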

Let me join the chorus of voices enthusiastically agreeing with you about how jobs are already bullshit. I've never been quite sure whether this maximally cynical view is true, but it sure feels true. One white-collar worker has 10x more power to, well, do stuff than 100 years ago, but somehow we keep finding things for them to do. And so Elon can fire 80% of Twitter staff, and "somehow" Twitter continues to function normally.

With that said, I worry that this is a metastable state. Witness how thoroughly the zeitgeist of work changed after COVID - all of a sudden, in my (bullshit white-collar) industry, it's just the norm to WFH and maybe grudgingly come in to the office for a few hours 3 days a week. Prior to 2020, it was very hard to get a company to agree to let you WFH even one day a week, because they knew you'd probably spend the time much less productively. Again, "somehow" the real work that was out there still seems to get done.

If AI makes it more and more obvious that office work is now mostly just adult daycare, that lowers the transition energy even more. And we might just be waiting for another sudden societal shock to get us over that hump, and transition to a world where 95% of people are unemployed and this is considered normal. We're heading into uncharted waters.

Well, based on what I know of the Canadian indigenous peoples (who the current PC treadmill calls the "First Nations"), there's a lot of crime, misery, and unrest as a result. But hey, people addicted to videogames are less destructive than people addicted to alcohol, so we'll see.

(Also, I really don't expect to see decent neuralink tech by 2030. It's just too damn hard.)

Definitely an important point. I agree that there is a real possibility of societal breakdown under those kinds of conditions. Hopefully, even if UBI efforts never go anywhere, people will still somehow scrounge up enough to eat and immerse themselves in videogames. (We're kind of halfway there today, to be honest, judging from most of my friends.) Somehow society and the economy survived the insane COVID shutdowns (surprising me). I have hope they'll be resilient enough to survive this too. But there's no historical precedent we can point to...

Any particular reason why you're optimistic? What are your priors in regards to AI?

Same as you, I don't pretend to be able to predict an unknowable technological future and am just relying on vibes, so I'm not sure I can satisfy you with a precise answer...? I outlined why I'm so impressed with even current-day AI here. I've been working in AI-adjacent fields for 25 years, and LLMs are truly incomparable in generality to anything that has come before. AGI feels close (depending on your definition), and ASI doesn't, because most of our gains are just coming from scaling compute up (with logarithmic benefits). So it's not an existential threat, it's just a brand new transformative tech, and historically that almost always leads to a richer society.

You don't tend to get negative effects on the tech tree in a 4X game, after all. :)

Yeah, it's a good question that I can't answer. I suspect if all humans somehow held to a (not perfect but decent) standard of not driving impaired or distracted, signaling properly, and driving the speed limit or even slower in dangerous conditions ... that would probably decrease accidents by at least 80% too. So maybe self-driving cars are still worse than that.

To quote Hawkeye/Ronin: "Don't give me hope."

Waymo has a lot of data, and claims a 60-80% reduction in accidents per mile for self-driving cars. You should take it with a grain of salt, of course, but I think there are people holding them to a decent reporting standard. The real point is that even being 5x safer might not be enough for the public. Same with having an AI parse regulations/laws...

Fantastic post, thanks! Lots of stuff in there that I can agree with, though I'm a lot more optimistic than you. Those 3 questions are well stated and help to clarify points of disagreement, but (as always) reality probably doesn't factor so cleanly.

I really think almost all the meat lies in Question 1. You're joking a little with the "line goes to infinity" argument; I think almost everyone reasonable agrees that near-future AI will plateau somehow, but there's a world of difference in where it plateaus. If it goes to ASI (say, 10x smarter than a human or better), then fine, we can argue about questions 2 and 3 (though I know this is where doomers love spending their time). Admittedly, it IS kind of wild that this is a tech where we can seriously talk about singularity and extinction as potential outcomes with actual percentage probabilities. That certainly didn't happen with the cotton gin.

There's just so much space between "as important as the smartphone" -> "as important as the internet" (which I am pretty convinced is the baseline, given current AI capabilities) -> "as important as the industrial revolution" -> "transcending physical needs". I think there's a real motte/bailey in effect, where skeptics will say "current AIs suck and will never get good enough to replace even 10% of human intellectual labour" (bailey), but when challenged with data and benchmarks, will retreat to "AIs becoming gods is sci-fi nonsense" (motte). And I think you're mixing the two somewhat, talking about AIs just becoming Very Good in the same paragraph as superintelligences consuming galaxies.

I'm not even certain assigning percentages to predictions like this really makes much sense, but just based on my interactions with LLMs, my good understanding of the tech behind them, and my experience using them at work, here are my thoughts on what the world looks like in 2030:

  • 2%: LLMs really turn out to be overhyped, attempts at getting useful work out of them have sputtered out, I have egg all over my face.
  • 18%: ChatGPT o3 turns out to be roughly at the plateau of LLM intelligence. Open-Source has caught up, the models are all 1000x cheaper to use due to chip improvements, but hallucinations and lack of common sense are still a fundamental flaw in how the LLM algorithms work. LLMs are the next Google - humans can't imagine doing stuff without a better-than-Star-Trek encyclopedic assistant available to them at all times.
  • 30%: LLMs plateau at roughly human-level reasoning and superhuman knowledge. A huge amount of work at companies is being done by LLMs (or whatever their descendant is called), but humans remain employed. The work the humans do is even more bullshit than the current status quo, but society is still structured around humans "pretending to work" and is slow to change. This is the result of "Nothing Ever Happens" colliding with a transformative technology. It really sucks for people who don't get the useless college credentials to get in the door to the useless jobs, though.
  • 40%: LLMs are just better than humans. We're in the middle of a massive realignment of almost all industries; most companies have catastrophically downsized their white-collar jobs, and embodied robots/self-driving cars are doing a great deal of blue-collar work too. A historically unprecedented number of humans are unemployable, economically useless. UBI is the biggest political issue in the world. But at least entertainment will be insanely abundant, with Hollywood-level movies and AAA-level videogames being as easy to make as Royal Road novels are now.
  • 9.5%: AI recursively accelerates AI research without hitting engineering bottlenecks (a la "AI 2027"), ASI is the new reality for us. The singularity is either here or visibly coming. Might be utopian, might be dystopian, but it's inescapable.
  • 0.5%: Yudkowsky turns out to be right (mostly by accident, because LLMs resemble the AI in his writings about as closely as they resemble Asimov's robots). We're all dead.

As far as I can tell, the vast vast VAST majority of it is slop full of repurposed music and lyrics that get by on being offensive rather than clever. Rap artists aren't known for being intelligent, after all. I suspect most "celebrated" rap music would fail a double-blind test against some rando writing parodic lyrics and banging on an audio synthesizer for a few hours. Much like postmodern art, where the janitor can't tell it apart from trash.

There probably are some examples of the genre that I could learn to appreciate (Epic Rap Battles of History comes to mind), but it's hard to find them because of the pomo effect.

Sounds like you have some practical experience here. Yeah, if just iterating doesn't help and a human has to step in to "fix" the output, then at least there'll be some effort required to bring an AI novel to market. But it does feel like detectors (even the good non-Grammarly ones) are the underdogs fighting a doomed battle.

Indeed, one of the fundamental conjectures in CS, "P != NP", can be somewhat rephrased as "it's easier to check an answer than to produce it". I think it's actually something of an optimistic view of the future that most things will end up produced with generative AI, but humans will still have a useful role in checking its work.
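A toy illustration of that asymmetry, using subset sum (a classic NP-complete problem - the numbers here are arbitrary, just to make the sketch runnable):

```python
# Sketch of "easier to check than to produce", using subset sum.
# Verifying a proposed answer is cheap; finding one is brute force.
from itertools import combinations

nums, target = [3, 34, 4, 12, 5, 2], 9

def verify(candidate):
    # Checking: one pass over the candidate.
    return sum(candidate) == target and all(x in nums for x in candidate)

def produce():
    # Producing: try every subset, exponential in len(nums).
    for r in range(len(nums) + 1):
        for combo in combinations(nums, r):
            if sum(combo) == target:
                return combo
    return None

print(verify((4, 5)))  # True - trivial to confirm
print(produce())       # (4, 5) - found only after searching many subsets
```

Generative AI as the producer and humans as the verifiers fits that shape pretty well.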

The thing is, the people producing the novel can use the detectors too, and iterate until the signal goes away. I have a friend who is taking some college courses that require essays, and they're explicitly told that Grammarly must not flag the essay as AI-written. Unfortunately (and somewhat amusingly), the detector sucks, and her normal writing is flagged as AI-written all the time - she has to rewrite it in a more awkward manner to get the metric below the threshold. Similarly, I imagine any given GPT detector could be defeated by just hooking it up to the GPT in a negative-feedback loop.
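That loop is trivially easy to write, too. Here's a sketch - the detector and rewriter below are toy stand-ins, not real APIs; in practice you'd plug in an actual detector and an actual LLM call:

```python
# Sketch of beating a detector by iterating against it.
# Both helpers are toy placeholders for a real detector and a real LLM.

def detector_score(text: str) -> float:
    # Toy "detector": pretends a few AI-ish filler words are the tell.
    tells = ["delve", "tapestry", "furthermore"]
    hits = sum(word in text.lower() for word in tells)
    return min(1.0, hits / 2)

def rewrite(text: str) -> str:
    # Toy "rewriter": a real version would feed the detector's verdict back to an LLM.
    return text.replace("delve", "dig").replace("furthermore", "also")

def launder(text: str, threshold: float = 0.5, max_rounds: int = 10) -> str:
    # Keep rewriting until the detector stops flagging the text (or we give up).
    for _ in range(max_rounds):
        if detector_score(text) < threshold:
            break
        text = rewrite(text)
    return text

print(launder("Let us delve into this topic; furthermore, consider the tradeoffs..."))
```

Whatever metric the detector exposes just becomes another loss function to optimize against.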

And interestingly, one of Douglas Adams' last projects was Starship Titanic, which was arguably the first game to have "chatbot" NPCs - in 1998! Obviously they didn't work very well, but the game's ambition was very ahead of its time.