This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.
Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.
We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:
- Shaming.
- Attempting to 'build consensus' or enforce ideological conformity.
- Making sweeping generalizations to vilify a group you dislike.
- Recruiting for a cause.
- Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.
In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:
- Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.
- Be as precise and charitable as you can. Don't paraphrase unflatteringly.
- Don't imply that someone said something they did not say, even if you think it follows from what they said.
- Write like everyone is reading and you want them to be included in the discussion.
On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.
OpenAI Shifts Strategy to Slower, Smarter AI as GPT Scaling Limits Emerge, OpenAI's upcoming Orion model shows how GPT improvements are slowing down
Paywalled, but here's a summary from reddit:
This is one of several articles/posts/tweets coming out of the LLMsphere over the past couple of weeks that are renewing concerns over LLMs hitting diminishing returns.
Of course this is just speculation until OpenAI actually releases Orion (or whatever they end up calling it). And really we would need several models past Orion too to actually extrapolate a pattern. But this does fit with my subjective impression that the leap from GPT-3 to GPT-4 was not as big as the leap from GPT-2 to GPT-3, and the leap from 4 to o1 was not as big as the leap from 3 to 4. The fact that they're considering again releasing a new model without calling it GPT-5 is also telling. They know how psychologically important the "GPT-5" moniker has become at this point and they won't give that name to a model unless it really represents a major leap forward.
A big problem is how you measure intelligence.
Any human can count the number of r's in strawberry, but most would be hard-pressed to translate Chinese to English or do anything at all in Python or other programming languages. Intelligence is multi-domain, possibly the most complicated thing we can try to measure.
Even within domains, how do you rate intelligence? Sometimes it would be worth spending 10x more to get a marginally better programming AI because 'marginally better' is like a 0 to 1 increase in that it provides genuinely useful input that a human can use to get a good answer and speed up their work.
A new brutally hard question set dropped a few days ago. I have no idea what any of this gibberish means, it's well beyond me.
https://epochai.org/frontiermath
The two toned-down o1 models get 1%, along with GPT-4. Claude 3.5 and Gemini Pro get 2%. Does it follow that Claude 3.5 new is smarter than I am since I would get 0 and that therefore AGI has been achieved? Probably not, Sonnet 3.5 makes all kinds of order of magnitude mistakes that I can eyeball as wrong. But it is pretty damn smart and noticeably smarter than its older incarnation, it has certain new tendencies in writing that qualitatively improve it.
GPT-2 would get 0 on nearly all benchmarks because it just babbles. GPT-3 would also get 0 because it just babbles (albeit more interestingly), remember these are the base models that might just answer your question with more questions. GPT-3.5 was the inflection point where AI became useful for a bunch of things, for consumer use. The old GPT-4 (as of March 2023) opened up coding. Opus 3 was thought to be the first really creative writer, it can maintain an engaging twitter persona. Sonnet 3.5 is on a whole new level in coding, opening up Cursor. The newest Sonnet, o1 and Gemini can start to barely grapple with these advanced mathematical questions.
From the perspective of 'can it answer Frontier Math', there have been no advances in AI before the last 6 months or so. Intelligence is so complex, what looks like a slowdown in one domain can just be the start of something new in another domain.
AI is still limited to text boxes and text manipulation or content generation; it has failed to live up to the hype otherwise, like life extension, replacing workers, or treating disease, imho. The point of diminishing returns has been reached. It will take a whole new paradigm for AI to make the next leap. As far as transforming writing papers for college students, yeah, it has totally crushed that, and even then teachers are wising up. [If AI is able to produce a fiction novel that is a best-seller and/or critically acclaimed, either with text prompts or by feeding it samples of other novels, I will be sold]
What? Of course it isn't.
AI is used all the time in a whole bunch of "invisible" applications that have nothing to do with text or content generation. Take a photo with a phone camera? You're using AI. Use Nvidia RTX voice? AI. Deal with pharmaceutical molecule research? Fair chance of AI being used. Play guitar and use the newest generation of amp modelers? That's AI again.
This is mostly because "AI" is a nonsense term. There are many different machine learning techniques being used in each of those applications and the fact that transformers have become somewhat general purpose doesn't change this.
But saying it's all AI is like saying it's all computers. It's missing the trees for the forest.
The tradeoffs and composability of these techniques are not uniform.
It's like saying that word processor spreadsheets can replace doing it by hand. It does not solve the spreadsheet problem, only makes it more efficient. Maybe the problem is me, but I am not seeing a big difference. I think the closest thing to truly transformational technology with direct, tangible real-world applications is printed buildings (those cheap amazon.com homes that can be erected quickly), but this is not directly AI.
But all of these problems are reducible to text generation. In some sense every conceivable problem is, because solving the problem means writing out the solution, in language.
For “solving” medicine, just have the LLM print a formula for the drug you want. A lot of remote work just is text generation in a sense, but for physical labor, a sufficiently intelligent LLM would be able to accelerate progress in robotics significantly.
Whether LLMs can actually achieve these things though is an open question.
Too bad LLMs also only have access to solutions that were also previously written out, in language.
I mean, they are capable of "solving" novel problems not in their initial data. The problems just have to have the same "shape" as problems already in their training data.
And certain kinds of "language" type problems can be solved purely on the basis of the LLM's "knowledge" of English. Those problems aren't necessarily super hard for a human to do, but could save time on tasks like that.
Kinda sorta I guess?
For the examples given, if you ask an LLM to "print a formula for a drug you want", it will print something that looks like a formula for drugs that it's seen -- not super useful, other than by 'infinite monkeys' means?
Not sure what he's getting at on robotics, but the 'talking about awesome robots' role does not seem to have any shortage of applicants. To be frank, it's bullshit other than for people with bullshit jobs who feel they should continue to be paid but not have to sully themselves by personally generating the bullshit.
(the PR people at my work are super interested in LLMs, for example -- like, your life is not meaningless enough banging out 500 word communiques, you need a machine to do that for you? I really don't know what else to say)
The big problem with medicine has always been testing. Human trials will always be expensive and time-consuming.
Tesla is all-in on reinforcement learning for their next generation of Optimus robots, but they only spun that team up this summer. When I heard this news the stock price was at like 180 and I bought some calls for 230/250/270 for next June. After some movement I pushed these up to 300. Yet this still looks way too pessimistic. I think some exposure to $500c by the end of next year might be warranted.
The TSLA call options are so expensive though. I like the 2x leveraged TSLA ETF instead. If TSLA doubles, the ETF in theory will gain 3.5-4x; maybe offset decay by selling a long-dated ATM put + call.
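For anyone wanting to sanity-check the "doubles, so the 2x ETF gains 3.5-4x" arithmetic, here is a minimal sketch. It assumes the ETF resets its 2x leverage daily, ignores fees and financing costs, and uses two made-up price paths over a hypothetical 252-day holding period; none of these numbers come from a real fund or from actual TSLA data.

```python
def compound(daily_returns, leverage=1.0):
    """Multiply out (1 + leverage * r) over a sequence of daily returns."""
    value = 1.0
    for r in daily_returns:
        value *= 1.0 + leverage * r
    return value

DAYS = 252  # assumed number of trading days in the holding period

# Path A (hypothetical): the underlying doubles via the same small gain every day.
smooth_r = 2 ** (1 / DAYS) - 1
smooth = [smooth_r] * DAYS

# Path B (hypothetical): the underlying also doubles, but alternates
# +5% days with down days sized so each two-day pair nets the same growth.
pair_growth = 2 ** (2 / DAYS)
down_r = pair_growth / 1.05 - 1
choppy = [0.05, down_r] * (DAYS // 2)

smooth_underlying = compound(smooth)
smooth_etf = compound(smooth, leverage=2.0)
choppy_underlying = compound(choppy)
choppy_etf = compound(choppy, leverage=2.0)

print(f"smooth path: underlying {smooth_underlying:.2f}x, 2x ETF {smooth_etf:.2f}x")
print(f"choppy path: underlying {choppy_underlying:.2f}x, 2x ETF {choppy_etf:.2f}x")
```

In this toy setup the smooth path puts the ETF just under 4x, while the choppy path ends well below that even though the underlying doubles in both cases. That gap is the volatility decay the comment wants to offset.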
Me from a couple months ago...
Interesting stuff! Makes sense to me that Transformer architectures won't take us all the way to AGI, but I remain bullish on the prospects for AGI before 2030. ChatGPT was released almost exactly 2 years ago, and its impacts won't be fully felt for years to come, especially in terms of the influx of human capital and investment in frontier capabilities it prompted. Millions of people are now working towards an AI career who weren't doing AI in November 2022 - a smart 18 year old freshman who was inspired by ChatGPT to switch from Physics or Engineering to Compsci would still be getting his college credits.
Speculation: It’s interesting that the bottleneck is given as lack of data rather than architecture. That opens up the possibility that we may be able to get things moving again by finding some other method of obtaining/creating useable data.
LLMs were historically created to use next-token-prediction as a means of solving natural language processing tasks. I think we can regard that problem as provisionally solved. When people talk about GPTs limits, they aren’t talking about its ability to take English input and produce readable English output. They are talking about general intelligence: the ability to output sensible, useful English output.
In short, LLMs are general learning machines using natural language as a proxy task. Natural language is cheap and information rich, but any means of conveying information about the world is fair game, provided that it can be converted into the same token space that GPT is using, via CLIP or something similar.
What is needed is large quantities of data that conveys causal information about the world. Video is probably a good place to start. Some kind of simulated self-play might also be useable. What else could be useable?
(I’m not sure how next-token prediction would work here)
It's not lack of data as such (there's gobs and gobs of raw data). It's curated high quality data of a suitable form and that can actually be used (be that for legal or technical reasons). The reason synthetic data is used is that it solves (or claims to solve) the curation and form issues. The trainers can directly instruct the source AI to provide data of type X in quantity Y.
Indeed, but this introduces its own problems. This is arguably a large part of why Google's AI products are noticeably more prone to "hallucination" than their immediate peers.
Of course. I'd estimate that using synthetic data results in an overall worse performing AI but I could see it being used to fill specific gaps in the real training material (probably using a specialized model that's good at that specific type of data and possibly not much else).
Somewhat speculative, but non-invasive recording of brain activity seems like a promising underutilized modality. When sufficiently discreet devices reach the market -- say, for controlling your phone -- they would be worn anyway, continuously throughout the day, so just add a few more lines about personal data collection in a license agreement. To get labeled data, make an app which prompts humans with various signals and records their reaction. Gamify, pay if needed, etc. Seems scalable.
In effect LLMs aren't smart; they are just great at recognizing patterns they are trained on. Google is great at recognizing text strings that it remembers. LLMs don't need matching strings: they match on patterns and are able to combine patterns from multiple sources. LLMs aren't truly intelligent because they are dumbfounded if there isn't a good matching pattern in the training set. They are stumped in a way a human isn't if they encounter something new.
LLMs aren't going to replace humans because the set of all data is minuscule compared to the set of all potential patterns in the world.
I mean, you can say LLMs aren't going to replace humans...but the 'potential patterns in the world' are all reducible to data in one way or another.
So some Machine trained on language AND physics data AND biology AND etc. etc. is still a potential contender, no?
I mean is it? Quantitative Realism doesn't exactly seem self evident.
I've consistently pointed AI hype believers to their own metaphysical assumptions and this is the crux of it.
Are we just pattern matching engines, or does agency have another source, and is that in any way connected to our experience of consciousness?
I think when people believed that larger gizmos we don't fully understand would give us the answer to this question, they were deluding themselves, and I'm somewhat disappointed that I was right since we are still without answers. But at least the possibility that we have a soul, ghost or another manner of special thing that automata don't is still secure.
Now the real test will be this: if Musk can convince enough people to use Neuralink and get their brain patterns recorded 24/7, and if someone trains transformers on that, what will be the outcome? Can we Chinese room our way to general intelligence?
I don't know, but it seems like the most logical way forward, since access to immense unpolluted datasets is no longer a possibility.
Isn't Computational Complexity Theory supposed to tackle questions of this kind?
Scott Aaronson offered the following highly evocative metaphor:
Although I doubt such general questions and theories are that helpful in guiding our research: they provide boundaries for what is possible, but what is practical typically lies far away from those boundaries.
Scott's metaphor is funny to think about but it has no philosophical rigor.
Complexity theory is not meaningfully different from other mathematics in its relationship to the metaphysical: it's a pure reason construct that attempts to map out necessary truths.
In many ways it is actually completely disconnected from the question at hand, because the machines it is concerned about are abstractions that are not and cannot possibly be real. They just happen to map onto real objects in a useful enough way. As you point out.
Scott isn't the first to connect this type of endeavor and the sacred. Pythagoras did it a long time ago. But the connection isn't relevant to the question of intelligence in my view.
I think pure data is a directionally useful way of looking at the world, and useful for most problem-solving purposes. I am a theist so I think there’s more beyond just physical reality, but whether or not it’s true, I think that for most projects, reducing the universe to data is going to work just fine. Consciousness is produced in the brain, and definitely experienced there, so I think you can get something like a conscious AI simply by recreating a brain. Might be easier to start with a dog or something like that, but I think even though there’s a metaphysical aspect to consciousness, that doesn’t mean that there’s no point to studying it in brains.
I suppose it comes down to whether or not there is a ghost in the machine.
If human intelligence is all neurons that can be modeled as a graph with weighted edges then we should be able to simulate it.
Maybe we do that and still can’t get human intelligence to pop out of the simulated brain and find that something is missing.
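To make the "graph with weighted edges" picture concrete, here is a minimal sketch of what simulating such a graph looks like mechanically. The four-node network, its weights, and the sigmoid squashing function are all invented for illustration; nothing here is a claim about how real neurons behave.

```python
import math

# Hypothetical toy network: edges[src] is a list of (dst, weight) pairs.
edges = {
    "a": [("c", 0.8)],
    "b": [("c", -0.4)],
    "c": [("d", 1.5)],
    "d": [],
}

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def step(activations, edges):
    """One synchronous update: each node squashes the weighted sum of its inputs."""
    totals = {node: 0.0 for node in edges}
    for src, outgoing in edges.items():
        for dst, weight in outgoing:
            totals[dst] += weight * activations[src]
    # Nodes with no incoming edges get sigmoid(0) = 0.5.
    return {node: sigmoid(total) for node, total in totals.items()}

initial = {"a": 1.0, "b": 1.0, "c": 0.0, "d": 0.0}
first = step(initial, edges)

state = initial
for _ in range(3):
    state = step(state, edges)
print(state)
```

The update rule is a few lines of code. The open question in this thread is whether scaling the same idea up to a brain-sized graph would leave anything out.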
It would be a bit funny if they design a machine that is provably a 1:1 simulation of a human brain, switch it on, and get an error message to the effect of "Cannot Execute Commands: This unit is not ensouled."
“Homunculus not installed: please refer to manual.”
I mean, that kind of sounds like you're saying it's provably not a 1:1 simulation of a human brain.
What you're describing is measurable evidence of new physics. Every physicist in the world would want to buy you a beer.
...Please contact your local soul provider for further information
If you believe that you've received this message in error, please contact your system administrator at t0.yahweh.root.
Control thread not found.
Finally, a reason for MMOGs to come back to the mainstream as data production interfaces.
That EVE Online cell structure minigame was ahead of its time.
We’re exhausting almost all the data, video included. We’ve recently taken to generating synthetic data. For images, this would mean generating novel images and then feeding them back into training. Imagine taking an image of a car and then rotating it behind some thick leaves or a chain link fence.
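As a rough sketch of the rotate-and-occlude idea, the snippet below treats an image as a plain 2D grid of brightness values and produces four synthetic variants from one original. The helper names (`rotate_90`, `occlude_with_fence`, `augment`), the 6x6 stand-in "photo", and the fence spacing are all hypothetical. Real pipelines use image libraries and far richer transformations.

```python
def rotate_90(image):
    """Rotate a 2D grid of pixels 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def occlude_with_fence(image, spacing=3, fence_value=0):
    """Overwrite every `spacing`-th row and column, like a chain link fence."""
    return [
        [
            fence_value if (r % spacing == 0 or c % spacing == 0) else pixel
            for c, pixel in enumerate(row)
        ]
        for r, row in enumerate(image)
    ]

def augment(image):
    """Produce synthetic variants: four rotations, each seen through a fence."""
    variants = []
    current = image
    for _ in range(4):
        variants.append(occlude_with_fence(current))
        current = rotate_90(current)
    return variants

# Stand-in for a real photo: a 6x6 grid of distinct nonzero brightness values.
car = [[10 * r + c + 1 for c in range(6)] for r in range(6)]
synthetic = augment(car)

for row in synthetic[1]:
    print(row)
```

Each variant keeps the same underlying object but changes what the model actually sees, so one labeled image becomes several training examples.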