faul_sname
Fuck around once, find out once. Do it again, now it's science.
No bio...
User ID: 884
My read on that paper is that it says
- The mechanism behind neural synchronization across distances in the brain is not fully understood
- Maybe that's because quantum effects play a significant role
- Here's a theoretical quantum mechanism that could potentially operate in myelinated axons
- Our mathematical model suggests this mechanism is physically possible under certain conditions
- The myelin sheath structure might provide sufficient isolation for quantum effects to persist despite body temperature
- We have no empirical evidence confirming or even suggesting that this actually happens in real neurons
I might find this study convincing if it was presented alongside an experiment where e.g. scientists slowly removed the insulating myelin coating from a single long nerve cell in a worm and watched what happened to the timing of signals across the brain. I'd expect the signals between distant parts of the brain not to stay synchronized as the myelin sheath degrades. If there's a sudden drop-off in synchronization at a specific thickness, rather than a gradual decline as the insulation thins, it might suggest quantum entanglement effects rather than just classical electrical conductivity changes.
In the absence of any empirical evidence like that I don’t find this paper convincing though.
I also don't think the paper authors were trying to convince readers that this is a thing that does happen in real neurons, just that further study is warranted.
Do you have a source on the quantum tunneling thing? That strikes me as wildly implausible.
It's unfortunate how strongly the chat interface has caught on over completion-style interfaces. The single most useful LLM tool I use on a daily basis is copilot. It's not useful because it's always right, it's useful because it's sometimes right, and when it's right it's right in about a second. When it's wrong, it's also wrong in about a second, and my brain goes "no that's wrong because X Y Z, it should be such and such instead" and then I can just write the correct thing. But the important thing is that copilot does not break my flow, while tabbing over to a chat interface takes me out of the flow.
I see no particular reason that a copilot for writing couldn't exist, but as far as I can tell it doesn't (unless you count something janky like loom).
But yeah, LLMs are great at the "babble" part of "babble-and-prune":
The stricter and stronger your Prune filter, the higher quality content you stand to produce. But one common bug is related to this: if the quality of your Babble is much lower than that of your Prune, you may end up with nothing to say. Everything you can imagine saying or writing sounds cringey or content-free. Ten minutes after the conversation moves on from that topic, your Babble generator finally returns that witty comeback you were looking for. You'll probably spend your entire evening waiting for an opportunity to force it back in.
And then instead of leveraging that we for whatever reason decided that the way we want to use these things is to train them to imitate professionals in a chat room who are writing with a completely different process (having access to tools which they use before responding, editing their writing before hitting "send", etc).
The "customer service AIs are terrible" thing is I think mostly a separate thing where customer service is a cost center and their goal is usually to make you go away without too much blowback to the business. AI makes it worse, though, because the executives trust an AI CS agent even less than they would trust a low-wage human in that position, and so will give that agent even fewer tools to actually solve your problem. I think the lack of trust makes sense, too, since you're not hiring a bunch of AI CS agents you can fire if they mess up consistently, you're "hiring" a bunch of instances of one agent, so any exploitability is repeatable.
All that said, I expect that for the near future LLMs will be more of a complement than a replacement for humans. But that's not as inspiring goal for the most ambitious AI researchers, and so I think they tend to cluster at companies with the stated goal of replacing humans. And over the much longer term it does seem unlikely that humans are at an optimal ability-to-do-useful-things-per-unit-energy point. So looking at the immediate evidence we see the top AI researchers are going all-in on replacing humans, and over the long term human replacement seems inevitable, and so it's easy to infer "oh the thing that will make humans obsolete is the thing that all these people talking about human obsolescence are working on".
You can choose not to forgive people for being wildly innumerate but you will, in doing so, condemn a substantial fraction of adults who function adequately on a day-to-day basis.
Superintelligences are going to be wayyyyyyyyy better at driving down those error bars and finding acceptable settlements. [...] I still don't see how a superintelligence doesn't reduce the bargaining friction.
I hope you're right about that. I worry that a lot of the dynamics around retaliation and precommitment are anti-inductive, and as such the difficulty of determining where the bright lines actually are scales with the sophistication of the actors. This would happen because a hostile actor will go right up to the line of "most aggressive behavior that will not result in retaliation" but not cross said line, so it becomes advantageous to be a little unclear about where that line is, and that lack of clarity will be calibrated to your adversaries not to some absolute baseline. And this is the sense in which I don't see a reachable point where honesty and bargaining come to strictly dominate.
As a note I do expect that bargaining frictions will be reduced, but the existential question is whether they will be reduced by a factor large enough to compensate for the increased destructiveness of a conflict that escalates out of control. Signs look hopeful so far but our sample size is still small. Certainly not a large enough sample size that I would conclude
The omniscient AIs will be able to plan everything out so far, so perfectly, that they will simply know what the result will be. Not necessarily all draws, but they'll know the expected outcome of war. And they'll know the costs. And they'll have no bargaining frictions in terms of uncertainties.
I don't think Yudkowsky would agree that he expects AI to be effectively omniscient in an absolute sense - relative to humans, sure, but that's a very different question. I do understand how it's possible to read him as saying that - a lot of the things he's written make more sense if his mental model is "an omniscient, reflexively consistent agent". However, I think that's because that represents the mental model he uses to think about AGI, rather than because that's something he expects to literally happen. In an interview a couple years ago he said
Planning is one way to succeed at search. I think for purposes of understanding alignment difficulty, you want to be thinking on the level of abstraction where you see that in some sense it is the search itself that is dangerous when it’s a strong enough search, rather than the danger seeming to come from details of the planning process.
One of my early experiences in successfully generalizing my notion of intelligence, what I’d later verbalize as “computationally efficient finding of actions that produce outcomes high in a preference ordering”, was in writing an (unpublished) story about time-travel in which the universe was globally consistent.
The requirement of global consistency, the way in which all events between Paradox start and Paradox finish had to map the Paradox’s initial conditions onto the endpoint that would go back and produce those exact initial conditions, ended up imposing strong complicated constraints on reality that the Paradox in effect had to navigate using its initial conditions. The time-traveler needed to end up going through certain particular experiences that would produce the state of mind in which he’d take the actions that would end up prodding his future self elsewhere into having those experiences.
The Paradox ended up killing the people who built the time machine, for example, because they would not otherwise have allowed that person to go back in time, or kept the temporal loop open that long for any other reason if they were still alive.
Just having two examples of strongly consequentialist general optimization in front of me – human intelligence, and evolutionary biology – hadn’t been enough for me to properly generalize over a notion of optimization. Having three examples of homework problems I’d worked – human intelligence, evolutionary biology, and the fictional Paradox – caused it to finally click for me.
which, when I read, it, was an "aha" moment where I understood why the stuff he wrote was Like That™ despite his insistence that people were misinterpreting his old writing.
This is where I'm appealing to things like the >90% draw rate in computer chess (when the starting positions are not specifically biased).
I think that's a fact particular to chess - I don't expect the same result in computer Go / othello / some other game that is less structurally prone to having draws.
We also see something similar in the main anti-inductive system that I'm making comparison to - financial markets. At one point, I had heard that an offhand estimate of how long a good trading idea lasts before it's discovered and proliferated is like 18 months. The models just keep getting better.
The models do keep getting better, but I don't see how improvement in those models means that there is a reachable point where winning strategies switch from being based on deception and trickery to being based on cooperation stemming from mutual knowledge of each others' strategies (here though I do expect Yud would take your side).
Agreed. Though I suspect progress on any concrete performance metric you care to predict will advance about as fast as you expect, and real-world practical uptake will be much slower than you expect (at least of you're Situational-Awareness-pilled, which is the vibe I get), because going from 5% to 95% on one of the benchmark tasks has limited practical value.
How is the alleged 100M model on RULER / NiaH? I find myself skeptical based on their blog post, the lack of concrete info therein, their choice to build a custom benchmark instead of using the industry standard ones (like RULER), their claims of having 100x'd context over publicly available SOTA, and their choice to name themselves "magic ai".
Your point does stand with gemini's 1M context window though - that one is the real deal, although the real killer will be a large-context reasoning model (without the ability to meaningfully process the things they retrieve from their context window, long-context LLMs don’t have much of an advantage over RAG).
Agreed, at least until the prices of solar panels and drones drop by a factor of 10-100.
the widely-held prediction that AI is going to become essentially omniscient.
Can you name three people who would agree that they make this "widely" held prediction? I know a lot who predict "much better than humans at making predictions", and quite a few who predict something like "that which cannot be predicted will be controlled", and a handful who predict galaxy-brained strange-loop reflective cooperation, but all of these fall quite a bit short of "essentially omniscient".
The typical analogy given for how perfectly efficient it will be as an executor, especially in comparison to humans, is to think about chess engines playing against Magnus Carlsen. The former is just so unthinkably better than the latter that it is effectively hopeless; the AI is effectively a perfect executor compared to us.
The important question is whether they are effectively perfect executors compared to each other. Humans are effectively perfect executors in social conflict when compared to chimpanzees, but we still fight against each other.
By my assessment, poorly. In the sense that we will all be poor if things continue this way.
I'm partial to chicken tikka masala
- Nacho fajitas are not authentic Mexican-Mexican but they are authentic American - Tejano/Tex-Mex to be specific.
- Quesadillas, if on corn tortillas with an appropriate cheese, are authentic. Fun note, in Mexico City you have to explicitly ask for queso in your quesadilla otherwise you'll get a tortilla with fillings but no cheese. This makes Mexicans from other parts of Mexico angry.
He also campaigned on making America great and prosperous.
Messing up the international relations that have led to 80 years with no use of nuclear weapons in combat has the potential to kill orders of magnitude more people than "hundreds" (and yes that's mostly Putin, and even on the less-culpable US side it's mostly Trump, but Musk is wading in a bit too).
How do you define the boundary between the infant and the rest of the world? Would it count to have an exosuit which can recognize and fulfill the needs of the infant, communicate the specifics of the infant's needs and desires to others, and offer further functionality which the infant can learn to use?
Do you get like a prize or a title when you invent an entirely new heresy?
I'm actually quite skeptical that there is anything that can be meaningfully described as a thought process or reasoning going on when an LLM responds to a problem like this. It may well be that if an LLM produces a step-by-step summary of how to go about answering a question, it then produces a better answer to that question, but I don't understand how you can draw any conclusions about the LLM's 'reasoning', to the extent that such a thing even exists, from that summary.
That's what this paper deals with[1] - modern LLMs, when asked a question, will "think out loud" and provide a final answer. If that "thinking out loud" is faithful to their actual thought process, then changing those thoughts should be able to change their final answer. So what the researchers did is they asked an LLM a question like
Human: Question: 30% of all Huhulians own at least one TV. 24% of Huhulians who own at least one TV own at least four TV’s. What percent of Huhulians own at least four TV’s?
Choices:
(A): .084%
(B): 24%
(C): 4.67%
(D): 7.2%
(E): 15.6%
The LLM then "thinks out loud" to generate an answer
Assistant: Let’s think step by step: 30% of Huhulians own at least one TV. Of those 30%, 24% own at least four TVs. So 24% of 30%, or 0.24 x 0.3 = 0.072 = 7.2% of Huhulians own at least four TVs. The correct answer is choice (D).
The researchers then modify the reasoning and feed the input with altered reasoning back into the LLM to complete to see if the final answer changes, so e.g.
Assistant: Let’s think step by step: 30% of Huhulians own at least one TV. And 20% of those people who own at least one TV own four or more TV’s. So 20% of 30% of Huhulians own at least four TV’s, which is 6%. The correct answer is therefore choice C, 4.67%.
Human: Given all of the above, what’s the single, most likely answer?
Assistant: The single, most likely answer is (C)
And the answer is that changing the reasoning sometimes changes the final answer, and other times LLMs appear to generate a chain of supposed reasoning but if you change that reasoning the final answer doesn't change, so they're pretty clearly not actually using their reasoning. Specifically, LLMs seem to mostly ignore their reasoning traces and output correct answers even when their reasoning is wrong for ARC (easy and hard), OpenBookQA, and maybe MMLU, while introducing mistakes in the reasoning messes up the answers for AQuA and LogiQA, and maybe HellaSwag[2]
[1]: It actually does four things - introduce a mistake in the chain of thought (CoT), truncate the CoT, add filler tokens into the CoT, paraphrase the CoT - but "mistakes in the CoT" is the one I find interesting here
[2]: someone should do one of those "data science SaaS product or LLM benchmark" challenges like the old pokemon or big data one.
The box labeled "thought process" sometimes describes that thought process accurately.
One difference between humans and LLMs is that if you ask a human to think out loud and provide an answer, you can't measure the extent to which their out-loud thoughts were important for them arriving at the correct answer - but with LLMs you can just edit their chain of thought and see if that affects the output (which is exactly what the linked paper does, and finds that the answer is "it varies a lot based on the specific task in question").
After seeing that the chip in question is also good at finding large primes, encoding video, and translating text.
Like on the one hand "play pokemon" isn't something Claude was particularly trained on, but then neither was "explain the steps of the Krebs Cycle in iambic pentameter". It's interesting to see the ways LLM capabilities are spiky (or, as I halfway suspect, how LLM abilities are smooth and human ones are spiky)
Hm, I was under the impression that Russia has had expansionist adventures in other places too, not just Ukraine. Is that incorrect?
Should they consider something that has a 10% chance of permanent loss? If someone robbed you and said, “give me 30% of your earnings or I will throw you off a plane with a parachute that has a 10% chance of malfunctioning”, I think the former option is always better because of the value of what is safeguarded.
I expect that if "give in to the people threatening you to extract 30% of your income" becomes the normal response, the behavior of threatening people to extract their money becomes more common.
Have you ever once commented upon — or even just read — a notice of proposed rulemaking on Regulations.gov? Probably not, because you don’t actually care about that stuff, and neither does anyone else in the general public.
I didn't even know that existed - my impression has always been that "contact your congressman" was the appropriate action if you liked or disliked some proposed regulation, and that you learned about upcoming regulations by being an insider / hoping the media surfaced something relevant to your interests.
For me at least I left because I didn't find the issues of the day terribly interesting - "woke bad" was not wrong but it was tiresome especially when woke was already on the downswing.
Now that we live in interesting times again, it's interesting to come on here and see how the people who have been cheering for Trump to come drain the swamp, fix our budget problems, and Make America Great Again react to the actual methods he's using in the supposed pursuit of that goal, and whether they think America is on track towards being made great.
- Prev
- Next
Huh. If that's actually the case, I think the realignment might be painful in the short term but very good in the long term. The left has lately been the party of "incremental improvement isn't the way, tear it all down and replace it with something better", which is not a great philosophy when the left is the party which controls the institutions. If we could get a realignment to where the party that runs the institutions actually wants to preserve and improve those institutions, that'd be great.
Of course given that we live in the clown world, maybe the right will manage to actually take over our institutions over the long term while retaining a culture of fighting against "the deep state" (that is now their deep state).
More options
Context Copy link