In many discussions I'm pulled back to the distinction between not-guilty and innocent as a way to demonstrate how the burden of proof works and what the true default position should be in any given argument. A lot of people have no problem seeing the distinction, but many intelligent people, for some reason, don't.
In this article I explain why the distinction exists and why it matters, particularly in real-life scenarios where people try to shift the burden of proof.
Essentially, in my view the universe we are talking about is {uncertain, guilty, innocent}, therefore not-guilty is guilty' (the complement of guilty), which is {uncertain, innocent}. Therefore innocent ⇒ not-guilty, but not-guilty ⇏ innocent.
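To make the set logic concrete, here is a minimal sketch in Python (my own illustration, not part of the original argument), modeling the three epistemic states as a plain set:

```python
# A minimal sketch of the verdict logic above (illustration only): the three
# epistemic states modeled as plain Python sets.
universe = {"uncertain", "guilty", "innocent"}
guilty = {"guilty"}

# not-guilty is the complement of guilty within the universe
not_guilty = universe - guilty   # {"uncertain", "innocent"}

innocent = {"innocent"}

# innocent => not-guilty: every innocent verdict is also a not-guilty one
assert innocent <= not_guilty

# not-guilty =/=> innocent: "uncertain" is not-guilty but not innocent
assert not (not_guilty <= innocent)
```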
When O. J. Simpson was acquitted, that doesn't mean he was found innocent; it means the prosecution could not prove his guilt beyond reasonable doubt. He was found not-guilty, which is not the same as innocent. It very well could be that the jury found the truth of the matter uncertain.
This notion has implications in many real-life scenarios where people try to shift the burden of proof onto you for rejecting an unsubstantiated claim. They wrongly assume you are claiming their claim is false (equivalent to innocent), when in truth all you are doing is staying in the default position (uncertain).
Rejecting the claim that a god exists is not the same as claiming a god doesn't exist: it carries no burden of proof because it's the default position. Agnosticism is the default position. The burden of proof is on the people making the claim.
Are you sure about that? Maybe you consider the distribution, and maybe some Bayesians do consider the distribution, but I've debated Scott Alexander, and I'm pretty sure he used a single number to arrive at the conclusion that doing something was rational.
I've been writing about uncertainty on my Substack and I've gotten a substantial amount of pushback regarding established concepts such as the burden of proof, that not-guilty is not the same as innocent, and that the null hypothesis implies uncertainty. Even ChatGPT seems to be confused about this.
I'm pretty certain that most people--even rationalists--do not factor in uncertainty by default, which is why I don't think Bayesians thoroughly consider the difference between 0/0, 50/50, and 500/500.
Now this one is a question that a Bayesian would typically answer with a single number. In reality it is either true or false, but due to uncertainty I'd use a single number and say something like, I'm 98% sure. This works for any event, or yes-no question. Will Russia detonate a nuclear weapon in 2023? Etc. You could say you're also giving a distribution here, but since you can only have two outcomes, once I say I'm 6% sure Russia will throw a nuke, I'm also saying 94% they won't, so the full distribution is defined by a single number.
But the coin bias in reality is not true or false. It's 45%, 50.2%, or any number 0-100, so you need a full distribution.
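As a sketch of what "a full distribution" could look like here (my own illustration, assuming a uniform Beta(1, 1) prior over the bias updated with the observed counts):

```python
# Sketch: a "full distribution" over a coin's bias, assuming a uniform
# Beta(1, 1) prior updated with the observed heads/tails counts.
from scipy import stats

def bias_posterior(heads, tails):
    """Posterior distribution over the coin's bias after the observed tosses."""
    return stats.beta(1 + heads, 1 + tails)

for heads, tails in [(0, 0), (5, 5), (50, 50), (500, 500)]:
    post = bias_posterior(heads, tails)
    lo, hi = post.interval(0.95)  # 95% credible interval for the bias
    print(f"{heads}/{tails}: mean={post.mean():.3f}, 95% interval=({lo:.3f}, {hi:.3f})")
```

The interval narrows as the counts grow, which is the sense in which 0/0, 50/50, and 500/500 differ even though all three have the same mean.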
I don't know what to say, I haven't read it. I'll take a guess that there was talking past each other. It seems to me you think Bayesians around here don't like to consider the uncertainty behind those three concepts. But it's the other way around: Bayesians want a finer description of the uncertainty than those three concepts allow. It's like with the coin example, where you suggested two numbers but a Bayesian would use a full distribution.
So, when there's a binary event, like the Russia nuke question, a Bayesian says 6% probability, but a "burden-of-proofer" may say "I think the people that claim Russia will throw a nuke have the burden of proof", a null-hypothesis-er would say "the null hypothesis is that Russia will not throw a nuke", etc. These concepts don't give the uncertainty with enough resolution for a Bayesian. They only give you 2 or 3 options: burden on one side vs the other, guilty vs innocent vs not-guilty.
I'm not asking if the coin is biased, I'm asking if the next coin flip will land heads. It's a yes-or-no question that Bayesians would use a single number to answer.
No, I say "I don't know" (uncertain), which cannot be represented with a single probability number.
Yeah
But at first it seems to me you were talking about the bias and what you can learn about it from repeated tosses (and were confused in thinking Bayesians wouldn't learn).
So, like we've talked about, they'd use many numbers to compute the probability of the yes-no question; they just give the final answer as one number. Bayesians do consider uncertainty, to whatever level they feel they need. What they don't do is give uncertainties about uncertainties in their answers. And they see the probability of the next toss being heads as equivalent to "how certain am I that it's going to be heads?" (to a Bayesian, probabilities are also uncertainties in their minds, not just facts about the world). IIUC, you would be happy saying you believe the next toss has 50%±20% chances of being heads. Why not add uncertainty to the 20% too, since you are not sure it should be exactly 20%, as in 50%±(20±5)%? If that feels redundant in some sense, that's how a Bayesian feels about saying "the coin will come up heads, I'm 50% sure, but I'm only 30% sure of how sure I am". If it doesn't feel redundant, add another layer until it does :P
Still, I think I see your point in part. There is clearly some relevant information that's not being given in the answer if the answer to "will this fair coin land heads?", 50%, is the same as the answer given to "plc ashetn ðßh sst?" (well-posed question in a language I just invented), now a lame 50% meaning "the whaat huuhhh?".
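As a rough sketch of the "many numbers to compute, one number to answer" point above (my own illustration, again assuming a Beta posterior over the bias):

```python
# Sketch: how a Bayesian collapses a whole posterior over the coin's bias into
# the single number reported for "will the next toss land heads?".
# Assumes a Beta posterior after observing 50 heads and 50 tails.
from scipy import stats

posterior = stats.beta(1 + 50, 1 + 50)

# P(next toss is heads) = average of the bias over the posterior,
# approximated here by Monte Carlo sampling from the posterior.
samples = posterior.rvs(size=100_000, random_state=0)
print(samples.mean())      # ~0.5: one number, computed from the full posterior
print(posterior.mean())    # the same quantity, obtained analytically
```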
But they don't do that, they give a single number. Whatever uncertainty they had at the beginning is encoded in the number 0.5. Later on, when their decision turns out to be wrong, they claim it wasn't wrong, because they arrived at that number rationally and nobody would have arrived at a better number.
It's not just about how the answer is given, it's about how the answer is encoded in your brain.
If the answer to some question is "blue", it may not be entirely incorrect, but later on when you are asked to recall a color you might very well pick any blue color. On the other hand if your answer was "sky blue", well then you might pick a more accurate color.
I claim the correct answer should be 50%±50%, but Bayesians give a single answer: 50%, in which case my answer, uncertain, is way better.

The correct answer depends on what the question is.
If the question is "what's the color of that thing you last saw 5 days ago?", Bayesians would be just like you and answer "blue" and not "sky blue #020fe8".
When you ask "how will the next coin toss land?", an answer disregarding uncertainty would be "it will be heads". An answer that takes uncertainty into account could be "I'm almost sure it will be heads", or "I suspect it will be tails", or "I haven't got a clue". A Bayesian would phrase those as "95%" (almost sure heads), "40%" (suspect tails), or "50%" (no idea).
In Bayesianese, answering that specific question with "50%+-50%" would mean something like "I have no clue if I have a clue whether the next coin toss will be heads or tails", which sounds weird. So I am inferring that you mean "50%+-50%" as an answer to a slightly different question, such as "how frequently would this coin land heads over many tosses?", which one may phrase as "what's the probability that this coin comes up heads if I toss it?"; but then with this phrasing, a (subjective) Bayesian during a nitpicky philosophical discussion might parse it as "how will the next coin toss land (please, answer in a way that conveys your level of uncertainty)?". That's why I suspect there was talking past each other in your discussions with other people.
But that's not Bayesian. That's the whole point. And you accepted they use a single number to answer.
You: They use a single number for probabilities. They should use 2 like 50%+-20%
Me: Yes, they use a single number. No, they shouldn't use two when they interpret probability as meaning subjective uncertainty. They should if they interpret it to mean something objective.
You: They don't learn from multiple coin tosses, they would need more than one number for that.
Me: They do learn. They use many numbers to compute.
You: They don't take uncertainty into account.
Me: They do, the probability is the uncertainty of the event.
You: 50%+-20% is analogous to saying "blue" whereas saying 50% is analogous to saying "sky blue".
Me: Not if probability means uncertainty. Then 50% maps to "blue", and 50%+-20% maps to nonsense.
You: My answer is correct.
Me: It depends on the question.
I'm not sure what's left here to discuss. I didn't get this follow-up.
Right. Which is what that very sentence you half quoted explains.
They don't. The probability that the next coin flip is going to land heads is the same: for 0/0, 50/50, or 5000/5000 it is 0.5. It does not get updated.

No. It's not. p=0.5 is not the uncertainty.

I didn't say that.

Which is not the case.

There is no other question. I am asking a specific question, and the answer is p=0.5, there's no "it depends". p=0.5 is the probability the next coin flip is going to land heads. Period.

I'm going to attempt to calculate the values for n number of heads and tails with 95% confidence so there's no confusion about "the question":

0/0: 0.5±∞
5/5: 0.5±0.034
50/50: 0.5±0.003
5000/5000: 0.5±0.000

It should be clear now that there's no "the question". The answer for Bayesians is p=0.5, and they don't encode uncertainty at all.
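For reference, here is a minimal sketch (my own, not from the discussion) of one standard way to compute such intervals, using the normal-approximation (Wald) formula; the exact widths depend on the interval method chosen and may not match the figures quoted above:

```python
# Sketch of one standard way (normal approximation, a.k.a. Wald interval) to
# compute a 95% confidence interval for the heads rate from heads/tails
# counts. Exact widths depend on the method chosen.
import math

def wald_interval_95(heads, tails):
    n = heads + tails
    if n == 0:
        return 0.5, math.inf  # no data: the interval is unbounded
    p = heads / n
    half_width = 1.96 * math.sqrt(p * (1 - p) / n)
    return p, half_width

for heads, tails in [(0, 0), (5, 5), (50, 50), (5000, 5000)]:
    p, hw = wald_interval_95(heads, tails)
    print(f"{heads}/{tails}: {p:.1f}±{hw:.3f}")
```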