Why it’s important to be pedantic about sigmas and commas
As of the 23rd May 2022 this website is archived and will receive no further updates.
https://understandinguncertainty.org was produced by the Winton programme for the public understanding of risk based in the Statistical Laboratory in the University of Cambridge. The aim was to help improve the way that uncertainty and risk are discussed in society, and show how probability and statistics can be both useful and entertaining.
The BBC reported last week that evidence for the Higgs Boson is “around the two-sigma level of certainty” and provided further explanation:
“Particle physics has an accepted definition for a "discovery": a five-sigma level of certainty. The number of standard deviations, or sigmas, is a measure of how unlikely it is that an experimental result is simply down to chance rather than a real effect.”
This is nice and clear, but it is also wrong, as Kevin McConway pointed out in a previous blog.
The number of sigmas does not say 'how unlikely the result is due to chance': it measures 'how unlikely the result is, due to chance'.
The additional comma may seem staggeringly pedantic (and indeed statisticians have been accused of being even more pedantic about language than lawyers). So what is the problem?
The first, incorrect, 'how unlikely the result is due to chance' applies the term 'unlikely' to the whole phrase 'the result is due to chance', i.e. it says that the hypothesis that the Higgs Boson does not exist is unlikely, or equivalently that it is likely the Higgs Boson exists.
The second, correct, 'how unlikely the result is, due to chance' applies the term 'unlikely' to the data, and just says that the data is surprising, if the Higgs Boson does not exist. It does not imply that it is necessarily likely that the Higgs Boson exists.
Take Paul the Octopus, who correctly predicted 8 football results in a row, which is unlikely (probability 1/256), due to chance. Is it reasonable to say that these results are unlikely to be due to chance (in other words that Paul is psychic)? Of course not, and nobody said this at the time, even after this 2.5 sigma event. So why do they say it about the Higgs Boson?
This is important: people have been wrongly convicted of murder because this comma was left out. The comma needs to be in there.
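For readers who like to see the numbers, here is a rough sketch of the Paul the Octopus example in Python. It separates the two quantities the comma distinguishes: the probability of the data if only chance is at work, and the probability that only chance is at work given the data. The function names, the one-sided convention for converting a probability into sigmas, the one-in-a-million prior for a psychic octopus, and the assumption that a psychic octopus would always be right are all illustrative choices of ours, not anything measured.

```python
from statistics import NormalDist


def one_sided_tail(sigmas: float) -> float:
    """P(Z >= sigmas) for a standard normal Z (one-sided convention, an assumption)."""
    return 1.0 - NormalDist().cdf(sigmas)


def sigmas_for_tail(p: float) -> float:
    """Number of sigmas whose one-sided tail probability equals p."""
    return NormalDist().inv_cdf(1.0 - p)


def posterior_chance(p_data_given_chance: float,
                     p_data_given_effect: float,
                     prior_effect: float) -> float:
    """Bayes' theorem: P(chance | data), given a prior for a real effect."""
    prior_chance = 1.0 - prior_effect
    numerator = p_data_given_chance * prior_chance
    return numerator / (numerator + p_data_given_effect * prior_effect)


# 'How unlikely the result is, due to chance': 8 correct coin-flip guesses in a row.
p_paul = 0.5 ** 8  # = 1/256

# 'How unlikely the result is due to chance' needs a prior. Both numbers below
# are hypothetical: a one-in-a-million prior that Paul is psychic, and a
# psychic Paul who is certain to get all 8 right.
prior_psychic = 1e-6
p_chance_given_data = posterior_chance(p_paul, 1.0, prior_psychic)

print(f"P(data | chance)  = {p_paul:.5f}")
print(f"equivalent sigmas = {sigmas_for_tail(p_paul):.2f}")
print(f"P(chance | data)  = {p_chance_given_data:.5f}")
print(f"one-sided tail at 5 sigma = {one_sided_tail(5.0):.2e}")
```

With these made-up inputs the data are surprising under chance, yet chance remains by far the most plausible explanation. Changing the prior changes the second number but not the first, which is exactly why the number of sigmas alone cannot tell you how likely the Higgs Boson is to exist.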
- david's blog
Comments
Chuckk (not verified)
Sun, 04/12/2011 - 1:22am
Am I missing something?
"...How unlikely it is that an experimental result is simply down to chance rather than a real effect." If it is unlikely to be "simply down to chance," then it is likely to be "a real effect." Which seems to match this:
https://news.slac.stanford.edu/features/word-week-five-sigma
"...Researchers plot the probability that their interesting lump or bump is due to chance alone."
"If that point is more than five sigma....from the center of the bell curve, the probability of it being random is smaller than one in one million."
That makes it sound like they are in fact referring to the probability that the results came about through chance.
In the case of Paul, if he had correctly predicted 100 games in a row, any scientist would indeed say that this was very unlikely to be due to chance, but that some other factor made the results themselves very likely. A good scientist would probably stop short of saying the octopus was psychic, though.
david
Sun, 04/12/2011 - 4:29pm
slac
Sorry, SLAC have it wrong too. It's the common misunderstanding of what a P-value is.
PaulB (not verified)
Mon, 05/12/2011 - 6:29pm
The man in the nonconformist cemetery
Dornfeld
Tue, 03/07/2012 - 4:28am
I think the distinction lies
Glenn (not verified)
Wed, 14/12/2011 - 11:44pm
Where can I get the source
Chris t (not verified)
Thu, 29/12/2011 - 9:43am
Congrats