# Why it’s important to be pedantic about sigmas and commas

The BBC reported last week that evidence for the Higgs Boson is “*around the two-sigma level of certainty*” and provided this further explanation:

“Particle physics has an accepted definition for a "discovery": a five-sigma level of certainty. The number of standard deviations, or sigmas, is a measure of how unlikely it is that an experimental result is simply down to chance rather than a real effect.”

This is nice and clear, but it is also wrong, as we have pointed out in a previous blog post by Kevin McConway.

The number of sigmas does not say *'how unlikely the result is due to chance'*: it measures *'how unlikely the result is, due to chance'*.

The additional comma may seem staggeringly pedantic (and indeed statisticians have been accused of being even more pedantic about language than lawyers). So what is the problem?

The first, incorrect, *'how unlikely the result is due to chance'* applies the term ‘*unlikely*’ to the whole phrase ‘*the result is due to chance*’, i.e. it says that the hypothesis that the Higgs Boson does not exist is unlikely, or equivalently that it is likely the Higgs Boson exists.

The second, correct, *'how unlikely the result is, due to chance'* applies the term *'unlikely'* to the data, and just says that the data is surprising, if the Higgs Boson does not exist. It does not imply that it is necessarily likely that the Higgs Boson exists.
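The distinction can be written down with Bayes' theorem. This is a sketch in our own notation, not something taken from the BBC article: H0 stands for 'the Higgs Boson does not exist', H1 for 'it exists', and D for the observed data (or data at least as extreme). The number of sigmas is a restatement of the first quantity below; the incorrect reading treats it as the second.

```latex
% What the number of sigmas measures:
% how unlikely the data are, if chance alone is at work
P(D \mid H_0)

% What the incorrect reading claims:
% how likely it is that chance alone is at work, given the data
P(H_0 \mid D) = \frac{P(D \mid H_0)\, P(H_0)}
                     {P(D \mid H_0)\, P(H_0) + P(D \mid H_1)\, P(H_1)}
```

Getting from the first to the second needs the prior probabilities P(H0) and P(H1), which the sigma level does not supply.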

Take Paul the Octopus, who correctly predicted 8 football results in a row, which is *unlikely (probability 1/256), due to chance*. Is it reasonable to say that these results are *unlikely to be due to chance* (in other words that Paul is psychic)? Of course not, and nobody said this at the time, even after this 2.5 sigma event. So why do they say it about the Higgs Boson?
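The Paul arithmetic can be checked in a few lines. The following is a minimal Python sketch, not part of the original argument: the function names are ours, the conversion to sigmas assumes a one-sided Normal tail (other conventions give slightly different figures), and both the one-in-a-million prior that an octopus is psychic and the assumption that a psychic octopus would always be right are invented purely for illustration.

```python
from statistics import NormalDist


def one_sided_sigma(p_value: float) -> float:
    """Sigmas whose one-sided upper Normal tail probability equals p_value."""
    return NormalDist().inv_cdf(1.0 - p_value)


def posterior_chance(p_data_given_chance: float,
                     p_data_given_effect: float,
                     prior_effect: float) -> float:
    """P(chance | data) by Bayes' theorem, with two competing hypotheses."""
    prior_chance = 1.0 - prior_effect
    numerator = p_data_given_chance * prior_chance
    return numerator / (numerator + p_data_given_effect * prior_effect)


# How unlikely the result is, due to chance: 8 correct coin-flip guesses.
p_chance = 0.5 ** 8  # 1/256

# The same statement expressed as a number of sigmas (one-sided convention).
sigma = one_sided_sigma(p_chance)

# How unlikely the result is due to chance: this needs a prior.
# ASSUMPTIONS (illustrative only): prior of 1e-6 that Paul is psychic,
# and a psychic octopus predicts all 8 results with probability 1.
post = posterior_chance(p_chance, 1.0, 1e-6)

print(f"P(data | chance) = {p_chance:.5f}")
print(f"sigma            = {sigma:.2f}")
print(f"P(chance | data) = {post:.5f}")
```

With these assumed inputs the conversion gives a figure between 2.5 and 3 sigma, and the probability that the results were due to chance stays above 99.9%: the data are surprising, but chance remains by far the better explanation.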

This is important: people have been wrongly convicted of murder because this comma was left out. The comma needs to be in there.

- david's blog

## Comments

Chuckk (not verified)

Sun, 04/12/2011 - 1:22am

## Am I missing something?

"...How unlikely it is that an experimental result is simply down to chance rather than a real effect." If it is unlikely to be "simply down to chance," then it is likely to be "a real effect." Which seems to match this:

https://news.slac.stanford.edu/features/word-week-five-sigma

"...Researchers plot the probability that their interesting lump or bump is due to chance alone."

"If that point is more than five sigma....from the center of the bell curve, the probability of it being random is smaller than one in one million."

That makes it sound like they are in fact referring to the probability that the results came about through chance.

In the case of Paul, if he had correctly predicted 100 games in a row, any scientist would indeed say that this was very unlikely to be due to chance, but that some other factor made the results themselves very likely. A good scientist would probably stop short of saying the octopus was psychic, though.

david

Sun, 04/12/2011 - 4:29pm

## slac

Sorry, SLAC have it wrong too. It's the common misunderstanding of what a P-value is.

PaulB (not verified)

Mon, 05/12/2011 - 6:29pm

## The man in the nonconformist cemetery

a priori probabilities.

Dornfeld

Tue, 03/07/2012 - 4:28am

## I think the distinction lies

Glenn (not verified)

Wed, 14/12/2011 - 11:44pm

## Where can I get the source

Chris t (not verified)

Thu, 29/12/2011 - 9:43am

## Congrats

Willard111

Fri, 05/12/2014 - 10:39am

## Nice One