Higgs: is it one-sided or two-sided?

Announcements about the Higgs Boson are invariably framed in terms of the number of sigmas, with 5-sigmas needed for a ‘discovery’. Media outlets helpfully explain what this means by translating 5-sigmas into a probability, which is almost invariably misreported as the probability of the hypothesis that it is all just statistical error, e.g. “meaning that it has just a 0.00006% probability that the result is due to chance” [Nature] (see the bottom of this post for comments on the misinterpretation).

But the Daily Telegraph says that 5 sigma is equivalent "to meaning it is 99.99997 per cent likely to be genuine rather than a fluke" - this is a P-value of 0.00003%. So is 5-sigmas equivalent to 0.00003% (1 in 3,500,000) or 0.00006% (1 in 1,750,000)?

This reflects whether one is quoting the probability of a Normal observation being more than 5 sigma away from the expected value in the direction of interest (one-sided), or in either direction (two-sided). The two-sided P-value is twice the one-sided, and therefore looks less interesting. The Telegraph is using the one-sided version and Nature the two-sided: who is right?
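Before going further, the two tail areas themselves are easy to compute. A minimal sketch, assuming Python with scipy (my choice of tooling, not anything used in the original reports):

```python
from scipy.stats import norm

# One-sided: probability of a Normal observation landing more than
# 5 sigma above its expected value (the direction of interest).
p_one = norm.sf(5)        # survival function, 1 - CDF
# Two-sided: more than 5 sigma away in either direction.
p_two = 2 * norm.sf(5)

print(f"one-sided: {p_one:.2e}  (about 1 in {1 / p_one:,.0f})")
print(f"two-sided: {p_two:.2e}  (about 1 in {1 / p_two:,.0f})")
# Roughly 2.87e-07 (about 1 in 3.5 million) one-sided,
# and 5.73e-07 (about 1 in 1.75 million) two-sided.
```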

It’s best to go back to a paper from CERN, e.g. the ATLAS team announcing their previous results. There they say that "The significance of an excess is quantified by the probability (p0) that a background-only experiment is more signal-like than that observed", which is excellent and clear. The global P-value is calculated through a sophisticated method that allows for the multiple tests that have been done (the 'look-elsewhere' effect), and the sigma interpretation is given afterwards using the graphs in Fig. 3 of the paper. The translation is clearly equivalent to a one-sided test: for example, they quote 1.4% as being equivalent to 2.2 sigma. And so Nature is wrong: 5-sigmas should be interpreted as a 1 in 3,500,000 chance that such results would happen, if it were all just a statistical fluke.
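That one-sided convention is easy to verify numerically, under the same scipy assumption as the sketch above:

```python
from scipy.stats import norm

# ATLAS quote a global P-value of 1.4% as equivalent to 2.2 sigma.
# Only the one-sided convention reproduces that pairing:
print(norm.isf(0.014))      # approx 2.20 sigma: matches the paper
print(norm.isf(0.014 / 2))  # approx 2.46 sigma: a two-sided reading would give this
```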

This is all rather bizarre: the correct P-value is calculated by the scientists, who translate it into sigmas (using the one-sided interpretation); the sigmas are then translated back into a P-value by journalists, often wrongly as a two-sided figure.

What is a P-value anyway?

As discussed previously, the P-values are almost invariably interpreted incorrectly. The probability, or P-value, refers to the probability of getting such an extreme result, were there really nothing special going on. The probability should be applied to the data, not the hypothesis. This may seem pedantic, but people have been convicted of murder (Sally Clark) because this mistake was made in court. This quantumy blog gets it right and gives more explanation. The BBC website now has a reasonably good, if slightly ambiguous, definition:

“The number of sigmas measures how unlikely it is to get a certain experimental result as a matter of chance rather than due to a real effect”

but would be much much better if there were a comma after the word ‘result’.
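To make the definition concrete, here is a small simulation sketch (numpy assumed, and entirely my own illustration): generate a large number of background-only experiments and see how often the test statistic comes out at least as extreme as a given threshold.

```python
import numpy as np

rng = np.random.default_rng(1)

# Many 'background-only' experiments: each produces a Normally
# distributed test statistic, with no real effect present.
n = 10_000_000
stat = rng.standard_normal(n)

# The P-value at a threshold is the long-run fraction of such
# null experiments giving a result at least that extreme.
threshold = 3.0  # 3 sigma; 5 sigma would need vastly more simulations
print(f"estimated one-sided P-value: {np.mean(stat >= threshold):.2e}")
# Approx 1.35e-03: a statement about the data under the null
# hypothesis, not about the hypothesis itself.
```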

Comments

Thanks for your expertise on this one, David - Nature has corrected its story and won't be making this mistake again, I hope.

I noticed reporters applying the probability to the hypothesis in every article that I read which discussed the evidence. This motivated a blog post in which I discuss how reporters (mis)interpret p-values and why it matters. I'm surprised you didn't focus more on that--it seems way more important. I'd be curious to see if you agree, though. http://blog.carlislerainey.com/2012/07/07/innumeracy-and-higgs-boson/

It seems clear that most scientists (and many statisticians) misinterpret Pr(data or more extreme|H0) as Pr(H0|data). It also seems clear that scientific questions concern Pr(H0|data), and that Pr(data or more extreme|H0) is more for quality control purposes. Yet, in science and in many other areas where decisions are based on data, we continue to use sampling theory. My impression is that it is simply not possible to obtain Pr(H0|data) or Pr(H1|data) without resorting to Bayes' rule. Is this true? ... or is there some secret sampling theory argument that can yield Pr(H0|data)... or some approximation to it ... without Bayes' rule? Thank you.
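As far as I know that is true: sampling theory alone cannot deliver Pr(H0|data); a prior is needed. A toy Bayes' rule calculation (all numbers below are illustrative assumptions, nothing from the actual analyses) shows how much the answer depends on that prior:

```python
# Toy posterior for H0 via Bayes' rule, for two different priors.
# All numbers are illustrative assumptions.
p_data_h0 = 2.9e-7  # stand-in for Pr(data|H0): the one-sided 5-sigma tail
p_data_h1 = 0.5     # assumed Pr(data|H1), i.e. the power of the experiment

for prior_h0 in (0.5, 0.999999):
    num = p_data_h0 * prior_h0
    post = num / (num + p_data_h1 * (1 - prior_h0))
    print(f"Pr(H0) = {prior_h0}:  Pr(H0|data) = {post:.3g}")
# An even-odds prior gives Pr(H0|data) of about 5.8e-07, but a highly
# sceptical prior of 0.999999 leaves Pr(H0|data) near 0.37: the same
# data, very different posteriors, driven entirely by the prior.
```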

0.00003% or 1 in 3,333,333, and 0.00006% or 1 in 1,666,666