The risks of Big Data – or why I am not worried about brain tumours.

As of the 23rd May 2022 this website is archived and will receive no further updates.

This website was produced by the Winton programme for the public understanding of risk, based in the Statistical Laboratory at the University of Cambridge. The aim was to help improve the way that uncertainty and risk are discussed in society, and to show how probability and statistics can be both useful and entertaining.

In a careful study published last week, Socioeconomic position and the risk of brain tumour: a Swedish national population-based cohort study, the authors examined the association between the socio-economic status of men and women in Sweden and the diagnosis of brain tumours over 18 years. One of the main findings is shown below.


Part of a table showing a significantly increased incidence of gliomas in men with more education: the final column shows the relative risks adjusted for marital status and income.

The press release was moderately over-enthusiastic. It correctly said that no causal interpretation could be made, and that no adjustment had been made for lifestyle confounders such as alcohol consumption. However, it did not mention the authors' warning that people of higher economic status are likely to have a greater tendency to seek care, and hence there could be reporting bias. And the press release did say that "A university degree is linked to a heightened risk of developing a brain tumour", even though the jump in incidence occurred at secondary level rather than university, and the study was primarily about socio-economic status, not education.

But one important feature was not mentioned at all: the small size of the apparent association. A 19% increase between the lowest and highest educational levels is much lower than is found for many cancers. The paper reported that 3,715 gliomas were diagnosed in over 2,000,000 men over 18 years and so, following the standard recommendation to translate relative risks into changes in absolute risk, this means that:

  • Out of around 3,000 men of the lowest educational level, we would expect 5 gliomas to be diagnosed over the 18 years.
  • Out of around 3,000 men of the highest educational level, we would expect 6.
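
The arithmetic behind these two figures can be sketched as follows. This is only a rough check using the rounded numbers quoted above, not the paper's own calculation: the function names are made up for illustration, the population is taken as exactly 2,000,000 although the paper says "over", and the baseline of 5 per 3,000 is the figure given in the text for the lowest educational level.

```python
def expected_cases(cases, population, group_size):
    """Scale an observed count in a population down to a smaller group."""
    return cases / population * group_size


def apply_relative_risk(baseline_cases, relative_risk):
    """Expected cases in a group whose risk is `relative_risk` times the baseline."""
    return baseline_cases * relative_risk


# Rounded figures quoted in the text (assumption: population is exactly 2,000,000).
overall = expected_cases(cases=3715, population=2_000_000, group_size=3000)

# Baseline of 5 per 3,000 for the lowest educational level, as stated in the text.
lowest = 5
highest = apply_relative_risk(lowest, relative_risk=1.19)

print(f"Average across all men: about {overall:.1f} gliomas per 3,000 over 18 years")
print(f"Lowest educational level: {lowest} per 3,000")
print(f"Highest educational level: about {highest:.2f} per 3,000")
```

The overall average falls between the two groups, as it should, and a 19% relative increase on a baseline of 5 amounts to roughly one extra diagnosis per 3,000 men over 18 years.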

This gives a somewhat different impression of the findings, and is in fact rather reassuring. Such a small increase in the risk of a rare cancer could only be found to be ‘statistically significant’ when huge numbers of people are studied: in this case over 4,000,000 men and women. So for me the main lessons from this good scientific study are:

  • that ‘big data’ can easily lead to findings that are statistically but not practically significant
  • that I should not be concerned that my degrees are going to give me a brain tumour.
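
To see why a difference of 5 versus 6 per 3,000 needs huge numbers before it becomes ‘statistically significant’, here is a minimal sketch. It is not the method used in the paper, which fitted adjusted models. It assumes two groups of equal size, treats the counts as Poisson, and uses the usual large-sample approximation for the standard error of a log rate ratio; the function name and the group sizes are purely illustrative.

```python
import math


def rate_ratio_z(cases_low, cases_high):
    """Approximate z-score for the log rate ratio of two equal-sized groups.

    Assumes Poisson counts, so the standard error of the log ratio is
    roughly sqrt(1/cases_low + 1/cases_high).
    """
    log_ratio = math.log(cases_high / cases_low)
    standard_error = math.sqrt(1 / cases_low + 1 / cases_high)
    return log_ratio / standard_error


# Keep the same rates (5 and 6 per 3,000) and only increase the group size.
for group_size in (3_000, 30_000, 300_000, 3_000_000):
    scale = group_size / 3_000
    z = rate_ratio_z(5 * scale, 6 * scale)
    verdict = "significant" if abs(z) > 1.96 else "not significant"
    print(f"{group_size:>9,} per group: z = {z:.2f} ({verdict} at the 5% level)")
```

The size of the effect is identical in every row; only the number of people changes. Under these assumptions the difference stays well within chance variation for groups of a few thousand, and crosses the conventional threshold only once each group runs to hundreds of thousands.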

But it’s just as well I don’t take headlines too seriously, as otherwise the classic example below would be very worrying...


A classic headline from the Daily Mirror, a fine contender for the Most Misleading Scientific Headline of the Year, 2016.