david's blog
http://understandinguncertainty.org/davidsblog
How many hours of life did Obama lose in Delhi?
http://understandinguncertainty.org/how-many-hours-life-did-obama-lose-delhi
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p><img src="/sites/understandinguncertainty.org/files/obama-delhi.png" width="600" height="405" alt="obama-delhi.png" /></p>
<p>President Barack Obama recently spent 3 days in Delhi, and it’s claimed that during this period the air pollution knocked 6 hours off his life. So who was responsible for this number?</p>
<p>Well … me. I gave this estimate to a Bloomberg journalist based in New Delhi who had got hold of me through some <a href="http://qz.com/197695/how-dangerous-is-the-air-today-count-how-many-microlives-pollution-is-stealing-from-you/">previous coverage</a> of our concept of ‘microlives’ applied to air pollution.</p>
<p>So where did I get the ‘6 hours’ figure from? We need to go through a number of stages.</p>
<ol><li>An <a href="http://www.nejm.org/doi/full/10.1056/NEJMsa0805646">authoritative study </a>in the New England Journal of Medicine examined changes over 40 years of fine-particulate data in 211 counties in the USA, and estimated that a decrease of 10 μg/m3 in PM2.5 (fine particulates) in living environment is associated with a gain of 0.61 years in life expectancy. </li>
<li>This corresponds to around 1% of an adult life of say 55 years, and so, looking at it from a negative point-of-view, we could also say that living somewhere with an increase of 10 μg/m3 in PM2.5 is associated with around 1% off your life expectancy.</li>
<li>1% of a day is about 15 minutes. So, pro-rata, exposure to an extra 10 μg/m3 in PM2.5 is equivalent to losing 15 minutes of life expectancy per day.</li>
<li>Smoking 20 cigarettes a day takes around 10 years off your life-expectancy, say 20% of an adult lifetime. So, again pro-rata, it’s as if a daily cigarette is taking 1% off your life-expectancy, or 15 minutes each day.</li>
<li>Thus, very roughly, a day exposed to an additional 10 μg/m3 in PM2.5 is equivalent to smoking a cigarette.</li>
<li>We can now make a comparison between any two cities, say Washington (average daily exposure 15 μg/m3 of PM2.5), and Delhi (average 84 μg/m3 of PM2.5 during Obama's visit). The difference is around 70 μg/m3, so the published estimates suggest this would take roughly 7 x 0.61 = 4 years off an average permanent resident’s life. And each day’s exposure to this is equivalent to smoking 7 or 8 cigarettes, and it is as if it takes 2 hours off their life.</li>
<li>Using this metric, Obama’s 3-day stay could be considered similar to smoking around 24 cigarettes, taking 6 hours off his life expectancy.</li>
</ol><p>A few notes.</p>
<p>According to Bloomberg, Delhi averaged a rather frightening 157 μg/m3 of PM2.5 in 2013. Compared to Washington, or London, this is around 140 μg/m3 higher, equivalent to 14 cigarettes a day, around 8 years off your life, or 3.5 hours off for each day. Another way of looking at this is that the residents of Delhi are going towards their deaths at around 27 or 28 hours each day, rather than the standard 24.</p>
<p>An alternative way of describing this is through the ‘microlife’. Since 57 years, around an adult lifetime, is a million half-hours, we can consider losing 30 minutes of life expectancy as essentially costing you a millionth of your adult life, or what we call a ‘microlife’. Thus 2 cigarettes, or around 6 hours in Delhi during Obama's stay, is equivalent to losing a microlife, and an average day in Delhi may cost about 4 to 7 microlives, depending on the PM2.5 levels. See the <a href="http://en.wikipedia.org/wiki/Microlife">Wikipedia article</a> and this <a href="http://www.bmj.com/content/345/bmj.e8223"> British Medical Journal</a> paper for more discussion and examples. </p>
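<p>For anyone who wants to check the arithmetic, the chain of conversions above can be run in a few lines of Python. This is a rough sketch using the round numbers quoted in the post; the headline '6 hours' comes from rounding up at each stage (7 cigarettes a day up to 8, i.e. 2 hours a day over 3 days):</p>

```python
# Rough sketch of the conversion chain: all figures are the approximate
# values quoted in the post, not precise measurements.
pm25_delhi = 84        # average PM2.5 during Obama's visit, ug/m3
pm25_washington = 15   # typical Washington average, ug/m3

extra = pm25_delhi - pm25_washington       # about 70 ug/m3 of extra exposure
cigarettes_per_day = extra / 10            # a day at +10 ug/m3 is roughly 1 cigarette
minutes_per_day = cigarettes_per_day * 15  # 1 cigarette is roughly 15 minutes of life expectancy

days = 3
total_hours = days * minutes_per_day / 60
print(round(cigarettes_per_day, 1))   # 6.9, the 'smoking 7 or 8 cigarettes' a day
print(round(total_hours, 1))          # 5.2, rounded up to the headline 6 hours
```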
<p>To try and pre-empt criticism, I fully understand that I am committing two cardinal sins of epidemiology. First, I am, at least implicitly, implying causality where the data can only provide an association. However the causal harm of cigarettes is now undisputed, while that of fine particulates is fairly non-controversial (and the New England Journal of Medicine study is longitudinal). </p>
<p>Second, and more important, I am taking measures that have been derived from large populations followed over long periods, and applying them to individuals experiencing a single day. But I regard this as a valid form of ‘numerical metaphor’. It is not literally true, as it is impossible to say what the effect of a short stay in Delhi was on Barack Obama’s long-term health. I certainly cannot prove he lost six hours life-expectancy. But I bet it didn't do him any good. </p>
<p>This is an unashamedly 'popular' way of communicating chronic risk. And judging by the large additional coverage of this story, it seems to have worked. And I should add that I really like Delhi, and will be back there in the autumn, in spite of the air. </p>
</div></div></div>
Sun, 01 Feb 2015 15:04:50 +0000 david

Luck and Cancer
http://understandinguncertainty.org/luck-and-cancer
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>I was on Radio 4 PM (<a href="http://www.bbc.co.uk/programmes/b0089nbb">starting at 37:09</a>) and BBC News Channel yesterday discussing the study published in Science "<a href="http://www.sciencemag.org/content/347/6217/78.abstract"> Variation in cancer risk among tissues can be explained by the number of stem cell divisions </a>". This had been reported by much of the press as showing that <a href="http://www.express.co.uk/life-style/health/549764/Study-reveals-cancer-down-to-just-sheer-bad-luck ">“the majority of cancer cases are down to sheer bad luck”</a>. But the study made no such claim, and so how did these headlines come about? </p>
<p>The main findings are shown in the summary figure from <a href="http://news.sciencemag.org/biology/2015/01/simple-math-explains-why-you-may-or-may-not-get-cancer">Science's News section</a>, reproduced in a slightly edited form below.</p>
<div class="captionCentre"><img src="/sites/understandinguncertainty.org/files/cancer-luck.png" width="576" height="533" alt="cancer-luck.png" /><p class="caption">Figure from Science: blue dots added by me indicating the types the authors claim form a 'cluster'. The headline underneath is Science's own, and is misleading in that it takes an analysis of aggregate risks in populations and applies it to individuals.</p>
</div>
<p>This shows that organs with a large number of lifetime stem-cell divisions have higher incidences of cancer, which is hardly very surprising. The correlation is 0.81, which is squared to produce an $R^2$ of 0.65, which they interpret as meaning around two-thirds of the variation in incidence rates is explained by chance mutations of stem-cells. The authors <a href="http://www.sciencemag.org/content/347/6217/78.abstract">conclude in their abstract</a> that “only a third of the variation in cancer risk among tissues is attributable to environmental factors or inherited predispositions”, which may be a fairly reasonable statement to make about population rates in different tissues, but of course says nothing about variation in risks between individuals, and certainly does not say that two-thirds of <em>cases</em> are just luck.</p>
<p>But you can see how this mistake happens. For example, <a href="http://www.reuters.com/article/2015/01/01/health-cancer-luck-idUSL1N0UE0VF20150101">Reuters report the findings</a> as “two-thirds of cancer incidence of various types can be blamed on random mutations”, which is reasonable, but the sub-editor then produces the headline “Biological bad luck blamed in two-thirds of cancer cases”. But this misinterpretation is perhaps hardly surprising as Science itself writes on its main web-site <a href="http://www.sciencemag.org ">“Analysis linking number of stem cell divisions to different cancer risks suggests most cancer cases can’t be prevented”</a>. If feeling generous, this could be interpreted as reporters ‘simplifying’ the language, and so getting it wrong.</p>
<p>But the authors themselves cannot escape blame. They proceeded to use some over-elaborate cluster analyses to claim that the points fell into two groups: 9 cancers that had a higher incidence than expected due to the random mutations [which I have marked as blue in the Figure above], and the remainder (22) whose incidence was explainable by chance alone. But this separation into two groups seems fairly arbitrary: can you see two natural clusters in the Figure? No, neither can I. </p>
<p>And even if the cluster analysis were OK, the authors only selected cancers for which stem cell divisions could be estimated, and did not include common cancers such as breast and prostate, and yet broke osteosarcoma down into five categories. So any claim about proportions of cancer-types is misleading. But whether the ‘two-thirds’ came from the $R^2$, or the 22/31 cancer types that were 'just luck', journalists almost universally interpreted this as ‘the majority of cancer cases”, which had never been claimed by the authors.</p>
<p>Overall, any blame for inappropriate reporting does not lie only with the media. It also lies with the scientists and the way the journal reported the study: see the recent study that <a href="http://www.sciencemediacentre.org/abandon-hype-all/">showed in detail the inappropriate coverage stemming from press releases</a>.</p>
<p>In the end it is perhaps not worth making much fuss about, as -
</p><ul><li>It’s already recognised that the majority of cancers are not preventable by lifestyle changes: <a href="http://www.cancerresearchuk.org/about-us/cancer-news/press-release/2014-12-26-lifestyle-behind-more-than-half-a-million-cancers-in-five-years">Cancer Research UK’s analysis said around 40%</a> might be preventable, which means 60% are not</li>
<li>To quote <a href="http://ije.oxfordjournals.org/content/33/6/1183.short">Sir Richard Doll</a>, "whether an exposed subject does or does not develop a cancer is largely a matter of luck".
</li>
</ul><p>So the basic messages that were reported were perhaps rather accurate, even if they were not justified by the study itself.</p>
<p>NB See also <a href="http://pb204.blogspot.co.uk/2015/01/science-by-press-release.html ">Plumbum's fine blog </a>on this, which has further links to other commentaries.</p>
<p>PS [added 3rd Jan] It is important, but tricky, to distinguish the study's concern with the role of random mutations in <em>population</em> cancer incidence in different tissues, from the role of luck in an <em>individual</em> getting cancer. The media have (incorrectly) generally written the story as if it concerned the latter. One way of seeing that the analysis in the Science paper does not address the role of 'luck' in the individual case is to think of what they might have said had their correlation been zero. Would this have meant that there was no "luck" in who got cancer? Obviously not. See <a href="http://www.dcscience.net/Davey-Smith-2011.pdf">George Davey-Smith's 2011 lecture</a> for a lot more on this.</p>
<p>PPS [added 4th Jan] To elaborate on Doll's idea that getting cancer is "largely a matter of luck": risk factors such as smoking can increase an individual's risk, but whether one actually gets a cancer is still a matter of unpredictable chance. On PM I used my usual analogy of a lottery: there are tickets in a bucket marked cancers of different types, and a lot of blank tickets (and some marked 'run over by bus' etc). Smoking means you might get 20 times as many 'lung-cancer' tickets, but you still may be lucky and not draw one: many smokers don't get lung cancer. So chance plays a very strong role, even in so-called preventable cancers. This leads to the apparently paradoxical observation that most lung cancers are 'caused' by smoking, while all lung cancers are also a matter of bad luck. As pointed out by George Davey-Smith, it is not an either/or argument.</p>
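<p>The bucket analogy is easy to make concrete. The ticket counts below are purely illustrative (they are not real incidence rates), but they show how even a 20-fold relative risk leaves most smokers drawing a blank:</p>

```python
# Purely illustrative ticket counts for the lottery-bucket analogy;
# these are NOT real cancer incidence rates.
tickets_in_bucket = 10_000
lung_tickets_nonsmoker = 5                          # baseline 'lung cancer' tickets
lung_tickets_smoker = 20 * lung_tickets_nonsmoker   # smoking: 20 times as many tickets

p_smoker = lung_tickets_smoker / tickets_in_bucket  # 0.01 in this toy example
p_blank_for_smoker = 1 - p_smoker
print(p_blank_for_smoker)   # 0.99: even at 20x the risk, most smokers draw a blank
```

So a 20-fold increase in risk is entirely compatible with most of the higher-risk group never getting the disease, which is exactly Doll's point about luck.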
</div></div></div>
Sat, 03 Jan 2015 12:19:39 +0000 david

Sub-editing in the Times
http://understandinguncertainty.org/sub-editing-times
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>A <a href="http://www.thetimes.co.uk/tto/health/news/article4283519.ece">story in Monday's Times </a>had the following dramatic headline:<br /><img src="/sites/understandinguncertainty.org/files/u1/errors-screening-headline-sm.jpg" width="600" height="170" alt="errors-screening-headline.jpg" /></p>
<p>I started the article with interest, wondering what flaws in the breast screening programme had been exposed. But the article turned out to be a good description by Chris Smyth of a study of how Ashkenazi women, who are at high risk of carrying the BRCA gene, which in turn increases the risk of breast and ovarian cancer, were not being adequately screened for the gene. Systematically testing all this sub-population would be cost-effective. </p>
<p>All very good, but little to do with the failure in breast cancer screening apparently proclaimed by the headline. Even worse, the front-page trail for the story said "Women should be offered tests for gene mutations that raise their risk of breast cancer", which also managed to suggest that the issue applied to everyone instead of higher-risk Ashkenazi women. </p>
<p>I hope I'm not being pedantic, and I know it's the job of sub-editors to get people to read the story - it worked in my case - but this all seems a bit shabby.</p>
</div></div></div>
Tue, 02 Dec 2014 08:47:13 +0000 david

Is prostitution really worth £5.7 billion a year?
http://understandinguncertainty.org/prostitution-really-worth-%C2%A357-billion-year
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>The EU has demanded rapid payment of £1.7 billion from the UK because our economy has done better than predicted, and some of this is due to the prostitution market now being considered as part of our National Accounts and contributing an <a href="http://www.telegraph.co.uk/news/worldnews/europe/eu/11184605/Explainer-Why-must-Britain-pay-1.7bn-to-the-European-Union-and-can-we-stop-it-happening.html">extra £5.3 billion to GDP at 2009 prices</a>, which is 0.35% of GDP, half that of agriculture. But is this a reasonable estimate?</p>
<p>This £5.3 billion figure was assessed by the <a href="http://www.ons.gov.uk/ons/publications/re-reference-tables.html?edition=tcm%3A77-360136">Office for National Statistics in May 2014 </a>based on the following assumptions, derived from <a href="http://www.ons.gov.uk/ons/rel/naa1-rd/national-accounts-articles/inclusion-of-illegal-drugs-and-prostitution-in-the-uk-national-accounts/index.html">this analysis</a>. To quote the ONS:</p>
<ul><li>Number of prostitutes in UK: 61,000</li>
<li>Average cost per visit: £67</li>
<li>Clients per prostitute per week: 25</li>
<li>Number of weeks worked per year: 52</li>
</ul><p>Multiply these up and you get £5.3 billion at 2009 prices, around £5.7 billion now.</p>
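<p>For the record, the multiplication is:</p>

```python
# ONS assumptions as quoted above
prostitutes = 61_000
cost_per_visit = 67     # pounds
clients_per_week = 25
weeks_per_year = 52

total = prostitutes * cost_per_visit * clients_per_week * weeks_per_year
print(total)            # 5313100000, i.e. about 5.3 billion pounds at 2009 prices

turnover_per_worker = cost_per_visit * clients_per_week * weeks_per_year
print(turnover_per_worker)   # 87100: the 'nearly 100,000 a year' turnover noted below
```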
<p>This assessment has been severely questioned. Dr Brooke Magnanti, aka Belle de Jour, reckoned <a href="http://www.telegraph.co.uk/women/sex/10864898/Prostitution-adds-5bn-a-year-to-UK-economy.-Are-you-having-a-laugh.html">it might be ten times too high</a>. In contrast others have said <a href="http://blog.import.io/?author=52e28dc1e4b0377cec9ac9d0">it should be £9 billion as it ignores male prostitution</a>. Jolyon on <a href="http://www.taxrelief4escorts.co.uk/2014/06/01/does-prostitution-really-contribute-5-3bn-to-uk-gdp/">Tax Relief 4 Escorts</a>, who claims a maths degree from Cambridge, has done a detailed critique. He points out the flaws in the survey on which the 61,000 is based, and claims the assumed workload is too high and that the cost per visit (which the ONS based on <a href="http://www.punternet.com/index.php">PunterNet</a>) seems too low: it is somewhat ironic that the ONS use an information source that a previous minister, Harriet Harman, <a href="http://www.independent.co.uk/news/uk/home-news/punter-net-prostitutes-thank-harriet-harman-for-publicity-boost-1796759.html"> tried to shut down</a>.</p>
<p>My feeling is that the assumption that has the most problems is the workload. ONS are suggesting that the average person who works in prostitution has around 1,250 clients a year. This is based on Dutch experience, whereas the pattern of working in the UK is likely to be very different, with a complex industry comprising street-walkers, escorts, the informal market, those who work from fixed premises and 'independents' who advertise, for example, on <a href="http://www.adultwork.com">AdultWork</a>. Many are part-time. </p>
<p>As always, it's best to do a simple reality check. The ONS assumptions come to around 75,000,000 visits a year. Let's say 60,000,000 are from locals rather than foreign visitors, which is more than a million a week. There are around 20,000,000 men between 18 and 65 in the UK (taking an arbitrary upper limit), so this would mean that on average each of them buys sex three times a year. In fact the latest Natsal survey found that <a href="http://www.thelancet.com/journals/lancet/article/PIIS0140673613620358/table?tableid=tbl2&tableidtype=table_id&sectionType=red">around 4% of men between 18 and 65 reported paying for sex in the last 5 years</a> - that's about 800,000 men. If there were really more than a million visits a week, then the average man who paid for sex at any time in the last 5 years did so considerably more often than once a week. In fact the proportion who pay for sex each year will probably be less than 2%, which means that less than 400,000 men are taking up over a million visits each week - that's around once every 3 days for each of the 400,000. I am no expert on the behaviour of this subgroup, but this does seem rather high, to say the least: a <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2563847/">study of men who pay for sex in Scotland</a> found a mean of only 5 partners in a year.</p>
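<p>The reality check can be sketched in a few lines. The round numbers below are the ones used above; note that the raw product of the ONS assumptions comes out a little above the 'around 75,000,000' quoted, since some figures were revised after first posting:</p>

```python
# Reality check using the round numbers from the post
visits_per_year = 61_000 * 25 * 52     # raw ONS assumptions
print(visits_per_year)                 # 79300000: roughly the 75m quoted above

local_visits_per_week = 1_000_000      # 'more than a million a week' from locals
paying_men = 400_000                   # under 2% of ~20 million men aged 18-65
visits_per_man_per_week = local_visits_per_week / paying_men
print(visits_per_man_per_week)         # 2.5, i.e. around once every 3 days each
```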
<p>The assumptions also mean that the average person working in prostitution is turning over nearly £100,000 a year, which Jolyon from Tax Relief 4 Escorts says is completely implausible, and he should know.</p>
<p>Although this is a big statistical challenge, such an important contribution to the economy deserves a more robust analysis. When better figures come out I predict the UK will be due a substantial rebate. But that won't help David Cameron now.</p>
<p><em>27th October: Some figures have been revised since first posting, but the gist stays the same.</em></p>
</div></div></div>
Sat, 25 Oct 2014 16:20:40 +0000 david

Why 'life expectancy' is a misleading summary of survival
http://understandinguncertainty.org/why-life-expectancy-misleading-summary-survival
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>It's well-known how misleading it can be to use average (mean) as a summary measure of income: the distribution is very skew, and a few very rich people can hopelessly distort the mean. So median (the value halfway along the distribution) income is generally used, and this might fairly be described as the income of an <em>average person</em>, rather than the <em>average income</em>. </p>
<p>But, like everyone else dealing with actuarial statistics, I use life expectancy (the mean number of future years) to communicate someone's survival prospects. And yet, just as for income, it is also a poor measure due to the skewness of the distribution of survival.</p>
<p>This can be clearly shown by looking at the <a href="http://www.ons.gov.uk/ons/publications/re-reference-tables.html?edition=tcm%3A77-325699 ">life tables published by the Office for National Statistics (ONS) </a>: these have a convenient column labeled $d_x$, which is the probability density for survival, expressed as the expected number of deaths at each age out of 100,000 births, assuming the current mortality rates continue. The density plots for women and men are shown below, using the life tables for 2010-2012. The distributions have a small peak for babies dying in the first year of life, and then a long left-tail for early deaths, and then a sharp peak and a rapid fall up to age 100. The ‘compression’ of mortality is clear. </p>
<div class="captionCentre"><img src="/sites/understandinguncertainty.org/files/density-female.png" width="566" height="342" alt="density-female.png" /><p class="caption">Numbers of women expected to die at each age, out of 100,000 born, assuming mortality rates stay the same as 2010-2012. The expectation is 83, median 86, the most likely value (mode) is 90. </p>
</div>
<div class="captionCentre"><img src="/sites/understandinguncertainty.org/files/density-men.png" width="566" height="342" alt="density-men.png" /><br /><p class="caption">Numbers of men expected to die at each age, out of 100,000 born, assuming mortality rates stay the same as 2010-2012. The expectation is 79, median 82, the most likely value (mode) is 86.</p>
</div>
<p>Left-skewed distributions are rather unusual, but share the same issues as any skewed distribution - the mean, median and mode can be very different. For these survival distributions it is perhaps remarkable how far the mode is from the mean: for girls born now, even assuming there are no more increases in survival, their most likely age to die is 90, seven years more than the mean of 83. For little baby boys the mode is 86, again seven years more than the mean of 79. And even the median is 3 years more than the mean. That's why I now believe that 'life expectancy' is misleading.</p>
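<p>For anyone who wants to play with the real $d_x$ column, here is a minimal sketch of how the three summaries are computed. The distribution below is made up (a small left-skewed example, not the ONS data), but it shows the characteristic ordering, with the mean below the median below the mode:</p>

```python
def summarise(d):
    """Mean, median and mode of a {age: expected deaths} distribution."""
    total = sum(d.values())
    mean = sum(age * n for age, n in d.items()) / total
    cum = 0
    for age in sorted(d):           # median: first age where cumulative deaths pass half
        cum += d[age]
        if cum >= total / 2:
            median = age
            break
    mode = max(d, key=d.get)        # age with the largest expected number of deaths
    return mean, median, mode

# Made-up left-skewed death distribution out of 100,000 births (not ONS data)
d = {60: 5_000, 70: 10_000, 80: 25_000, 85: 28_000, 90: 32_000}
print(summarise(d))   # (82.6, 85, 90): mean 82.6, median 85, mode 90
```

The real ONS $d_x$ column, with one entry per year of age, can be dropped straight into <code>summarise</code>.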
<p>Of course these ‘period life tables’ unrealistically assume mortality will stay the same in the future, whereas life expectancy has been growing at around 3 months a year for decades, corresponding to the annual risk of death reducing at about 2% per year. The ONS also provide <a href="http://www.ons.gov.uk/ons/rel/lifetables/historic-and-projected-data-from-the-period-and-cohort-life-tables/2012-based/stb-2012-based.html ">‘cohort life tables’</a> that make various projections about whether these trends will continue in the future: the 'central projection' says girls born now have a life expectancy of 94, with (according to my rough calculations) a median and mode of around 100, and men have a life expectancy of 91, with a median and mode of around 96. Under the ‘high' projections, with the possibly implausible assumption that the increases continue at the same rate in the future, children born today will on average live more than 100 years. Good luck to them - heaven knows how long they will have to work for.</p>
</div></div></div>
Mon, 22 Sep 2014 19:14:52 +0000 david

Using expected frequencies when teaching probability
http://understandinguncertainty.org/using-expected-frequencies-when-teaching-probability
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>The July 2014 <a href="https://www.gov.uk/government/publications/national-curriculum-in-england-mathematics-programmes-of-study">Mathematics Programmes of Study: Key Stage 4 </a>(GCSE) specifies under <em>Probability</em> </p>
<blockquote><p><strong>{calculate and interpret conditional probabilities through representation using expected frequencies with two-way tables, tree diagrams and Venn diagrams}.<br /></strong></p></blockquote>
<p>- the brackets and bold type mean this comes under <em>additional mathematical content to be taught to more highly attaining pupils</em>. </p>
<p>The use of the term ‘expected frequencies’ is novel and not widely known in mathematics education. The basic idea is very simple: instead of saying “<em>the probability of X is 0.20 (or 20%)</em>”, we would say “<em>out of 100 situations like this, we would expect X to occur 20 times</em>”.</p>
<p>‘<em>Is that all?</em>’ I hear you cry. But this simple re-expression can have a deep impact. The idea is strongly based on research in risk communication, in particular the work of Gerd Gigerenzer and others who use the term ‘natural frequencies’. Extensive research (see selected references at the bottom) has shown this representation can prevent confusion and make probability calculations easier and more intuitive.</p>
<p>The first point is that it helps clarify what the probability means. When we hear the phrase ‘<em>the probability it will rain tomorrow is 30%</em>’, what do we mean? That it will rain 30% of the time? Over 30% of the area? In fact it means that out of 100 such computer forecasts, we can expect it to rain after 30 of them. By clearly stating what the ‘denominator’ is, ambiguity is avoided. It has been shown that by using expected frequencies, people find it easier to carry out non-intuitive conditional probability calculations.</p>
<p>Expected frequency is the standard format taught to medical students for risk communication, and is used extensively in public dialogue. Examples include the QRISK program and the current leaflets for breast cancer screening.</p>
<div class="captionCentre">
<img src="/sites/understandinguncertainty.org/files/qrisk.png" width="507" height="261" alt="qrisk.png" /><p class="caption">Output from the QRISK program using expected frequencies – the most widely used tool in general practice for assessing and communicating cardiovascular risk
</p>
</div>
<div class="captionCentre">
<img src="/sites/understandinguncertainty.org/files/breast-screening.png" width="334" height="402" alt="breast-screening.png" /><p class="caption">An image from the current breast screening information leaflet from the NHS Screening Programme, showing the use of expected frequencies to communicate the chances of different events subsequent to a mammogram
</p>
</div>
<p>In teaching probability, expected frequencies can be used in their own right, or as a tool for doing more complex probability calculations. Perhaps the ideal representation is using ‘icon arrays’, as in the QRISK example, but these cannot be drawn by students and are inappropriate for small probabilities. Therefore tree representations are appropriate, although as noted in the Programme of Study, two-way tables and Venn diagrams can also be used and will be illustrated below. They can be introduced gradually, possibly using the framework shown below, in which some sample questions and a few solutions are provided.</p>
<h3>1. Basic probability. </h3>
<p>This is essentially a one-level tree. Questions can involve going either from probabilities (expressed as decimals, fractions and percentages) to expected frequencies, or vice versa. The problems can be drawn as either expected frequency or probability trees, as shown for the following questions. The actual questions could be provided in different ways, for example with some entries in a tree provided and the student asked to complete the tree.</p>
<p><em>Going from probability to expected frequency<br /></em></p>
<ul><li>A balanced die has probability 1/6 of coming up ‘4’. Out of 60 throws, how many ‘4’s would we expect to come up? </li>
</ul><div class="captionCentre">
<img src="/sites/understandinguncertainty.org/files/dice.png" width="605" height="230" alt="dice.png" /><p class="caption">Probability and frequency trees for dice </p>
</div>
<p>.</p>
<ul><li>80% of the school students can roll their tongues. If I pick 1000 students at random, how many do you expect will NOT be able to roll their tongues? </li>
</ul><div class="captionCentre">
<img src="/sites/understandinguncertainty.org/files/tongue.png" width="605" height="204" alt="tongue.png" /><p class="caption">Equivalent probability and frequency trees for tongue-rolling</p>
</div>
<p>.</p>
<ul><li>There is a 0.02 probability of winning some prize with a National Lottery ticket. If I buy a ticket a week for a year, about how many winning tickets do I expect to get?</li>
<li>A doctor tells your uncle he has a 15% chance of a heart attack in the next 10 years. Out of 100 men like your uncle, how many would you expect to have a heart attack in the next 10 years?</li>
</ul><p><em>Going from expected frequencies to probabilities.<br /></em></p>
<p>In this case we need to make clear that a single case is representative of the group. </p>
<ul><li>In Dumpsville, in past years it has typically rained on 6 days in June (which has 30 days). Assuming the climate has not changed, if I plan to visit Dumpsville next June, what is the probability the day will be dry?</li>
<li>Experience has shown out of every 100 racing cyclists, 20 will have been doping. If I pick a cyclist at random, what is the probability that he will be ‘clean’ (not doping)?</li>
<li>In a typical school with 80 Year 10 students, 64 of them will have a profile on the social media site Face-ache. What is the probability that if we pick a Year 10 student at random, they will not have a profile?</li>
</ul><h3>2. Comparisons of probabilities.<br /></h3>
<p>This involves comparison of two different situations, and can be represented using a pair of trees. It is ideal for dealing with challenging and realistic questions concerning relative and absolute risks. </p>
<p><em>Probabilities to expected frequencies<br /></em></p>
<ul><li>If I buy a ticket in Super Lottery, there is a 1% chance of winning something, while a ticket in the Duper Lottery has a 3% chance of winning a prize. If I intend to buy 100 tickets, how many more times will I win if I buy Duper tickets rather than Super tickets?</li>
<li>A newspaper headline says that eating radishes doubles your chance of getting Smith’s Disease. 1% of people who don’t eat radishes get Smith’s Disease anyway.
<ul><li>Out of 200 people not eating radishes, how many would I expect to get Smith’s disease? </li>
<li>Out of 200 people eating radishes, how many would I expect to get Smith’s disease? </li>
<li>How many people have to eat radishes, in order to get one extra case of Smith’s disease?</li>
</ul></li>
</ul><div class="captionCentre">
<img src="/sites/understandinguncertainty.org/files/radishes.png" width="605" height="443" alt="radishes.png" /><p class="caption">Probability and expected frequency trees for people who eat and do not eat radishes
</p>
</div>
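<p>The radish questions work out as follows; the 1% baseline and the doubling are the hypothetical numbers from the question, not real data:</p>

```python
# Hypothetical numbers from the question, not real epidemiology
baseline = 0.01        # 1% of non-radish-eaters get Smith's Disease
relative_risk = 2      # radishes 'double your chance'
n = 200

cases_without = baseline * n                  # expected cases in 200 non-eaters
cases_with = baseline * relative_risk * n     # expected cases in 200 eaters
risk_difference = baseline * (relative_risk - 1)
eaters_per_extra_case = round(1 / risk_difference)
print(cases_without, cases_with, eaters_per_extra_case)   # 2.0 4.0 100
```

So a frightening-sounding 'doubled risk' amounts to one extra case per 100 radish-eaters: the frequency framing makes the absolute risk visible.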
<p><em>Expected frequencies to probabilities<br /></em></p>
<ul><li>Typically it rains on 6 days in June (30 days). I am told that in September there is double the chance of rain on any day. What is the chance that it will rain on a random day in September? </li>
</ul><h3>3. Conditional and marginal probabilities. </h3>
<p>This requires two-level trees, and can also bring in two-way tables and Venn diagrams. First, give the conditional probabilities, set up the expected frequency tree, then can calculate the marginal expected frequencies and convert back to probabilities if wanted.</p>
<ul><li>
A weather forecast is generally right. When it forecasts ‘rain’, 90% of the time it rains. When it forecasts ‘no rain’, 70% of the time it does not rain. In a typical September they forecast rain on two-thirds of days and no rain on one-third of days.
<ul><li> How many days would you expect it to rain each September? </li>
<li>What is the probability that a random day in September is not rainy? </li>
</ul></li>
</ul><div class="captionCentre">
<img src="/sites/understandinguncertainty.org/files/rain-tree.png" width="605" height="361" alt="rain-tree.png" /><p class="caption">Probability and expected frequency trees for forecasting rain
</p>
</div>
<p>From the expected frequency tree, we expect it to rain on a total of 18+3=21 days in September, and not rain on 9. So the probability that a random day in September is not rainy is 9/30 = 0.3.</p>
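<p>The expected frequency tree above can be reproduced in a few lines (a sketch, using the numbers in the question):</p>

```python
# Expected frequency tree for the rain forecast, using the numbers in the text.
days = 30
forecast_rain_days = 20                 # rain forecast on two-thirds of 30 days
forecast_dry_days = 10                  # 'no rain' forecast on one-third of days

rain_after_rain_forecast = 0.9 * forecast_rain_days   # forecast right 90% of the time
rain_after_dry_forecast = 0.3 * forecast_dry_days     # 'no rain' forecast wrong 30% of the time

rainy_days = rain_after_rain_forecast + rain_after_dry_forecast
p_dry = (days - rainy_days) / days
print(round(rainy_days), round(p_dry, 2))   # 21 0.3
```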
<p>To get this result directly from the probabilities is not straightforward. </p>
<p>We can also represent the expected frequencies as a two-way table or a Venn diagram. </p>
<p> <img src="/sites/understandinguncertainty.org/files/rain-table.png" width="538" height="181" alt="rain-table.png" /></p>
<p><img src="/sites/understandinguncertainty.org/files/rain-square.png" width="479" height="389" alt="rain-square.png" /></p>
<p><img src="/sites/understandinguncertainty.org/files/rain-venn.png" width="376" height="312" alt="rain-venn.png" /></p>
<ul><li> A fair coin is flipped to decide whether your cricket team is going to bat first or second – heads you bat first, tails you bat second. If you bat first, your team wins 80% of the time. If you bat second, you win 50% of the time.
<ul><li>Out of 100 games, how many do you bat first in?</li>
<li>Out of 100 games, how many do you bat first, and then win?</li>
<li>Out of 100 games, how many do you win?</li>
<li>Before you flip the coin, what is the probability of you winning the game?</li>
</ul></li>
<li>100 students are suspected of cheating in an exam. They are wired up to a lie detector that will go ping! if it thinks you are lying. The people who make the detector claim that, if you are lying, there is a 90% chance the machine will go ping!, and if you are genuinely not lying, there is a 10% chance the machine will get it wrong and go ping! Suppose 10 of the students have really been cheating. For how many students will the machine go ping!?
</li>
</ul><h3>4. Inverse probabilities. </h3>
<p>This is where things can get a bit tricky, but using expected frequency representations allows students to tackle some of the classic non-intuitive probability problems – essentially Bayes theorem. If they can do these, they have learnt a subtle and valuable skill. </p>
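<p>As a worked example, the lie-detector question from the previous section can be tackled the same way (a sketch, using the numbers given there):</p>

```python
# Expected frequencies for the lie-detector question: 100 students, 10 cheating;
# the machine pings for 90% of liars and (wrongly) for 10% of truth-tellers.
students, cheats = 100, 10
honest = students - cheats

pings_from_cheats = round(0.9 * cheats)    # 9 cheats make it ping
pings_from_honest = round(0.1 * honest)    # 9 honest students also make it ping
total_pings = pings_from_cheats + pings_from_honest

# Inverse question: given a ping, what is the chance the student was cheating?
p_cheat_given_ping = pings_from_cheats / total_pings
print(total_pings, p_cheat_given_ping)   # 18 0.5
```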
<ul><li>Weather forecasting: of the times it rains, what proportion did the forecast get it right?</li>
<li>
<ul><li>It rains 21 times, and in 18 of those the rain was forecast, so the proportion is 18/21 = 6/7: i.e. when it rains, there is a 6/7 chance that the rain was forecast. Try doing that without using expected frequencies!!! Alternatively this is straightforward to read off the two-way table.</li>
</ul></li>
<li>Cricket: of the times you win your match, what proportion did you bat first?</li>
<li>Lie detector question – what is the chance, if the machine goes ‘ping!’, that the suspect has been cheating?</li>
</ul><h3>5. Using frequencies when teaching probability.<br /></h3>
<p>This is outlined by Jenny Gage and myself in our <a href="http://nrich.maths.org/probability">NRich materials</a>, and in <a href="http://nrich.maths.org/content/id/9887/Gage,2012_ICME12.pdf">this paper</a>. The picture below shows part of the process of generating a two-way table by combining events represented by coloured bricks. From these empirical frequency distributions it is straightforward to go to expected frequency distributions, and hence to probabilities, using the process outlined above.</p>
<div class="captionCentre">
<img src="/sites/understandinguncertainty.org/files/cubes.jpg" width="314" height="234" alt="cubes.jpg" /><p class="caption">Results of experiments in which joint events are represented by pairs of coloured bricks</p>
</div>
<p>Using pairs of bricks to represent joint events: these can then be arranged as a two-way table, as above, or as a frequency tree.</p>
<p>Additional resources: </p>
<p>NRich materials<br /><a href="http://nrich.maths.org/probability">http://nrich.maths.org/probability</a></p>
<p>Jenny Gage paper at the 12th ICME<br /><a href="http://nrich.maths.org/content/id/9887/Gage,2012_ICME12.pdf">http://nrich.maths.org/content/id/9887/Gage,2012_ICME12.pdf</a></p>
<p>Angela Fagerlin, Brian J. Zikmund-Fisher and Peter A. Ubel, Helping Patients Decide: Ten Steps to Better Risk Communication<br /><a href="http://jnci.oxfordjournals.org/content/103/19/1436.full.pdf+html">http://jnci.oxfordjournals.org/content/103/19/1436.full.pdf+html</a></p>
<p>Kurz-Milcke, E., Gigerenzer, G., & Martignon, L. (2008). Transparency in risk communication: Graphical and analog tools. Annals of the New York Academy of Sciences, 1128, 18–28.<br /><a href="http://library.mpib-berlin.mpg.de/ft/ek/EK_Transparency_2008.pdf">http://library.mpib-berlin.mpg.de/ft/ek/EK_Transparency_2008.pdf</a></p>
<p>Gigerenzer, G., & Hoffrage, U. (1995). How to Improve Bayesian Reasoning Without Instruction: Frequency Formats. Psychological Review, 102(4), 684-704.<br /><a href="http://library.mpib-berlin.mpg.de/ft/gg/GG_How_1995.pdf">http://library.mpib-berlin.mpg.de/ft/gg/GG_How_1995.pdf</a></p>
<p>Gigerenzer, G., Gaissmaier, W., Kurz-Milcke, E., Schwartz, L. M., & Woloshin, S. (2007). Helping doctors and patients make sense of health statistics. Psychological science in the public interest, 8(2), 53-96.<br /><a href="http://www.psychologicalscience.org/journals/pspi/pspi_8_2_article.pdf">http://www.psychologicalscience.org/journals/pspi/pspi_8_2_article.pdf</a></p>
<p>Use of natural frequencies and frequency trees in modern health communication – breast cancer screening leaflets<br /><a href="http://www.cancerscreening.nhs.uk/breastscreen/publications/ia-02.html">http://www.cancerscreening.nhs.uk/breastscreen/publications/ia-02.html</a></p>
</div></div></div>Sat, 13 Sep 2014 10:55:34 +0000david7749 at http://understandinguncertainty.orghttp://understandinguncertainty.org/using-expected-frequencies-when-teaching-probability#commentsAnother tragic cluster - but how surprised should we be?
http://understandinguncertainty.org/another-tragic-cluster-how-surprised-should-we-be
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>Sadly another passenger plane crashed yesterday - the third in 8 days: the Air Algerie flight on July 24th, the TransAsia flight in Taiwan on July 23rd, and the Malaysia Airlines flight in Ukraine on July 17th. Does this mean that flying is becoming more dangerous and we should keep off planes? The following analysis may appear cold-hearted, but is not intended to diminish the impact of this tragic loss on the people and families involved.</p>
<p>The <a href="http://www.planecrashinfo.com">Plane Crash Info</a> website contains the summaries of these three accidents - this site makes powerful reading and is not for those with a fear of flying. Their <a href="http://www.planecrashinfo.com/cause.htm">Statistics</a> page is full of useful information, including a graph showing a clear decline in the rate of accidents over the last 40 years: the 9/11 events in 2001 do not even make a blip in the graph.</p>
<p>However, it shows that flying can still carry some danger. 91 commercial flights carrying 18 or more passengers have crashed in the previous 10 years (2004 to 2013), a rate of one every 40 days on average. So how surprising is it that 3 should happen in a space of 8 days?</p>
<p>A similar question was asked last November, when 6 cyclists were killed in London over 2 weeks, and Jody Aberdein and I <a href="http://onlinelibrary.wiley.com/doi/10.1111/j.1740-9713.2013.00715.x/abstract">wrote a paper</a> on this: the methods are explained <a href="http://understandinguncertainty.org/when-cluster-real-cluster">here</a>. We can apply the same ideas to the 'cluster' of plane crashes, although of course this analysis is rather simplistic and ignores the undoubted variation in risk when flying in different parts of the world.</p>
<p>Consider any window of 8 days. If planes crash in an entirely unpredictable way at a rate of 91 over 10 years (3650 days), then we would expect 8 * 91/3650 = 0.2 crashes in any particular 8-day window. So assuming a Poisson distribution, the chance of at least 3 crashes in an 8-day window is around 1 in 1000 - very small indeed. So it is very surprising that there would be 3 or more crashes between July 17th and July 25th 2014.</p>
<p>But this is not the right question to ask. We should be concerned with whether such a 'cluster' is surprising over some period, say 10 years. In 10 years there are 456 non-overlapping 'windows' of 8 days, and the chance that <em>at least one</em> of these contains at least 3 crashes = 1 - the chance that <em>none</em> of them has at least three crashes = 1 - 0.999^456 = 0.41 (using the unrounded probability). And the more complex 'scan-statistic' adjustment, which allows for sliding rather than non-overlapping windows, puts this chance up to 0.59.</p>
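<p>The two probabilities above can be checked directly (a sketch; the more complex 'scan-statistic' sliding-window adjustment is not reproduced here):</p>

```python
import math

# Crashes per 8-day window if 91 crashes occur 'at random' over 10 years.
rate = 8 * 91 / 3650                      # about 0.2 expected crashes per window

# Poisson probability of at least 3 crashes in one fixed 8-day window.
p_window = 1 - sum(math.exp(-rate) * rate**k / math.factorial(k) for k in range(3))

# Probability that at least one of the 456 non-overlapping windows in 10 years
# contains 3 or more crashes.
windows = 3650 // 8                       # 456
p_somewhere = 1 - (1 - p_window) ** windows
print(round(p_window, 4), round(p_somewhere, 2))   # 0.0011 0.41
```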
<p>So there is around a 6 in 10 chance that we should see such a large cluster over a 10-year period. In fact, as the graph below shows, the most likely maximum number of crashes of commercial planes with over 18 passengers in any 8-day window over 10 years is exactly ..... 3.<br /><img src="/sites/understandinguncertainty.org/files/plane-crash.png" width="673" height="498" alt="plane-crash.png" /></p>
<p>It is difficult to know how to interpret this - our emotions are rightly influenced by the awful nature of these events and the suffering they have caused. But personally, I hope it will make me no more nervous about flying than I am at the moment (and I have to admit I am not that keen to start with). </p>
<p>[Edit 10.02 July 25th: I had initially stated the adjusted probability stayed at 0.41: on checking the code I realised it changed to 0.59]</p>
</div></div></div>Fri, 25 Jul 2014 06:01:02 +0000david7679 at http://understandinguncertainty.orghttp://understandinguncertainty.org/another-tragic-cluster-how-surprised-should-we-be#commentsUsing metrics to assess research quality
http://understandinguncertainty.org/using-metrics-assess-research-quality
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>The Higher Education Funding Council for England (HEFCE) is carrying out an <a href="http://www.hefce.ac.uk/whatwedo/rsrch/howfundr/metrics/">independent review of the role of metrics in research assessment,</a> and are encouraging views. I have submitted a (very personal) response, using HEFCE's suggested headings, which is given below in a minimally-edited version.</p>
<p>++++++++++++++++++++++++++++++</p>
<p>You will be getting a lot of detailed reasoned arguments about this topic, so I thought I would provide a more personal perspective from someone who has done very well out of metrics.</p>
<h4><strong>Identifying useful metrics for research assessment:</strong><br /></h4>
<p>I am a statistician, and so I love metrics. I follow <a href="http://scholar.google.co.uk/citations?user=oz7MFu0AAAAJ&hl=en ">my Google Scholar profile</a> with interest. By any metric, I have been extremely successful. On Google Scholar I have 64000 citations, h-index of 85, and on Web of Science I have 32000 citations, h-index of 63. I follow Altmetrics, and have over 8000 followers on Twitter. All this has done me very well in my career - I have more letters after my name than you can shake a stick at.</p>
<p>Nevertheless I am strongly against the suggestion that peer–review can in any way be replaced by bibliometrics. </p>
<h4><strong>How should metrics be used in research assessment?</strong><br /></h4>
<p>My own experience shows some of the problems. My highest-cited paper clocks in at over 14,000 citations, and yet it has roughly 150 authors and to be honest I have forgotten what, if anything, I contributed. How would these citations be shared out? Or the WinBUGS user manuals for the software: around 5000 citations that do not even appear in WoS. Looking at my own record, I can see a correlation between metrics and the quality and importance of the work, but it is not large enough to use to replace judgement.</p>
<p>Clearly metrics should be collected and should be available to peers making judgements about the quality of research work. However they are only ‘indicators’, and not direct ‘measures’ of quality.</p>
<h4><strong>‘Gaming’ and strategic use of metrics</strong>:<br /></h4>
<p>I have done very well out of metrics, and although this is not because of deliberate gaming, I can see that my particular approach to research has paid off. I have tended to go for attractive and novel, even ‘sexy’ areas of statistics (believe it or not, such things do exist). I have got into a field early, not necessarily doing the best work, but reaping citation benefits later, mainly from people who have never read the original paper.</p>
<p>I have spent much of my career working on performance indicators in health and education, where it is finally being recognised that a past move towards apparently ‘simpler’ metrics was accompanied by massive gaming and distortions of practice. The Mid-Staffs scandal could be said to have directly arisen due to an obsession with a few indicators, at the cost of reduced attention to the whole system: fortunately judgements about hospitals have now moved away from a few targets and indicators to a more holistic system. </p>
<p>There has been a disastrous confusion between ‘indicators’ and ‘measures’, and it would be a retrograde step to see this being played out in research assessment. </p>
<h4><strong>Making comparisons:</strong><br /></h4>
<p>The difficulty with making comparisons is illustrated by the Google Scholar listing for researchers under <a href="http://scholar.google.co.uk/citations?view_op=search_authors&hl=en&mauthors=label:statistics">‘Statistics’. </a></p>
<p>I am currently lying 9th in the world, although I am fully aware that some people such as David Cox or Martin Bland do not feature. It is interesting to look at the top scorers – these include people who come from areas that I would not consider ‘statistics’, eg particle physics, or write tutorial articles for doctors, or have published in boundary areas such as machine learning. No doubt all these authors are excellent (although I am unsure about the individual who seems to have other people’s publications included under their own name), but this shows the problems of delineating a ‘subject’ in an automatic way. </p>
<p>To summarise, I feel that metrics should definitely be collected, but only used as additional evidence in a professional judgement as to the quality of research output.</p>
</div></div></div>Tue, 24 Jun 2014 10:23:25 +0000david7644 at http://understandinguncertainty.orghttp://understandinguncertainty.org/using-metrics-assess-research-quality#commentsNumbers and the common-sense bypass
http://understandinguncertainty.org/numbers-and-common-sense-bypass
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>Yesterday the <a href="http://www.thesundaytimes.co.uk/sto/news/uk_news/Science/article1422837.ece?CMP=OTH-gnws-standard-2014_06_14">Sunday Times [paywall]</a> covered a talk Anne Johnson and I had given at the Cheltenham Science Festival about the statistics of sex, and the article said </p>
<blockquote><p>more people are having sex in their teens, roughly 30% before the age of 16.</p></blockquote>
<p>Let’s leave aside whether this is an accurate statistic or not, and simply look at what happened when the Daily Mail lifted this material into <a href="http://www.dailymail.co.uk/femail/article-2658291/Amorous-Brits-make-love-2-500-times-MINUTE-includes-time-sleep.html?ITO=1490&ns_mchannel=rss&ns_campaign=1490&utm_source=twitterfeed&utm_medium=twitter">an article of its own</a>. They made a number of errors, but the cracker was when the statement by the Sunday Times got turned into the remarkable headline: </p>
<blockquote><p>30 per cent of total sexual encounters take place before 16.</p></blockquote>
<p> And just in case they change their website, here is the evidence (4th bullet point):</p>
<p><img src="/sites/understandinguncertainty.org/files/mail-sex-quote.jpg" width="520" height="500" alt="mail-sex-quote.jpg" /></p>
<p>A little reflection should show that the Mail’s statement is more than implausible. 30% of all sex occurring before 16? Just think about it. The Daily Mail clearly didn't.</p>
<p>For those that would like some evidence, the article reports my estimate, based on the <a href="http://www.natsal.ac.uk">National Survey of Sexual Attitudes and Lifestyles</a> (NATSAL), that male+female couples in Britain have sex around 900,000,000 times a year. So if 30% of this were in the under-16s, that would be about 300,000,000 times a year. There are about 1,500,000 14- and 15-year-olds, that’s 750,000 potential couples, so to get to this total they would all have to be having sex 400 times a year, which is more than once a day. No wonder they don’t have time for homework. Or maybe this number is just ridiculous.</p>
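<p>The arithmetic is easy to check (a sketch, using the rounded figures quoted above):</p>

```python
# Sanity check of the Mail's claim, using the rounded figures quoted above.
total_acts_per_year = 900_000_000          # estimated acts per year in Britain
under16_acts = 300_000_000                 # roughly 30% of the total

fourteen_fifteen_year_olds = 1_500_000
potential_couples = fourteen_fifteen_year_olds / 2    # 750,000

acts_per_couple_per_year = under16_acts / potential_couples
print(acts_per_couple_per_year)            # 400.0 - more than once a day
```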
<p>This could be just an enthusiastic sub-editor, such as the one who produced the wonderful headline below<br /><img src="/sites/understandinguncertainty.org/files/bikers-kent_0.jpg" width="600" height="400" alt="bikers-kent.jpg" /></p>
<p> - this was rapidly changed to the more reasonable “<a href="http://www.courier.co.uk/Bikers-involved-Kent-road-accidents/story-18653710-detail/story.html">Bikers involved in more than one third of serious Kent road accidents</a>“, but not before someone had grabbed the previous version.</p>
<p>But in the Mail’s case it was not just the sub-editor who wrote the headline – the journalist made the claim in the article. So how can such an idiotic statement appear in a national newspaper? </p>
<p>This is not intended to be a standard ‘aren’t the media hopelessly innumerate’ bash, fun though those are. I am genuinely interested in how intelligent people can write such statements without seeming to engage their common sense (which I generously assume they have). </p>
<p>Perhaps the first thing to note is that the two errors above are both of an identical logical nature – the so-called ‘transposed conditional’. Let's consider some pairs of statements, the first of which is reasonable, the second is in error:</p>
<ul><li>30% of people, when they were aged under 16, had sex</li>
<li>30% of sex happens with people aged under 16</li>
</ul><ul><li>One third of fatal accidents involved motorcyclists</li>
<li>One third of motorcyclists have fatal accidents</li>
</ul><ul><li>90% of women with breast cancer get a positive mammography</li>
<li>90% of women with a positive mammography have breast cancer</li>
</ul><p>(In fact, <a href="http://www.informedchoiceaboutcancerscreening.org/wp-content/uploads/2013/05/Breast-screening-leaflet_8August2013_Final_ready-for-print-version.pdf"> the current breast screening leaflets</a> point out that fortunately only around 25% of women with a positive mammography have breast cancer).</p>
<p>In more abstract terms, what happens is that the “proportion of A that are also B”, is reported as “the proportion of B that are also A”. This is also known as the <a href="https://en.wikipedia.org/wiki/Prosecutor's_fallacy">Prosecutor’s Fallacy </a>, as it is a mistake made in legal cases. It is extremely dangerous to mix up the statements</p>
<ul><li>The probability of the evidence, if the suspect is innocent, is 1 in 1,000,000</li>
<li>The probability of the suspect being innocent, given this evidence, is 1 in 1,000,000</li>
</ul><p>and yet this mistake has happened repeatedly, if implicitly. </p>
<p><em>[See additional comment at the bottom of this article, added June 18th]<br /></em></p>
<p>One argument is that it is simple innumeracy: the so-called ‘deficit model’ explanation, which could be counteracted by better education in the mechanics of mathematics. But I am sure that most of us who make these kinds of mistakes (and I am not excluding anyone here, including me) are functionally numerate, and could even have a stab at working out a 15% tip. </p>
<p>Another argument is that this is really not an issue with numeracy, but a simple error in logic. And yet it might be reasonable to assume that a journalist, or a judge, would not confuse the following two statements </p>
<ul><li>All dogs are furry mammals with 4 legs</li>
<li>All furry mammals with 4 legs are dogs.</li>
</ul><p>So maybe it is something in the middle: an inability to combine reason with numbers, some kind of paralysis that comes when confronted with numerical arguments that means that ordinary common sense is bypassed. I see this when tutoring young people for GCSE maths: intelligent kids who when asked to do some maths couched as a pseudo-real-world problem, (“<em>Fred travels at 50 mph for 30 minutes, how far does he go?</em>”) go into a mental panic, start using formulae at random, and come up, like Baldrick doing mental arithmetic, with some absurd answer (“<em>1500 miles</em>”). And yet if I asked them the same problem in the real-real-world, and did not say it was maths, they would be able to get the answer by using some basic reasoning (“<em>25 miles</em>”). The kind of maths teaching promoted by Tim Gowers on <a href="http://gowers.wordpress.com/2012/06/08/how-should-mathematics-be-taught-to-non-mathematicians/">how maths should be taught to non-mathematicians</a> seeks to avoid this ‘<em>find the formula and plug in the numbers</em>’ style, and with luck the <a href="https://www.gov.uk/government/publications/16-to-18-core-maths-qualifications">Core Maths curriculum</a> will feature more reasoning with practical situations.</p>
<p>One consequence of this inability to take a sensible critical attitude to numbers is that opinions are pushed to the extremes: numbers are to be either accepted, and even fetishised, as some sort of God-given truth, or rejected out of hand as ‘just statistics’. Possibly in the same breath. Just listen to the Today programme or Question Time.</p>
<p>Of course there are other areas in which common sense is bypassed - when we may be only too willing to suspend our normal powers of criticism and warmly embrace delusion. These include claims for alternative therapies, arguments by populist politicians, optimistic prognoses for desperately ill loved-ones, or bigging up England’s performance in the World Cup. Sadly, in all these cases some realism may be more appropriate.</p>
<h3>Additional comment added June 18th<br /></h3>
<p>An equivalent way to view this error is in terms of the 'wrong denominator': is the 30% a proportion of people, or of all sexual activity? Gerd Gigerenzer emphasises that these mistakes are due to not being clear about the 'reference class', i.e. 30% of what?. Ambiguity can be avoided by always making the class clear by saying, for example, "Out of every 100 people reaching 16, 30 have already had sex".</p>
</div></div></div>Mon, 16 Jun 2014 07:18:43 +0000david7621 at http://understandinguncertainty.orghttp://understandinguncertainty.org/numbers-and-common-sense-bypass#commentsA heuristic for sorting science stories in the news
http://understandinguncertainty.org/heuristic-sorting-science-stories-news-0
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p> Dominic Lawson's <a href="http://www.thesundaytimes.co.uk/sto/comment/columns/dominiclawson/article1417214.ece ">article in the Sunday Times today</a>[paywall] quotes me as having the rather cynical heuristic: "<em>the very fact that a piece of health research appears in the papers indicates that it is nonsense</em>." I stand by this, but after a bit more consideration I would like to suggest a slightly more refined version for dealing with science stories in the news, particularly medical ones.</p>
<blockquote><p>"Ask yourself: if the study had come up with a negative result, would I be hearing about it? If NO, then don't bother to read or listen to the story"
</p></blockquote>
<p>The immediate impulse behind Lawson's article was a spate of studies claiming associations between ordinary daily habits and future bad outcomes: <a href="http://www.telegraph.co.uk/health/healthnews/10862512/Three-slices-of-white-bread-a-day-linked-to-obesity.html">eating a lot of white bread with becoming obese</a>, <a href="http://www.bbc.co.uk/news/health-27603587">being cynical with getting dementia</a>, <a href="http://www.bbc.co.uk/news/health-27617615">light bedrooms with obesity (again)</a>. All these stories associate mundane exposures with later developing dread outcomes,<em> i.e.</em> the classic '<a href="http://www.dailymail.co.uk/health/article-2019170/Can-cat-cancer-Parasite-bellies-linked-brain-tumours.html">cats cause cancer</a>' type. My argument is that, since we would not be reading about a study in which these associations had <em>not</em> been found, we should take no notice of these claims.</p>
<p>Why my cynicism? There has been a lot of public discussion of potential biases in the published scientific literature – see for example, commentaries in the <a href="http://www.economist.com/news/briefing/21588057-scientists-think-science-self-correcting-alarming-degree-it-not-trouble ">Economist</a> and <a href="http://www.forbes.com/sites/henrymiller/2014/01/08/the-trouble-with-scientific-research-today-a-lot-thats-published-is-junk/ ">Forbes magazine</a>. The general idea is that by the time research has been selected to be submitted, and then selected for publication, there is a good chance the results are false positives: for a good review of the evidence for this see <em><a href="http://simplystatistics.org/2013/12/16/a-summary-of-the-evidence-that-most-published-research-is-false/ ">‘A summary of the evidence that most published research is false’</a></em>. There is also an excellent <a href="http://deevybee.blogspot.co.uk/2014/01/why-does-so-much-research-go-unpublished.html ">blog by Dorothy Bishop</a> on why so much research goes unpublished.</p>
<p>The point of this blog is to argue that such selection bias is as nothing compared to the hurdles overcome by stories that are not only published, but <em>publicised</em>. For a study to be publicised, it must have</p>
<ul><li>Been considered worthwhile to write up and submit to a journal or other outlet</li>
<li>Been accepted for publication by the referees and editors</li>
<li>Been considered ‘newsworthy’ enough to deserve a press release</li>
<li>Been sexy enough to attract a journalist’s interest</li>
<li>Got past an editor of a newspaper or newsroom.</li>
</ul>
<p>Anything that gets through all these hurdles stands a huge chance of being a freak finding. In fact, if the coverage is on the radio, I recommend sticking your fingers in your ears and loudly saying ‘la-la-la’ to yourself.</p>
<p>The crucial idea is that since there is an unknown amount of evidence that I am not hearing about and that would contradict this story, there is no point in paying attention to whatever it is claiming. It is like watching a video of a football team scoring goals, and then suddenly realising that you are only being shown the 'successes' and not the ones they let in: the evidence just shows that they are capable of scoring, but not whether they score more than they concede. So, if you're interested in assessing the quality of the team, stop watching the video [of course if you just enjoy the spectacle, carry on].</p>
<p>The heuristic is even more appropriate when you hear or read of any survey by any organisation, particularly charities.</p>
<p>This all may seem rather cynical, and keep in mind that I am a grumpy old git (although now trying to avoid cynicism, as I have no wish to become demented). But just think of the time you can save.</p>
<p>[Added 2nd June 2014: I should have made clear that I am only talking about <em>single</em> studies: proper reviews of the totality of evidence should be listened to. So this is not an excuse to ignore evidence connecting smoking and lung cancer.]</p>
<p>PS A recent study argues that <a href="http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0085355 ">newspapers preferentially cover medical research with weaker methodology</a>. However I must apply my own heuristic to this: would I have heard about it if the researchers had found out the opposite? And you should ask yourself, would I be telling you about it?</p>
<p>PPS I have been struggling to find a suitable name for this heuristic, perhaps with some literary or classical allusion to someone who was misled by only being told selected items of information. Perhaps the ‘Siddhartha’ heuristic? Siddhārtha Gautama was a prince who was only told good news, and protected from seeing suffering and death. But he finally realised that he was not seeing the world as it really was, and so he left his palace to first take on the life as a wandering ascetic, and eventually to become the Buddha. </p>
</div></div></div>Sun, 01 Jun 2014 11:23:42 +0000david7593 at http://understandinguncertainty.orghttp://understandinguncertainty.org/heuristic-sorting-science-stories-news-0#commentsIt's cherry-picking time: more poorly reported science being peddled to journalists
http://understandinguncertainty.org/its-cherry-picking-time-more-poorly-reported-science-being-peddled-journalists
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>Yesterday the <a href="http://www.dailymail.co.uk/health/article-2582774/TV-computer-games-wreck-family-life-leave-child-obese-study-warns.html">Daily Mail</a> trumpeted “<em>For every hour of screen time, the risk of family life being disrupted and children having poorer emotional wellbeing may be doubled</em>”, while the Daily Telegraph said that <em>"for every hour each day a child spent in front of a screen, the chance of becoming depressed, anxious or being bullied rose by up to 100 per cent”</em>. These dramatic conclusions come from a <a href="http://archpedi.jamanetwork.com/article.aspx?articleid=1844044">study whose abstract states</a> – </p>
<blockquote><p>Across associations, the likelihood of adverse outcomes in children ranged from a 1.2- to 2.0-fold increase for emotional problems and poorer family functioning for each additional hour of television viewing or e-game/computer use depending on the outcome examined.</p></blockquote>
<p>Unfortunately this is, pure and simply, wrong. And the articles in the papers are misleading too - although note the use of ‘<em>may be doubled</em>’ from the Mail, and ‘<em>up to 100 per cent</em>’ from the Telegraph, which appears to allow them to cherry-pick their evidence as much as they want. So, leaving aside design and analysis issues such as needlessly breaking outcome scales into ‘high’ and ‘low’, what is wrong with the reporting in the paper? </p>
<p>Table 3 of the paper is reproduced below. It shows 96 estimates with 95% confidence intervals. The authors focus on reporting the results that are ‘significantly high’, that is whose 95% intervals lie above 1, of which there are 11, shown in bold and scattered rather haphazardly across the Table.</p>
<p><img src="/sites/understandinguncertainty.org/files/screentime.jpg" width="875" height="792" alt="screentime.jpg" /></p>
<p>But there also appear to be 2 ‘significantly low’ odds ratios, whose 95% intervals lie below 1, and 83 odds ratios which are not significantly different from 1. In fact the odds ratios cluster around 1: apart from one odds ratio of 2 (which is very imprecise, with an interval from 1 to 4), they all lie between 0.7 and 1.3. </p>
<p>Out of 96 such 95% intervals, we would expect around 5 to exclude 1 by chance alone, even if there were no effect. In fact there were 13, suggesting the possibility of a small overall effect, but nowhere near the ‘doubling’ claimed. One would also have to believe that all the many confounding factors that would simultaneously influence TV watching and later behaviour had been fully accounted for. Which is extremely doubtful. Maybe watching lots of TV when young does contribute to later problems – it seems quite plausible – but this study does not show it.</p>
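<p>The arithmetic behind that expectation is easy to check. The sketch below (my own minimal illustration, not part of the original study) computes the expected number of 'significant' intervals among 96 tests at the 5% level, and the binomial probability of seeing 13 or more by chance alone; treating the 96 comparisons as independent is an assumption.</p>

```python
from math import comb

n, alpha = 96, 0.05  # 96 confidence intervals, each at the 95% level

# Expected number of intervals excluding 1 under the null hypothesis
expected = n * alpha  # 4.8, i.e. "around 5"

# P(13 or more 'significant' results by chance), assuming independence
tail = sum(comb(n, k) * alpha**k * (1 - alpha)**(n - k)
           for k in range(13, n + 1))

print(f"expected 'significant' intervals by chance: {expected:.1f}")
print(f"P(13 or more by chance): {tail:.5f}")
```

<p>A small tail probability is consistent with a modest overall effect, but says nothing about a 'doubling'.</p>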
<p>The crucial insight is that the estimated odds ratios for adverse outcomes range between 0.7 and 2, and not 1.2 and 2 as claimed by the authors in their abstract. Focusing on only the ‘significant’ positive results is, either deliberately or through ignorance, very poor and deeply misleading science. It also shows dismal refereeing. In the text the authors acknowledge that "Few associations were evident", but this does not make it through to the abstract.</p>
<p>Journalists would not have noticed this paper unless it had been press-released by the academic journal. So when you read about some poor science, don’t jump to blame the journalists: it could well be because of the efforts of some scientists, institutions and journals to promote coverage of their activities, regardless of their true quality and importance. Sadly, this behaviour harms scientific credibility.</p>
<p>Postscript<br />
There is also a potential technical problem in that the authors interpret an odds ratio of 2 as doubling the ‘likelihood’. But if an outcome measure has a base-rate of around 50%, or odds of 1:1, an odds ratio of 2 multiplies the odds up to 2:1, or 66%. So the risk goes up from 50% to 66%, but is not doubled. In fact in the case of the 'emotional problems' scale the baseline risk is around 11%, and so an odds ratio of 2 does roughly double the risk.</p>
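<p>The odds-ratio arithmetic can be written out explicitly. This is a minimal sketch of the standard conversion, using the 50% and 11% baselines from the text:</p>

```python
def risk_after_odds_ratio(baseline_risk, odds_ratio):
    """Convert a baseline risk and an odds ratio into the new risk."""
    odds = baseline_risk / (1 - baseline_risk)   # e.g. 50% risk -> odds of 1:1
    new_odds = odds * odds_ratio                 # an odds ratio of 2 doubles the odds
    return new_odds / (1 + new_odds)

# Base rate of 50%: an odds ratio of 2 raises the risk to 66.7%, not 100%
print(risk_after_odds_ratio(0.50, 2))  # 0.666...

# Base rate of 11% (the 'emotional problems' scale): the risk roughly doubles
print(risk_after_odds_ratio(0.11, 2))  # about 0.198, a risk ratio of ~1.8
```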
</div></div></div>Tue, 18 Mar 2014 21:29:05 +0000david7508 at http://understandinguncertainty.orghttp://understandinguncertainty.org/its-cherry-picking-time-more-poorly-reported-science-being-peddled-journalists#commentsMore deaths due to climate change? Or maybe not.
http://understandinguncertainty.org/more-deaths-due-climate-change-or-maybe-not
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>Coverage of a <a href="http://jech.bmj.com/content/early/2014/01/08/jech-2013-202449.short?g=w_jech_ahead_tab">paper</a> just published by Journal of Epidemiology and Community Health included dramatic headlines such as the Guardian's <a href="http://www.theguardian.com/environment/2014/feb/04/heat-related-deaths-climate-change">Heat-related deaths in the UK will rise 257% by 2050 because of climate change</a>. But a closer look at the numbers in the paper paints a rather different picture.</p>
<p>Figure 4 of the paper shows the number of deaths expected per 100,000 people in each category, and how the authors estimate this will change into the 2080s. </p>
<p><img src="/sites/understandinguncertainty.org/files/climate-temp-deaths_0.png" width="660" height="264" alt="climate-temp-deaths_0.png" /></p>
<p>But the vertical axes for the two plots are different, and they should perhaps have been drawn like this.</p>
<p><img src="/sites/understandinguncertainty.org/files/climate-temp-R.jpeg" width="673" height="498" alt="climate-temp-R.jpeg" /></p>
<p>Or even added in a 'combined plot' [added 6th February 2014]</p>
<p><img src="/sites/understandinguncertainty.org/files/climate-deaths.jpeg" width="673" height="498" alt="climate-deaths.jpeg" /></p>
<p>This clearly reveals that, in terms of rate per 100,000, the decline in cold-related death rate easily outweighs the increase in the heat-related death rate. So overall, for any individual in the UK, the risk of a temperature-related death is expected to fall steadily due to climate change. Bring it on! </p>
<p>But since there are going to be more old people in the future, the absolute number of deaths is going to increase - and this number was emphasised by the authors and got the headlines.</p>
<p>The abstract of the paper includes the phrase <em>"The increased number of future temperature-related deaths was partly driven by projected population growth and ageing."</em> According to the projections in the paper, if the population make-up did not change, the overall mortality risk would go down. So it would have been more accurate to say <em>"The increased number of future temperature-related deaths was <strong>wholly</strong> driven by projected population growth and ageing."</em></p>
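<p>The distinction between a falling rate and a rising count is easy to illustrate with made-up numbers (these are purely hypothetical, not the paper's projections):</p>

```python
# Hypothetical illustration: the per-person risk falls, yet total deaths rise,
# because the at-risk (elderly) population grows faster than the risk declines.
rate_now, pop_now = 60.0, 10_000_000        # deaths per 100,000; people at risk
rate_future, pop_future = 50.0, 15_000_000  # rate down ~17%, population up 50%

deaths_now = rate_now / 100_000 * pop_now            # 6,000 deaths
deaths_future = rate_future / 100_000 * pop_future   # 7,500 deaths

print(f"risk per person: {rate_now} -> {rate_future} per 100,000 (down)")
print(f"absolute deaths: {deaths_now:.0f} -> {deaths_future:.0f} (up 25%)")
```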
<p>But that is clearly not the message that the authors wanted to convey. It is unfortunate that this kind of presentation gives ammunition to those who say that the effects of climate change are being exaggerated.</p>
</div></div></div>Tue, 04 Feb 2014 11:08:52 +0000david7444 at http://understandinguncertainty.orghttp://understandinguncertainty.org/more-deaths-due-climate-change-or-maybe-not#commentsHow surprising was the cluster of cycle deaths in London?
http://understandinguncertainty.org/how-surprising-was-cluster-cycle-deaths-london
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p><a href="http://www.bbc.co.uk/programmes/b03mjcbk">More or Less </a>recently featured Jody Aberdein talking about the cluster of 6 cycle deaths in London over a 2 week period. </p>
<p>The paper with the details of the analysis can, for a while, be freely obtained from <a href="http://onlinelibrary.wiley.com/doi/10.1111/j.1740-9713.2013.00715.pdf">Significance magazine</a>.</p>
<p>Details of the statistical methods are given <a href="http://understandinguncertainty.org/when-cluster-real-cluster">here</a> - these are necessarily quite complex due to the need to allow for all possible 2 week periods. </p>
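<p>The flavour of the calculation can be conveyed by a simple Monte Carlo sketch. This is not the analysis in the Significance paper - the annual rate below is purely illustrative - but it shows why scanning over all possible 2-week windows matters: the chance of 6 deaths in <em>some</em> window during a year is far higher than the chance for a single window fixed in advance.</p>

```python
import random

def max_14_day_count(event_days, horizon=365, window=14):
    """Largest number of events in any sliding 14-day window."""
    return max(sum(start <= d < start + window for d in event_days)
               for start in range(horizon - window + 1))

def cluster_probability(rate_per_year=14.0, n_sims=2000, seed=1):
    """Estimate P(some 14-day window in a year contains >= 6 events)."""
    rng = random.Random(seed)
    p_day = rate_per_year / 365  # rare-event approximation: at most 1 event/day
    hits = 0
    for _ in range(n_sims):
        days = [d for d in range(365) if rng.random() < p_day]
        if max_14_day_count(days) >= 6:
            hits += 1
    return hits / n_sims

print(cluster_probability())
```

<p>Even this toy version has to track every overlapping window, which is why the full analysis, with a properly estimated rate, is necessarily more complex.</p>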
</div></div></div>Sat, 04 Jan 2014 06:43:41 +0000david7382 at http://understandinguncertainty.orghttp://understandinguncertainty.org/how-surprising-was-cluster-cycle-deaths-london#commentsPISA statistical methods - more detailed comments
http://understandinguncertainty.org/pisa-statistical-methods-more-detailed-comments
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>In the Radio 4 documentary <em><a href="http://www.bbc.co.uk/iplayer/episode/b03j9mx2/PISA_Global_Education_Tables_Tested/ ">PISA - Global Education Tables Tested</a></em>, broadcast on November 25th, a comment is made that the statistical issues are a bit complex to go into. Here is a brief summary of my personal concerns: to get an idea of the feelings about PISA statistical methods, see for example an <a href="http://www.tes.co.uk/article.aspx?storycode=6344672">article</a> in the Times Educational Supplement, and the <a href="http://www.tes.co.uk/article.aspx?storycode=6345213">response</a> by OECD. </p>
<p>The PISA methodology is complex and rather opaque, in spite of the substantial amount of material published in the technical reports. Briefly:</p>
<ol><li>Individual students only answer a minority of questions.
</li>
<li>Multiple ‘plausible values’ are then generated for all students assuming a particular statistical model, essentially estimating what might have happened if the student had answered all the questions.
</li>
<li>These ‘plausible values’ are then treated as if they are the results of complete surveys, and form the basis of national scores (and their uncertainties) and hence rankings in league tables.
</li>
<li>But the statistical model used to generate the ‘plausible scores’ is demonstrably inadequate – it does not fit the observed data.
</li>
<li>This means the variability in the plausible scores is underestimated, which in turn means the uncertainty in the national scores is underestimated, and hence the rankings are even less reliable than claimed.
</li>
</ol><p>Here's a little more detail on these steps.</p>
<h3>1. Individual students only answer a minority of questions.<br /></h3>
<p>Svend Kreiner has <a href="http://www.tes.co.uk/article.aspx?storycode=6344672">calculated</a> that in 2006, about half of the participating students did not answer any reading questions at all, while <em>"another 40 per cent of participating students were tested on just 14 of the 28 reading questions used in the assessment. So only approximately 10 per cent of the students who took part in Pisa were tested on all 28 reading questions."</em></p>
<h3>2. Multiple ‘plausible values’ are then generated for all students assuming a particular statistical model<br /></h3>
<p>A simple Rasch model (<a href="http://www.oecd.org/edu/school/programmeforinternationalstudentassessmentpisa/pisa2009technicalreport.htm ">PISA Technical Report </a>, Chapter 9) is assumed, and five values for each student are generated at random from the 'posterior' distribution given the information available on that student. So for the half of students in 2006 who did not answer any reading questions, five 'plausible' reading scores are generated on the basis of their responses on other subjects.</p>
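<p>To make this step concrete, here is a toy sketch of the kind of generation involved (my own minimal illustration, not PISA's actual code - the real procedure also conditions on responses in other subjects and on background variables, and uses survey weights). Under a simple Rasch model the chance of a correct answer depends only on the student's ability and the question's difficulty, and 'plausible values' are random draws of ability from the posterior given whatever responses were observed:</p>

```python
import math, random

def p_correct(ability, difficulty):
    """Rasch model: probability of a correct answer."""
    return 1 / (1 + math.exp(-(ability - difficulty)))

def plausible_values(responses, difficulties, n_draws=5, seed=0):
    """Draw abilities from a grid posterior given observed 0/1 responses.

    A standard-normal prior on ability stands in for the population model;
    a student with no responses simply gets draws from the prior.
    """
    rng = random.Random(seed)
    grid = [x / 10 for x in range(-40, 41)]  # ability grid from -4 to 4
    weights = []
    for theta in grid:
        w = math.exp(-theta**2 / 2)  # prior weight
        for resp, diff in zip(responses, difficulties):
            p = p_correct(theta, diff)
            w *= p if resp else 1 - p  # likelihood of observed answers
        weights.append(w)
    return [rng.choices(grid, weights)[0] for _ in range(n_draws)]

# Student answered 3 of 4 questions correctly
print(plausible_values([1, 1, 1, 0], [-1.0, 0.0, 0.5, 1.0]))
# Student answered no questions at all: draws come from the prior alone
print(plausible_values([], []))
```

<p>The point of the sketch is that the generated values are only as 'plausible' as the model that produced them - which is exactly the issue in steps 4 and 5.</p>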
<h3>3. These ‘plausible values’ are then treated as if they are the results of surveys with complete data on all students<br /></h3>
<p>The Technical Report is not clear about how the final country scores are derived, but the <a href="http://www.oecd.org/pisa/pisaproducts/pisadataanalysismanualspssandsassecondedition.htm ">Data Analysis manual</a> makes clear that these are based on the five plausible values generated for each student: they then use standard methods to inflate the sampling error to allow for the use of 'imputed' data.</p>
<blockquote><p>“Secondly, PISA uses imputation methods, denoted plausible values, for reporting student performance. From a theoretical point of view, any analysis that involves student performance estimates should be analysed five times and results should be aggregated to obtain: (i) the final estimate; and (ii) the imputation error that will be combined with the sampling error in order to reflect the test unreliability on the standard error.</p>
<p>All results published in the OECD initial and thematic reports have been computed accordingly to these methodologies, which means that the reporting of a country mean estimate and its respective standard error requires the computation of 405 means as described in detail in the next sections.”</p></blockquote>
<p>There does seem to be some confusion in the PISA team about this - in my interview with Andreas Schleicher, I explicitly asked whether the country scores were based on the 'plausible values', and he appeared to deny that this was the case.</p>
<h3>4. The statistical model used to generate the ‘plausible scores’ is demonstrably inadequate.<br /></h3>
<p>Analysis using imputed ('plausible') data is not inherently unsound, provided (as PISA do) the extra sampling error is taken into account. But the vital issue is that the adjustment for imputation is only valid if the model used to generate the plausible values can be considered 'true', in the sense that the generated values are reasonably 'plausible' assessments of what that student would have scored had they answered the questions. </p>
<p>A simple Rasch model is assumed by PISA, in which questions are assumed to have a common level of difficulty across all countries - questions with clear differences are weeded out as “dodgy”. But in a <a href="http://link.springer.com/article/10.1007%2Fs11336-013-9347-z">paper in Psychometrika</a>, Kreiner has shown the existence of substantial “Differential Item Functioning” (DIF) - i.e. questions have different difficulty in different countries - and concludes that <em>“The evidence against the Rasch model is overwhelming.”</em></p>
<p>The existence of DIF is acknowledged by <a href="http://www.oecd.org/pisa/47681954.pdf">Adams</a> (who heads the OECD analysis team), who says <em>“The sample sizes in PISA are such that the fit of any scaling model, particularly a simple model like the Rasch model, will be rejected. PISA has taken the view that it is unreasonable to adopt a slavish devotion to tests of statistical significance concerning fit to a scaling model.”</em> Kreiner disagrees, and argues that the effects are both statistically significant and practically important.</p>
<h3>5. This means the variability in the plausible scores is underestimated<br /></h3>
<p>The crucial issue, in my view, is that since these 'plausible values' are generated from an over-simplified model, they will not represent plausible values as if the student really had answered all the questions. <a href="http://link.springer.com/article/10.1007/s11336-013-9347-z ">Kreiner</a> says <em>“The effect of using plausible values generated by a flawed model is unknown”.</em></p>
<p><em>[The next para was in the original blog, but I have revised my opinion since - see note below]</em> I would be more confident than this, and would expect that the 'plausible values' will be ‘under-dispersed’, i.e. not show a reasonable variability. Hence the uncertainty about all the derived statistics, such as mean country scores, will be under-estimated, although the extent of this under-estimation is unknown. It is notable that PISA acknowledge the uncertainty about their rankings (although this is not very prominent in their main <a href="http://www.oecd.org/pisa/46643496.pdf">communications</a>), but the extra variability due to the use of potentially-inappropriate plausible values will inevitably mean that the rankings would be even less reliable than claimed. That is the reason for my scepticism about PISA's detailed rankings.</p>
<h3>Note added 30th November:<br /></h3>
<p>I acknowledge that plausible values derived from an incorrect model should, if analysed assuming that model, lead to exactly the same conclusions as if they had not been generated in the first place (and, say, a standard maximum likelihood analysis carried out). Which could make one ask - why generate plausible values in the first place? But in this case it is convenient for PISA to have ‘complete response’ data to apply their complex survey weighting schemes for their final analyses. </p>
<p>But this is the issue: it is unclear what effect generating a substantial amount of imputed data from a simplistic model will have, when those imputed data are then fed through additional analyses. So after more reflection I am not so confident that the PISA methods lead to an under-estimate of the uncertainty associated with the country scores: instead I agree with Svend Kreiner’s view that it is not possible to predict the effect of basing subsequent detailed analysis on plausible values from a flawed model.</p>
</div></div></div>Mon, 25 Nov 2013 17:30:05 +0000david7301 at http://understandinguncertainty.orghttp://understandinguncertainty.org/pisa-statistical-methods-more-detailed-comments#commentsComplaint about the Press Complaints Commission
http://understandinguncertainty.org/complaint-about-press-complaints-commission
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>What a strange organisation the Press Complaints Commission (PCC) is. They say that a press article is inaccurate, but consider it reasonable that the inaccurate headline remains uncorrected.</p>
<h3>Brief Timeline<br /></h3>
<ul><li>12th July 2013. <a href="http://www.nhs.uk/NHSEngland/bruce-keogh-review/Pages/published-reports.aspx">Keogh report on 14 hospitals</a> is due out. Professor Sir Brian Jarman provides data and <a href="https://www.dropbox.com/s/9o4caplpvcp1e6r/My%20email%20to%20Laura%20Donnelly%20%28data%20attachments%20removed%29.doc">briefs journalists</a> on above-average deaths in hospitals being investigated. He emphasizes that such deaths cannot be interpreted as ‘avoidable’.
</li>
<li>13th July. The <a href="http://www.telegraph.co.uk/health/heal-our-hospitals/10178296/13000-died-needlessly-at-14-worst-NHS-trusts.html">Sunday Telegraph leads with headline</a> ‘13,000 died needlessly at 14 worst NHS trusts’, in conflict with both Jarman’s advice, and what is said in the article itself.
</li>
<li>16th July 2013. Keogh review published, and explicitly states that <em>“It is clinically meaningless and academically reckless to use such statistical measures to quantify actual numbers of avoidable deaths.”</em> Numerous criticisms of Telegraph coverage follow, including an article by me in the <a href="http://press.psprings.co.uk/bmj/august/needless.pdf">British Medical Journal</a>. A number of complaints are made to the PCC.
</li>
<li>1st November 2013. PCC finally announces that <em>“By attributing the number of “needless” deaths to a calculation made by Sir Brian Jarman, the newspaper had failed to take care not to publish inaccurate information in breach of Clause 1 (i). As such, a correction – published promptly and with due prominence – was required in accordance with the terms of Clause 1 (ii).”</em> This does not appear publicly. The Telegraph publishes a ‘clarification’, but the headline remains.
</li>
<li>4th November 2013. I complain to PCC that the misleading Telegraph headline remains on their article, but am told that <em>“Taken in context with the article as a whole, and in light of the additional footnote, the Commission did not consider that a significantly misleading impression of the investigation’s findings had been created by the headline.”</em> I find it very difficult to understand how they can come to this bizarre and illogical conclusion.
</li>
</ul><p>Another complainant has taken this to the independent reviewer of the PCC. But I am deeply unimpressed by the PCC’s feeble response to this ‘inaccurate’ (to be extremely generous) article. </p>
<p>Presumably the PCC will soon be abolished, and we can only hope that post-Leveson there will be a more effective body. But I haven’t put the bunting out yet.</p>
<p>PS<br />
18th November. Another grossly misleading headline, this time in the <a href="http://www.dailymail.co.uk/news/article-2509629/Decade-Labour-saw-50-000-die-hospital.html#ixzz2lO5x9HBp">Daily Mail</a>: “<em>Decade of Labour 'saw 50,000 too many die in hospital'</em>”. They put the inaccurate statement in quotes, as if someone has actually claimed this. But nobody said it - this quote is purely a product of the imagination of the sub-editors. </p>
</div></div></div>Sun, 24 Nov 2013 12:57:30 +0000david7293 at http://understandinguncertainty.orghttp://understandinguncertainty.org/complaint-about-press-complaints-commission#commentsPress Complaints Commission decide '13,000 needless deaths' story was inaccurate
http://understandinguncertainty.org/press-complaints-commission-decide-13000-needless-deaths-story-was-inaccurate
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>I was one of a number of complainants to the Press Complaints Commission about the Sunday Telegraph story headlined <a href="http://www.telegraph.co.uk/health/heal-our-hospitals/10178296/13000-died-needlessly-at-14-worst-NHS-trusts.html"><em>13,000 died needlessly at 14 worst NHS trusts</em></a>, as the Telegraph journalists had been explicitly told by the originator of the figures, Professor Brian Jarman, that this was an inappropriate interpretation. My objections were expressed in an <a href="http://understandinguncertainty.org/files/2013bmj-needless.pdf">article in the British Medical Journal</a>. </p>
<p>The Press Complaints Commission has now told me that <em>“The Commission decided that the Sunday Telegraph had published significantly misleading information; however it had offered to take sufficient action to remedy the breach of the Code as required under the terms of Clause 1 (ii).”</em></p>
<p>This means that there is no official adjudication and no publication of the decision – this seems strange, and so I have reproduced below (with permission of the PCC) their decision.</p>
<p>The crucial finding was <em>“By attributing the number of “needless” deaths to a calculation made by Sir Brian Jarman, the newspaper had failed to take care not to publish inaccurate information in breach of Clause 1 (i). As such, a correction – published promptly and with due prominence – was required in accordance with the terms of Clause 1 (ii).”</em></p>
<p>Having published this inaccurate information, the Sunday Telegraph published some clarifications. <a href="http://www.telegraph.co.uk/health/heal-our-hospitals/10178296/13000-died-needlessly-at-14-worst-NHS-trusts.html">The online version</a> does now have the 'clarification' at the bottom, but as the image below shows, still had [on November 4th] the inaccurate headline. Extraordinary.</p>
<p> <img src="/sites/understandinguncertainty.org/files/13000-headline.png" width="652" height="205" alt="13000-headline.png" /></p>
<blockquote><h3>Commission’s decision in the case of<br />
Groves v The Daily Telegraph/ The Sunday Telegraph</h3>
<p>The complainant considered that the newspapers had inaccurately reported that an investigation, overseen by Sir Bruce Keogh, had revealed that thousands of patients had died “needlessly” at 14 NHS hospital trusts. </p>
<p>Clause 1 (i) (Accuracy) of the Editors’ Code of Practice states that “the press must take care not to publish inaccurate, misleading or distorted information”. Clause 1 (ii) makes clear that “a significant inaccuracy, misleading statement or distortion once recognised must be corrected promptly and with due prominence”.</p>
<p>The Commission noted that Sir Bruce Keogh’s investigation into 14 NHS hospital trusts had been preceded by investigations into the Mid Staffordshire NHS Foundation Trust. These investigations – overseen by the Healthcare Commission and, more recently, by Robert Francis QC – had been launched due to the trust’s above-average Hospital Standard Mortality Ratio (HSMR), mortality statistics calculated by Sir Brian Jarman, Director of the Dr Foster Intelligence Unit. In light of Robert Francis’s conclusions at Mid Staffordshire, the Health Secretary and the Prime Minister instructed Sir Bruce to carry out a review of an additional 14 hospital trusts with “persistently high mortality rates”.</p>
<p>The Commission noted the complainant’s concern that the Daily Telegraph had misleadingly stated that “an investigation found that thousands of patients died needlessly because of poor care” and the Sunday Telegraph had inaccurately said “in total Sir Brian [Jarman] calculated that up to 13,000 patients died needlessly”. Indeed, in reference to HSMR and SHMI (Summary Hospital-level Mortality) statistics, Sir Bruce Keogh had said in his report that “it is clinically meaningless and academically reckless to use such statistical measures to quantify actual numbers of avoidable deaths”. He had also quoted Robert Francis QC, who had said “it is in my view misleading and a potential misuse of the figures to extrapolate from them a conclusion that any particular number, or range of numbers, of deaths were caused or contributed to by inadequate care”. </p>
<p>However, as also stated in the Keogh report, the Health Secretary and the Prime Minister had instructed Sir Bruce to carry out this review with the rationale that “high mortality rates at Mid Staffordshire NHS Foundation Trust were associated with failures in all three dimensions of quality – clinical effectiveness, patient experience, and safety – as well as failures in professionalism, leadership and governance”. Although the “excess deaths” had not been described as “needless” by Sir Bruce Keogh or Sir Brian Jarman, the newspapers had been entitled to their interpretation of the investigation’s results. </p>
<p>The Commission noted that when presenting complex statistical information to non-specialist readers, newspapers will inevitably have to summarise information. The Code does not require the publication of exhaustive information. However, the Commission made clear that it is essential that newspapers interpret such statistical information accurately, and in a manner which is not misleading. In this instance, it was for the Commission to consider whether, in the context of each article as a whole, the newspapers had made clear that the quoted numbers related to statistical analysis of above-average death rates; they did not reflect the outcome of a study into the causes of individual deaths.</p>
<p>However, in the Sunday Telegraph’s article, the newspaper had stated that Sir Brian Jarman had “calculated that up to 13,000 patients died needlessly”. In fact, Sir Brian had not calculated the number of “needless” deaths; rather, he had calculated the number of deaths over and above what would have been expected. Indeed, as previously noted, Sir Bruce Keogh had warned against using HSMR statistics “to quantify actual numbers of avoidable deaths”. By attributing the number of “needless” deaths to a calculation made by Sir Brian Jarman, the newspaper had failed to take care not to publish inaccurate information in breach of Clause 1 (i). As such, a correction – published promptly and with due prominence – was required in accordance with the terms of Clause 1 (ii).</p>
<p>The newspaper had offered to amend the online version of its article so that the sentence “[I]n total Sir Brian calculated that up to 13,000 patients died needlessly in that period” was replaced by “[I]n total Sir Brian calculated that up to 13,000 more patients died in that period than would have been statistically expected”. It had also offered to append the following note:</p>
<p>Clarification<br />
We have been asked to make clear that, contrary to an earlier version of this report, Sir Brian Jarman’s findings reflected the number by which mortality figures exceeded what would have been statistically expected. He made no finding as to the causes of any deaths or whether they were “needless”.</p>
<p>In addition, the newspaper had offered to publish the following correction on page two of the newspaper:</p>
<p>Clarification<br />
Following our July 14 report “13,000 died needlessly at 14 worst NHS trusts” we have been asked to make clear that Sir Brian Jarman’s findings reflected the number by which mortality figures exceeded what would have been statistically expected. He made no finding as to the causes of any deaths or whether they were “needless”.</p>
<p>The Commission noted that the complainant considered that the newspaper should also amend the article’s headline and the reference to “up to 1,200” patients dying needlessly at Stafford Hospital. He had also requested that the newspaper refrains from using the word “needless” in relation to HSMR statistics in future. However, the Commission reiterated that the newspaper had been entitled to its interpretation of the results of both the Keogh and Francis investigations. Furthermore, the first line of the piece had made clear that the 13,000 deaths related to "excess deaths” since 2005. Taken in context with the article as a whole, and in light of the additional footnote, the Commission did not consider that a significantly misleading impression of the investigation’s findings had been created by the headline. The suggested amendment and correction had addressed the key point: Sir Brian Jarman’s calculation did not reflect the outcome of a study into the causes of individual deaths. As such, the Commission was satisfied that the newspaper had offered to take sufficient action to meet its obligations under Clause 1 (ii), and it instructed the newspaper to amend the article and to publish the correction without delay in order to set the record straight.</p>
<p>The Commission then turned to consider the Daily Telegraph article, headlined “NHS inquiry: Shaming of health service as care crisis is laid bare”. In this instance, the newspaper had not given a specific number of “needless deaths”. It had said that “an investigation found that thousands of patients died needlessly because of poor care”. It had also stated that the selected hospitals had been those with the “highest recent mortality rates”. Furthermore, the newspaper had taken care to refer to the mortality statistics as “excess deaths” and it had quoted Health Secretary Jeremy Hunt as having said “no statistics are perfect but mortality rates suggest that since 2005 thousands more people may have died than would normally be expected at the 14 trusts reviewed”. In addition, the newspaper had provided anecdotal evidence of the poor care that had been identified: “some risks to patients so severe that [inspectors] were forced to step in immediately”; “decisions were taken urgently to close operating theatres, [and to] suspend unsafe ‘out of hours’ services for critically ill patients”. In the print version, this piece had also been presented alongside the findings related to the individual trusts and had clearly identified the number of “excess deaths” attributed to each one. In this instance, the Commission was satisfied that the newspaper had not given the significantly misleading impression that the Keogh investigation had examined the causes of individual deaths. The newspaper had provided adequate statistical context for its assertion regarding the numbers of “needless” deaths and therefore the basis for the newspaper’s interpretation of the relationship between mortality statistics and the level of care provided by the 14 NHS hospital trusts had been clear. As such, no correction was required and this piece did not raise a breach of the Code.</p>
</blockquote>
</div></div></div><div class="field field-name-upload field-type-file field-label-hidden"><div class="field-items"><div class="field-item even"><table class="sticky-enabled">
<thead><tr><th>Attachment</th><th>Size</th> </tr></thead>
<tbody>
<tr class="odd"><td><span class="file"><img class="file-icon" alt="" title="application/pdf" src="/modules/file/icons/application-pdf.png" /> <a href="http://understandinguncertainty.org/sites/understandinguncertainty.org/files/2013bmj-needless.pdf" type="application/pdf; length=207688">2013bmj-needless.pdf</a></span></td><td>202.82 KB</td> </tr>
</tbody>
</table>
</div></div></div>Mon, 04 Nov 2013 18:32:53 +0000david7254 at http://understandinguncertainty.orghttp://understandinguncertainty.org/press-complaints-commission-decide-13000-needless-deaths-story-was-inaccurate#commentsNew content for GCSE Maths announced
http://understandinguncertainty.org/new-content-gcse-maths-announced
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>Following the consultation <a href="http://understandinguncertainty.org/probability-and-stats-gcse-maths">discussed previously on this blog</a>, the Department for Education has announced the <a href="https://www.gov.uk/government/publications/gcse-mathematics-subject-content-and-assessment-objectives">revised content for GCSE Mathematics</a>.</p>
<p>Compared to the current content, the most notable changes are (a) separation of probability and statistics, (b) removal of the data-cycle, (c) increased material.</p>
<p>The proposed content for probability is as follows:</p>
<blockquote><p>
<strong>Probability</strong> </p>
<ol><li>record, describe and analyse the frequency of outcomes of probability experiments using tables and frequency trees </li>
<li>apply ideas of randomness, fairness and equally likely events to calculate expected outcomes of multiple future experiments </li>
<li>relate relative expected frequencies to theoretical probability, using appropriate language and the 0 - 1 probability scale </li>
<li> apply the property that the probabilities of an exhaustive set of outcomes sum to one; apply the property that the probabilities of an exhaustive set of mutually exclusive events sum to one </li>
<li> understand that empirical unbiased samples tend towards theoretical probability distributions, with increasing sample size </li>
<li> enumerate sets and combinations of sets systematically, using tables, grids, Venn diagrams and tree diagrams </li>
<li> construct theoretical possibility spaces for single and combined experiments with equally likely outcomes and use these to calculate theoretical probabilities </li>
<li> calculate the probability of independent and dependent combined events, including using tree diagrams and other representations, and know the underlying assumptions </li>
<li> calculate and interpret conditional probabilities through representation using expected frequencies with two-way tables, tree diagrams and Venn diagrams. </li>
</ol></blockquote>
<p>From my personal perspective, it's good to see reference to '<em>frequency trees</em>', '<em>expected outcomes</em>' and '<em>expected frequencies</em>', since hopefully this will encourage the teaching of probability through expected frequencies. It's a shame that two <a href="http://understandinguncertainty.org/probability-and-stats-gcse-maths">suggestions in the consultation</a> were dropped: '<em>interpret risk through assigning values to outcomes (e.g. games, insurance)</em>' and '<em>calculate the expected outcome of a decision and relate to long-run average outcomes</em>'. But we can't have everything.</p>
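<p>As a sketch of why expected frequencies make conditional probability more transparent, consider a hypothetical screening example (the numbers below are invented purely for illustration): out of 10,000 people, 1% have a condition, the test picks up 90% of cases and wrongly flags 5% of non-cases.</p>

```python
# Expected-frequency approach to a conditional probability.
# All numbers are hypothetical, chosen only for illustration.
population = 10_000
cases = population * 0.01              # 100 people expected to have the condition
non_cases = population - cases         # 9,900 expected not to

true_positives = cases * 0.90          # 90 cases correctly flagged
false_positives = non_cases * 0.05     # 495 non-cases wrongly flagged

# P(condition | positive test) read straight off the expected
# frequencies, as in a two-way table or frequency tree
p_condition_given_positive = true_positives / (true_positives + false_positives)
print(round(p_condition_given_positive, 3))  # about 0.154
```

<p>The same answer comes from Bayes' theorem, but the frequency layout makes it visible that most positive tests come from the large group without the condition.</p>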
<p>For statistics it's </p>
<blockquote><p>
<strong>Statistics</strong> </p>
<ol>
<li>infer properties of populations or distributions from a sample, whilst knowing the limitations of sampling </li>
<li>interpret and construct tables, charts and diagrams, including frequency tables, bar charts, pie charts and pictograms for categorical data, vertical line charts for ungrouped discrete numerical data, tables and line graphs for time series data and know their appropriate use </li>
<li> construct and interpret diagrams for grouped discrete data and continuous data, i.e. histograms with equal and unequal class intervals and cumulative frequency graphs, and know their appropriate use</li>
<li> interpret, analyse and compare the distributions of data sets from univariate empirical distributions through:
<ul><li>appropriate graphical representation involving discrete, continuous and grouped data, including box plots </li>
<li>appropriate measures of central tendency (median, mean, mode and modal class) and spread (range, including consideration of outliers, quartiles and inter-quartile range) </li>
</ul></li>
<li> apply statistics to describe a population </li>
<li> use and interpret scatter graphs of bivariate data; recognise correlation and know that it does not indicate causation; draw estimated lines of best fit; make predictions; interpolate and extrapolate apparent trends whilst knowing the dangers of so doing</li>
</ol></blockquote>
<p>Compared to the consultation, box-plots and unequal-interval histograms have gone in, and fitting a straight line has come out. </p>
</div></div></div>Sat, 02 Nov 2013 11:32:52 +0000david7253 at http://understandinguncertainty.orghttp://understandinguncertainty.org/new-content-gcse-maths-announced#commentsProbability and stats feature strongly in 'Core maths' proposals for 16-18 year olds
http://understandinguncertainty.org/probability-and-stats-feature-strongly-core-maths-proposals-16-18-year-olds
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>The government is pushing ahead with proposals for a maths qualification to be taken by 16-18 year-olds who got at least a grade C in Maths GCSE but are not doing maths A level.</p>
<p><a href="https://www.gov.uk/government/news/new-maths-qualifications-to-boost-numbers-studying-maths-to-age-18">Further details</a> were released on October 8th by the Department for Education, coinciding with the release of a report by the <a href="http://www.acme-uk.org/news/news-items-repository/2013/10/expert-panel-presents-guidelines-for-new-core-mathematics-qualifications-(2)">Advisory Committee on Mathematics Education (ACME)</a> from its 'expert panel on core mathematics'.</p>
<p>This <a href="http://www.acme-uk.org/media/13699/final%2007october2013,%20expert%20panel%20on%20core%20mathematics%20report.pdf">report</a> includes the 'indicative content' contained in the table below</p>
<p><img src="/sites/understandinguncertainty.org/files/core-content.png" width="627" height="429" alt="core-content.png" /></p>
<p>The importance of probability and statistics is clear. Notable aspects include a focus on rough estimates, absolute and relative risk, natural frequencies, expectation, the interpretation of risk statements and the critiquing of quantitative evidence. In fact, just what we try to cover on this site!</p>
<h3>Statement of interest</h3>
<p>I am on the advisory board of the <a href="http://mei.org.uk/files/pdf/Mathematical%20_Problem_Solving_curriculum_press_release_311012.pdf">MEI project</a> to develop a problem-solving curriculum and materials for this group of students.</p>
</div></div></div>Mon, 14 Oct 2013 14:14:23 +0000david7236 at http://understandinguncertainty.orghttp://understandinguncertainty.org/probability-and-stats-feature-strongly-core-maths-proposals-16-18-year-olds#commentsSeptember 19th is Huntrodds day!
http://understandinguncertainty.org/september-19th-huntrodds-day
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>When on holiday at Whitby we took this photo of this extraordinary memorial to Mr and Mrs Huntrodds. </p>
<p>As you can read, they were both born on September 19th 1600, married on September 19th, had 12 children and then both died within 5 hours of each other on their joint 80th birthday on September 19th 1680. Now that's an impressive 'coincidence' - if it can be considered that. After all, they presumably chose to marry, and when to marry, so the really odd thing is when they died. But was there plague in Whitby in 1680? Did they have an accident? They must have been local characters, having the same birthday and being so old.</p>
<p><img src="/sites/understandinguncertainty.org/files/huntrodds_0.jpeg" width="640" height="480" alt="huntrodds_0.jpeg" /></p>
<p>A modern equivalent is the wonderful couple Joyce and Ron Pulsford who <a href="http://www.littlehamptongazette.co.uk/news/top-stories/latest/it-s-lucky-eight-for-pagham-couple-1-1492783">were both 80 on 08.08.08</a>. But they survived their birthday.</p>
<p>Of course this brings up the old question of whether people are more likely to die on their birthday, <a href="http://www.bbc.co.uk/news/world-18626157">which I have previously queried</a>. Hugh Aldersey-Williams recently pointed out this quote to me from the "17th century physician, philosopher, writer and mythbuster Sir Thomas Browne", who in his<a href="http://ebooks.adelaide.edu.au/b/browne/thomas/friend/"> 'Letter to a Friend' said</a> :</p>
<blockquote><p>Nothing is more common with Infants than to dye on the day of their Nativity, to behold the worldly Hours and but the Fractions thereof; and even to perish before their Nativity in the hidden World of the Womb, and before their good Angel is conceived to undertake them. But in Persons who out-live many Years, and when there are no less than three hundred sixty five days to determine their Lives in every Year; that the first day should make the last, that the Tail of the Snake should return into its Mouth precisely at that time, and they should wind up upon the day of their Nativity, is indeed a remarkable Coincidence, which tho Astrology hath taken witty pains to salve, yet hath it been very wary in making Predictions of it.</p></blockquote>
<p>Note the alchemical references to the <a href="http://en.wikipedia.org/wiki/Ouroboros">Ouroboros</a>. </p>
<p>So maybe the preponderance of deaths on birthdays is simply due to registrations of babies who die soon after birth? But even though Sir Thomas thought it a 'remarkable Coincidence' if an adult did die on their birthday, this is <a href="http://en.wikipedia.org/wiki/Thomas_Browne">exactly what he did </a>on 19th October 1682, his 77th birthday. And just 2 years after the Huntrodds died.</p>
</div></div></div>Sat, 21 Sep 2013 17:01:36 +0000david7210 at http://understandinguncertainty.orghttp://understandinguncertainty.org/september-19th-huntrodds-day#commentsProbability and stats in GCSE Maths
http://understandinguncertainty.org/probability-and-stats-gcse-maths
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>The current consultation on <a href="https://www.education.gov.uk/consultations/index.cfm?action=consultationDetails&consultationId=1911&external=no&menu=1">GCSE subject content and assessment objectives</a> for Mathematics GCSE features major changes for probability and statistics. </p>
<p>I encourage everyone with an interest to respond (before 20th August): here is my personal take on the topic.</p>
<p>The proposals are as follows:</p>
<blockquote><h3>Probability</h3>
<ul>
<li>record and describe the frequency of outcomes of probability experiments using tables and frequency trees </li>
<li>apply ideas of randomness, fairness and equally likely events to calculate expected outcomes of multiple future experiments </li>
<li>relate relative expected frequencies to theoretical probability, using appropriate language and the 0-1 scale </li>
<li>apply the property that the probabilities of an exhaustive set of mutually exclusive outcomes sum to one </li>
<li>enumerate sets and combinations of sets systematically, using tables, grids, tree diagrams and Venn diagrams </li>
<li>construct theoretical possibility spaces for single and combined events with equally likely and mutually exclusive outcomes and use these to calculate theoretical probabilities </li>
<li>calculate the probability of independent and dependent combined events, including tree diagrams and other representations and know the underlying assumptions </li>
<li>calculate and interpret conditional probabilities through representation using two-way tables, tree diagrams, Venn diagrams and by using the formula </li>
<li>understand that empirical samples tend towards theoretical probability distributions, with increasing sample size and with lack of bias </li>
<li>interpret risk through assigning values to outcomes (e.g. games, insurance) </li>
<li>calculate the expected outcome of a decision and relate to long-run average outcomes. </li>
</ul><h3>Statistics</h3>
<ul>
<li>apply statistics to describe a population or a large data set, inferring properties of populations or distributions from a sample, whilst knowing the limitations of sampling </li>
<li>construct and interpret appropriate charts and diagrams, including bar charts, pie charts and pictograms for categorical data, and vertical line charts for ungrouped discrete numerical data </li>
<li>construct and interpret diagrams for grouped discrete data and continuous data, i.e. histograms with equal class intervals and cumulative frequency graphs </li>
<li>interpret, analyse and compare univariate empirical distributions through:
<ul>
<li>appropriate graphical representation involving discrete, continuous and grouped data </li>
<li>appropriate measures of central tendency, spread and cumulative frequency (median, mean, range, quartiles and inter-quartile range, mode and modal class) </li>
</ul></li>
<li>describe relationships in bivariate data: sketch trend lines through scatter plots; calculate lines of best fit; make predictions; interpolate and extrapolate trends.
</li>
</ul></blockquote>
<p>In addition, it is proposed to provide the following in the formulae sheet:</p>
<blockquote><h3>Probability</h3>
<p>Where $P(A)$ is the probability of outcome $A$ and $P(B)$ is the probability of outcome $B$:<br />
$$ P (A \hbox{ or } B) = P(A )+ P(B ) - P(A \hbox{ and } B )$$<br />
$$P(A \hbox{ and } B ) = P(A \hbox{ given } B ) P(B )$$
</p></blockquote>
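<p>These two formulae can be checked on a simple concrete case. Here is a minimal sketch using a single roll of a fair die, with exact arithmetic via Python's fractions module (the example is mine, not part of the proposals):</p>

```python
from fractions import Fraction as F

# One roll of a fair die:
# A = "the roll is even", B = "the roll is greater than 4"
p_a = F(3, 6)          # A = {2, 4, 6}
p_b = F(2, 6)          # B = {5, 6}
p_a_and_b = F(1, 6)    # A and B = {6}
p_a_or_b = F(4, 6)     # A or B = {2, 4, 5, 6}
p_a_given_b = F(1, 2)  # of {5, 6}, only 6 is even

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
assert p_a_or_b == p_a + p_b - p_a_and_b
# Multiplication rule: P(A and B) = P(A given B) x P(B)
assert p_a_and_b == p_a_given_b * p_b
```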
<p>Compared to the current curriculum (shown at the bottom of this blog), the new proposals</p>
<ul><li>Split probability and statistics</li>
<li>In probability:
<ul><li>Emphasise multiple representations</li>
<li>Give additional attention to conditional probabilities</li>
<li>Include risk and expectation</li>
</ul></li>
<li>In statistics:
<ul><li>Drop histograms with unequal intervals</li>
<li>Drop the ‘data cycle’ (although ‘limitations of sampling’ is mentioned)</li>
<li>Include calculating the line of best fit</li>
</ul></li>
</ul><p>Perhaps the most controversial element is the non-inclusion of the ‘data-cycle’ (or 'statistics cycle') of problem analysis, data collection, data presentation and data analysis. There has been a long argument within the statistics community about whether this belongs in GCSE Mathematics: the 2004 Smith Inquiry into post-14 maths education, <a href="http://www.mathsinquiry.org.uk/report/">Making Mathematics Count</a>, recommended </p>
<blockquote><p>The Inquiry recommends that there be a radical re-look at<br />
this issue and that much of the teaching and learning of Statistics and<br />
Data Handling would be better removed from the mathematics timetable<br />
and integrated with the teaching and learning of other disciplines (eg<br />
biology or geography). The time restored to the mathematics timetable<br />
should be used for acquiring greater mastery of core mathematical<br />
concepts and operations.</p></blockquote>
<p>Indeed, the proposed <a href="https://www.gov.uk/government/consultations/gcse-subject-content-and-assessment-objectives">Science GCSE subject content and assessment objectives</a> now includes:</p>
<blockquote><ul>
<li>apply the cycle of collecting, presenting and analysing data, including:</li>
<ul>
<li>present observations and data using appropriate methods</li>
<li>carry out and represent mathematical and statistical analysis</li>
<li>represent random distributions of results and estimations of uncertainty</li>
<li>interpret observations and data, including identifying patterns and trends, make inferences and draw conclusions</li>
<li>present reasoned explanations including of data in relation to hypotheses</li>
<li>evaluate data</li>
<li>use an appropriate number of significant figures in calculations</li>
</ul><li>communicate the scientific rationale for investigations, methods used, findings and reasoned conclusions through written and electronic reports and presentations.</li>
</ul></blockquote>
<p>However the Royal Statistical Society's recently-commissioned <a href="http://www.rss.org.uk/site/cms/contentCategoryView.asp?category=86">Porkess Report</a> said</p>
<blockquote><ul>
<li><strong>Recommendation 5: </strong>School and college mathematics departments should ensure they have the expertise to be the authorities on statistics within their institutions. Mathematics departments should be centres of excellence for statistics, providing guidance on correct usage and good practice.</li>
<li><strong>Recommendation 6:</strong> Under present conditions, statistics is best placed in the mathematics curriculum.</li>
</ul></blockquote>
<p>Essentially the view is that if this vital element were not in Mathematics, it would either not be taught at all or be taught badly.</p>
<p>This is tricky. My personal view is that the ‘data cycle’ is absolutely vital, but that it is better placed within an understanding of the ‘scientific method’ than within core mathematics. I feel that GCSE Mathematics should provide the tools for analysis that can be used in empirical investigations, but that techniques for carrying out those investigations should not be part of the assessment criteria. Obviously there is opportunity for cross-subject activity, say with Geography or Science, featuring experimental design, data-collection, analysis, presentation and interpretation of real-world numerical evidence. It is inevitably tempting to look to a different type of qualification that takes a broader cross-disciplinary perspective, but we appear stuck with the rigid subject demarcations of GCSEs.</p>
<p>At A-level the link between probability and formal statistical inference can be revealed in all its glory. And if a post-16, non-A-level maths qualification is developed, then this could also include real-world investigation into the appropriate interpretation of numerical evidence.</p>
<h3>The current specification<br /></h3>
<p>This is given by the Ofqual<br /><a href="http://www2.ofqual.gov.uk/downloads/category/192-gcse-subject-criteria">GCSE Subject Criteria for Mathematics<br /></a> </p>
<blockquote><h3>Statistics and probability<br /></h3>
<ul>
<li>understand and use statistical problem solving process/handling data cycle; </li>
<li>identify possible sources of bias; </li>
<li>design an experiment or survey; </li>
<li>design data-collection sheets, distinguishing between different types of data; </li>
<li>extract data from printed tables and lists; </li>
<li>design and use two-way tables for discrete and grouped data; </li>
<li>produce charts and diagrams for various data types; </li>
<li>calculate median, mean, range, quartiles and inter-quartile range, mode and modal class; </li>
<li>interpret a wide range of graphs and diagrams and draw conclusions; </li>
<li>look at data to find patterns and exceptions; </li>
<li>recognise correlation and draw and/or use lines of best fit by eye, understanding what these represent; </li>
<li>compare distributions and make inferences; </li>
<li>understand and use the vocabulary of probability and the probability scale; </li>
<li>understand and use estimates or measures of probability from theoretical models (including equally likely outcomes), or from relative frequency; </li>
<li>list all outcomes for single events, and for two successive events, in a systematic way and derive related probabilities; </li>
<li>identify different mutually exclusive outcomes and know that the sum of the probabilities of all these outcomes is 1; </li>
<li>know when to add or multiply two probabilities: if A and B are mutually exclusive, then the probability of A or B occurring is P(A) + P(B), whereas if A and B are independent events, the probability of A and B occurring is P(A) . P(B); </li>
<li>use tree diagrams to represent outcomes of compound events, recognising when events are independent; </li>
<li>compare experimental data and theoretical probabilities; </li>
<li>understand that if they repeat an experiment, they may – and usually will – get different outcomes, and that increasing sample size generally leads to better estimates of probability and population characteristics. </li>
</ul></blockquote>
<h3>Conflict of Interest<br /></h3>
<p>I am one of the <a href="https://media.education.gov.uk/assets/files/pdf/l/lists%20of%20commentators%20-%20final.pdf">many people consulted</a> by the Department for Education.</p>
</div></div></div>Sat, 03 Aug 2013 16:23:43 +0000david7159 at http://understandinguncertainty.orghttp://understandinguncertainty.org/probability-and-stats-gcse-maths#commentsFatality risk on Boris-bikes?
http://understandinguncertainty.org/fatality-risk-boris-bikes
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>I was very saddened by the <a href="http://www.bbc.co.uk/news/uk-england-london-23207691">death on Friday of a Boris-bike rider</a> in Whitechapel High Street, particularly as I am a frequent and enthusiastic user of the scheme. But as a statistician, I also immediately wondered how surprised I should be about the fact that this was the first fatality of the bikes. My conclusion, using a very rapid and crude analysis, is that it does not suggest that Boris-bikes are of higher risk than average cycling, and if anything we have been fortunate that it has taken this long for the first fatality. Of course this does not lessen the tragedy of the event.</p>
<p><a href="http://www.tfl.gov.uk/roadusers/cycling/20389.aspx ">Transport for London</a> report that between December 2010 and 31st May 2013 there were around 22,000,000 Barclays Cycle Hire (the official name) trips in London. There were 750,000 trips in May, so let’s assume that by July 7th there were around 23,000,000 trips. These journeys were an average of 20 minutes during the week and 28 minutes at the weekend, so conservatively we could assume 1.5 miles for each trip, giving a total of at least 34,000,000 miles cycled on Boris bikes since the opening of the scheme to non-members.</p>
<p>The Department for Transport reports that in 2011 there were 22 cyclist deaths per billion km (620,000,000 miles), which works out as one cycling fatality expected every 620,000,000/22 = 28,000,000 miles [see page 234 of <a href="https://www.gov.uk/government/publications/reported-road-casualties-great-britain-annual-report-2011 ">this report</a>, eventually found through the shambolic chaos of the government statistics web-links]. Of course Boris-bike users are not average: they are probably at somewhat higher risk, since they ride in London and include inexperienced tourists, but this may be compensated by lower risk from their not being very old or young, and from cycling extremely heavy and slow bikes. They also rarely wear cycle helmets, but I am not getting into that <a href="http://www.bmj.com/content/346/bmj.f3817?ijkey=I5vHBog6FhaaLzX&keytype=ref ">tricky area</a>.</p>
<p>If we very crudely assume these factors cancel out and Boris bike trips are of average risk, then to have a fatal accident after 34,000,000 miles is, unfortunately, not surprising. In fact, very roughly, there is perhaps less than 30% chance that it would have taken this long. </p>
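<p>For anyone who wants to reproduce that rough figure: assuming (crudely, as above) that fatalities occur as a Poisson process at the national average rate of one per 28,000,000 miles, the chance of seeing no fatality in 34,000,000 Boris-bike miles is a one-line calculation:</p>

```python
import math

miles_cycled = 34_000_000        # estimated Boris-bike miles so far
miles_per_fatality = 28_000_000  # national average for cycling
expected_fatalities = miles_cycled / miles_per_fatality  # about 1.21

# Poisson probability of zero events with this expectation
p_no_fatality_yet = math.exp(-expected_fatalities)
print(round(p_no_fatality_yet, 2))  # about 0.3
```

<p>So under these crude assumptions there was a bit under a 30% chance of getting this far without a fatality.</p>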
<p>So I am not very surprised to hear of this tragic accident, but do feel shocked that it happened on a so-called cycle ‘superhighway’. My personal opinion, as someone who has negotiated that particular stretch of road with some trepidation, is that far more needs to be done to make cycle-friendly and protective routes in London.</p>
</div></div></div>Sun, 07 Jul 2013 08:47:55 +0000david7130 at http://understandinguncertainty.orghttp://understandinguncertainty.org/fatality-risk-boris-bikes#commentsSpeed cameras, regression-to-the-mean, and the Daily Mail (again)
http://understandinguncertainty.org/speed-cameras-regression-mean-and-daily-mail-again
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>It was interesting to hear ‘regression-to-the-mean’ being discussed on the Today programme this morning, even if the quality of the debate wasn’t great. The issue was the effectiveness of speed cameras, which tend to get installed after a spate of accidents. Since bad luck does not last, accidents tend to fall after such a ‘blip’, and this fall is generally attributed to the speed camera, whereas it would have happened anyway: this is what is meant by ‘regression-to-the-mean’.</p>
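<p>Regression-to-the-mean is easy to demonstrate by simulation. In the sketch below (hypothetical numbers, nothing to do with the RAC Foundation's actual data) every site has exactly the same underlying accident rate; we 'install cameras' wherever there has just been a spate of accidents, and accident counts at those sites fall anyway:</p>

```python
import math
import random

def poisson(lam, rng):
    # Knuth's method for drawing from a Poisson distribution
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while p > threshold:
        k += 1
        p *= rng.random()
    return k - 1

rng = random.Random(1)
true_rate = 5.0   # the same underlying accident rate at every site
sites = 10_000

before = [poisson(true_rate, rng) for _ in range(sites)]
after = [poisson(true_rate, rng) for _ in range(sites)]

# 'Install a camera' at every site that just had a bad year (8+ accidents)...
flagged = [i for i in range(sites) if before[i] >= 8]
mean_before = sum(before[i] for i in flagged) / len(flagged)
mean_after = sum(after[i] for i in flagged) / len(flagged)

# ...and accidents at those sites drop back towards the true rate of 5,
# even though the 'cameras' here do nothing whatsoever
print(round(mean_before, 1), round(mean_after, 1))
```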
<p>The <a href="http://www.racfoundation.org/assets/rac_foundation/content/downloadables/speed_camera_data-allsop-may2013.pdf ">report from the RAC Foundation</a> tried to deal with this by essentially ignoring the 3 years before the camera was installed, and so comparing the post-installation accidents with those more than 3 years beforehand, and simultaneously allowing for overall changes in accidents over time. Unfortunately the report is not very clearly written, more discussing how to approach and analyse the (limited) data than aiming to provide definitive results. Although they helpfully provide the equations for the models being fitted, there is no executive summary and you have to search quite hard to find the crucial number flagged up for the media: the estimated 27% reduction in accidents causing fatal or severe injuries (page 32).</p>
<p>I thought the analysis seemed quite reasonable until I noticed that on page 3 it defines a baseline year as</p>
<blockquote><p>‘more than three full years before the camera was established <em>or the year during which it was established</em>’</p></blockquote>
<p>It seems very strange to include the transitional year as a baseline – surely it could just be excluded? Later on the report says that if the start-months were January or December, the year in which the camera was installed was treated as a ‘camera’ or ‘within 3-year pre-camera’ year respectively, but I am suspicious that for the remaining 10 months this could mean that some random-high accident rates could still be included in the baseline.</p>
<p>However, what is really shocking is the grossly misleading coverage of the Daily Mail, with the headline,</p>
<blockquote><p><em><br /><a href="http://www.dailymail.co.uk/news/article-2337208/Speed-cameras-increase-risk-fatal-crashes-New-RAC-investigation-raises-doubts-usefulness.html ">Speed cameras 'increase risk of serious or fatal crashes': New RAC investigation raises doubts over their usefulness</a></em>.</p></blockquote>
<p>This is a blatant mis-representation of the report and its findings, focusing solely on the 21 cameras where an increase was estimated, and ignoring the 530 where it wasn’t, as clearly shown in the table the Daily Mail so helpfully reproduce! They should be ashamed of themselves. </p>
</div></div></div>Fri, 07 Jun 2013 10:55:08 +0000david6948 at http://understandinguncertainty.orghttp://understandinguncertainty.org/speed-cameras-regression-mean-and-daily-mail-again#commentsHow can 2% become 20%?
http://understandinguncertainty.org/how-can-2-become-20
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>The <a href="http://www.dailymail.co.uk/health/article-2335397/Statins-weaken-muscles-joints-Cholesterol-drug-raises-risk-problems-20-cent.html ">Daily Mail headline</a> below is unequivocal – statins cause a 20% increase in muscle problems.</p>
<p> <img src="/sites/understandinguncertainty.org/files/statin-muscles-mail.jpg" width="651" height="372" alt="statin-muscles-mail.jpg" /></p>
<p>Unfortunately, the ‘20%’ is factually incorrect - the study on which this story is based claims that taking statins increased the risk of muscle problems from 85% to 87%. And even that claim is highly dubious. How can the Daily Mail get it so wrong? </p>
<p>The ‘20%’ is a basic statistical error promoted by a misleading <a href="http://archinte.jamanetwork.com/article.aspx?articleid=1691918 ">abstract</a> and <a href="http://media.jamanetwork.com/news-item/musculoskeletal-conditions-injuries-may-be-associated-with-statin-use/ ">press release</a> from JAMA Internal Medicine – associated with the Journal of the American Medical Association, a (supposedly) reputable source. The authors estimated an ‘odds ratio’ of 1.19 for musculoskeletal problems, which the Daily Mail interpreted as a 20% increased risk. I’m afraid we need to get a bit technical now. An odds ratio is a standard measure that statisticians and epidemiologists (yes, them again) use to measure an association between an exposure (here statins) and an event (muscle problems). It is defined as the odds of the event given the exposure, divided by the odds without the exposure. The crucial thing is the use of odds, not risk, where odds is the probability of the event divided by the probability of the event not occurring (why statisticians should use this bizarre measure is another story – see for example this <a href="http://en.wikipedia.org/wiki/Odds_ratio">Wikipedia description</a>). </p>
<p>Table 4 of the paper (not reported in the abstract) reports risks with and without statins of 87% vs 85%, which translate to odds of 0.87/0.13 = 6.7 and 0.85/0.15 = 5.7. The odds ratio is therefore 6.7/5.7 = 1.18 (their figure of 1.19 involved some adjustment for other factors). Alternatively, the risk ratio was 0.87/0.85 = 1.02, a 2% relative change, while the difference in absolute risks was 0.87 – 0.85 = 2%. The <a href="http://www.abpi.org.uk/our-work/library/guidelines/Pages/default.aspx">Code of Practice for the British Pharmaceutical Industry</a> has banned the reporting of relative risk without also giving the change in absolute risk. Why this is still considered acceptable within epidemiological papers is beyond me. </p>
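<p>The arithmetic in the paragraph above can be laid out explicitly; a minimal sketch:</p>

```python
# Risks of musculoskeletal problems reported in Table 4 of the paper
risk_statin = 0.87
risk_no_statin = 0.85

# Odds = probability of the event / probability of it not occurring
odds_statin = risk_statin / (1 - risk_statin)           # 0.87/0.13, about 6.7
odds_no_statin = risk_no_statin / (1 - risk_no_statin)  # 0.85/0.15, about 5.7

odds_ratio = odds_statin / odds_no_statin       # about 1.18 - the Mail's "20%"
risk_ratio = risk_statin / risk_no_statin       # about 1.02 - a 2% relative change
risk_difference = risk_statin - risk_no_statin  # 0.02 - 2 percentage points absolute

print(round(odds_ratio, 2), round(risk_ratio, 2), round(risk_difference, 2))
```

<p>The odds ratio of 1.18-1.19 is what gets read as '20% increased risk', when the risk itself has moved by only 2 percentage points.</p>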
<p>And such a tiny difference, in a very common problem, could be due to all sorts of confounding factors that were not allowed for. In particular, people on statins are likely to visit their doctor more, who may then investigate other symptoms, as the authors admit in their discussion. So they have not shown this difference was due to statins.</p>
<p>It is difficult to know who is most to blame here – the authors for producing a misleading abstract without the key information, JAMA Internal Medicine, or the Daily Mail. Personally, I feel that JAMA Internal Medicine is most responsible, for not properly refereeing the paper, and producing a press release that invited misunderstanding and distortion. </p>
</div></div></div>Tue, 04 Jun 2013 13:57:16 +0000david6943 at http://understandinguncertainty.orghttp://understandinguncertainty.org/how-can-2-become-20#commentsCourt of Appeal bans Bayesian probability (and Sherlock Holmes)
http://understandinguncertainty.org/court-appeal-bans-bayesian-probability-and-sherlock-holmes
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><blockquote><p>..when you have eliminated the impossible, whatever remains, however improbable, must be the truth<br />
(Sherlock Holmes in The Sign of the Four, ch. 6, 1890)</p></blockquote>
<p>In a <a href="http://www.bailii.org/ew/cases/EWCA/Civ/2013/15.html">recent judgement </a>the English Court of Appeal has not only rejected the Sherlock Holmes doctrine shown above, but also denied that probability can be used as an expression of uncertainty for events that have either happened or not.</p>
<p>The case was a civil dispute about the cause of a fire, and concerned an appeal against a decision in the High Court by Judge Edwards-Stuart. Edwards-Stuart had essentially concluded that the fire had been started by a discarded cigarette, even though this seemed an unlikely event in itself, because the other two explanations were even more implausible. The Court of Appeal rejected this approach, although it still supported the overall judgement and dismissed the appeal - commentaries on this case have appeared <a href="http://www.lexology.com/library/detail.aspx?g=471d7904-20d2-4fdb-a061-8d49b10de60d&l=7HVPC65"> here </a> and <a href="http://www.12kbw.co.uk/cases/commentary/id/161/">here</a>.</p>
<p>But it's the quotations from the judgement that are so interesting:</p>
<blockquote><p>Sometimes the "balance of probability" standard is expressed mathematically as "50 + % probability", but this can carry with it a danger of pseudo-mathematics, as the argument in this case demonstrated. When judging whether a case for believing that an event was caused in a particular way is stronger than the case for not so believing, the process is not scientific (although it may obviously include evaluation of scientific evidence) and to express the probability of some event having happened in percentage terms is illusory.
</p></blockquote>
<p>The idea that you can assign probabilities to events that have already occurred, but where we are ignorant of the result, forms the basis for the Bayesian view of probability. Put very broadly, the 'classical' view of probability is in terms of genuine unpredictability about future events, popularly known as 'chance' or 'aleatory uncertainty'. The Bayesian interpretation allows probability also to be used to express our uncertainty due to our ignorance, known as 'epistemic uncertainty', and popularly expressed as betting odds. Of course there are all gradations, from pure chance (think radioactive decay) to processes assumed to be pure chance (lottery draws), to future events whose odds depend on a mixture of genuine unpredictability and ignorance of the facts (whether Oscar Pistorius will be convicted of murder), to pure epistemic uncertainty (whether Oscar Pistorius knowingly shot his girlfriend).</p>
<p>The judges went on to say:</p>
<blockquote><p>The chances of something happening in the future may be expressed in terms of percentage. Epidemiological evidence may enable doctors to say that on average smokers increase their risk of lung cancer by X%. But you cannot properly say that there is a 25 per cent chance that something has happened: Hotson v East Berkshire Health Authority [1987] AC 750. Either it has or it has not.
</p></blockquote>
<p>So according to this judgement, it would apparently not be reasonable in a court to talk about the probability of Kate and William's baby being a girl, since that is already decided as true or false (but see note added below). This seems extraordinary.</p>
<p>Part of the problem may be the judges' use of the word 'chance' to describe epistemic uncertainty about whether something has happened or not - this would be unusual usage now (even though Thomas Bayes used 'chance' in this sense). If they had used the term 'probability' perhaps their quote above would seem more clearly unreasonable. </p>
<p>Anyway, I teach the Bayesian approach to post-graduate students attending my 'Applied Bayesian Statistics' course at Cambridge, and so I must now tell them that the entire philosophy behind their course has been declared illegal in the Court of Appeal. I hope they don't mind.</p>
<p>(Note added 1st March 2013: <a href="http://sports.williamhill.com/bet/en-gb/betting/e/2586242/Name%2dof%2dWilliam%2d%26%2dKate%2ds%2dfirst%2dbaby.html">William Hill </a> are currently offering 1000-1 against <em>Chardonnay</em> as the name of the potential future monarch).</p>
</div></div></div>
Mon, 25 Feb 2013 09:26:09 +0000
david
6817 at http://understandinguncertainty.org
http://understandinguncertainty.org/court-appeal-bans-bayesian-probability-and-sherlock-holmes#comments

What's more dangerous - the bute or the burger?
http://understandinguncertainty.org/whats-more-dangerous-bute-or-burger
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>There is reasonable public outrage at possible criminal conspiracies to adulterate meat products with horsemeat, and additional concerns raised about the presence of the anti-inflammatory known as bute.</p>
<p>While not in any way questioning this concern about adulteration with a chemical compound, it is helpful to get a sense of magnitude. When bute was given as a human medicine, it was reported to be associated with a serious adverse reaction in 1 in 30,000 (over a whole course of treatment), but at a dose giving concentrations at least 4,000 times that arising from eating a diet of horse meat - see the excellent information from the <a href="http://www.sciencemediacentre.org/expert-reaction-to-the-continuing-horsemeat-story-and-bute-phenylbutazone/">Science Media Centre</a>.</p>
<p>So, making the heroic assumption of a linear no-threshold dose-response, we might very roughly assign a pro-rata risk of a serious event of around 1 in 100,000,000 per burger.</p>
<p>Compare that with the risk from the meat itself. There is good evidence that red meat consumption is associated with an <a href="http://www.nhs.uk/Livewell/Goodfood/Pages/red-meat.aspx">increased risk of bowel cancer</a>, and specifically a <a href="http://archinte.jamanetwork.com/article.aspx?articleid=1134845">large recent study from Harvard</a> associated a daily habit of 80g (3.5 oz) of red meat with an increased all-cause mortality rate of 13% - I recently showed in this <a href="http://www.natap.org/2012/newsUpdates/bmj.e8223.pdf">British Medical Journal paper </a>that this was as if, pro-rata, each portion of red meat was associated with ½ hour loss in life-expectancy, around a 1,000,000th of a young adult’s future life.</p>
<p>So my rough guess is that for a burger made out of horse-meat containing bute - or indeed any kind of red meat - the burger itself carries around 100 times the apparent risk of the bute. Even taking into account that the bute reaction would occur more quickly than any harm from the red meat, this is still a notable disparity.</p>
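<p>The arithmetic behind these two estimates can be laid out explicitly. A sketch using the figures quoted above; the roughly 57-year remaining lifespan of a young adult is my own rounding assumption:</p>

```python
# Bute: a 1-in-30,000 serious reaction per course of treatment, at doses
# at least 4,000 times the dietary exposure; a linear no-threshold
# extrapolation then gives a pro-rata risk per burger.
serious_reaction_risk = 1 / 30_000
dose_ratio = 4_000
bute_risk_per_burger = serious_reaction_risk / dose_ratio   # ~1 in 120 million

# Red meat: half an hour of life expectancy lost per portion, as a
# fraction of a young adult's remaining life (assumed ~57 years here).
remaining_hours = 57 * 365.25 * 24
meat_loss_fraction = 0.5 / remaining_hours                  # ~1 in a million

# The burger's apparent risk exceeds the bute's by roughly two orders
# of magnitude, matching the "around 100 times" in the text.
ratio = meat_loss_fraction / bute_risk_per_burger
```

<p>The two quantities are not strictly commensurable (a probability of a serious event versus a pro-rata loss of life expectancy), which is why the comparison is offered only as a rough sense of scale.</p>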
<p>Of course I know very well that people, including myself, feel very differently about risks that are chosen as part of daily life and appear ‘natural’ than about those imposed by outside (probably criminal) agencies and involving unnatural substances. I fully respect those feelings, but I still believe some perspective is valuable.</p>
</div></div></div>
Fri, 15 Feb 2013 08:28:12 +0000
david
6811 at http://understandinguncertainty.org
http://understandinguncertainty.org/whats-more-dangerous-bute-or-burger#comments

Squaring the square, in glass
http://understandinguncertainty.org/squaring-square-glass
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>Here is my latest stained glass effort, seen on a snowy day. </p>
<p><img src="/sites/understandinguncertainty.org/files/trinity-glass2-small.jpg" width="600" height="450" alt="trinity-glass2-small.jpg" /></p>
<p>It is a 'square of squares', where all the constituent squares are of different sizes. Here are the dimensions - </p>
<p><img src="/sites/understandinguncertainty.org/files/sqsqbig.png" width="400" height="400" alt="sqsqbig.png" /></p>
<p>It is copied from the logo of the <a href="https://www.srcf.ucam.org/tms/about-the-tms/the-squared-square/#2">Trinity Mathematical Society</a>, who point out that it is the <em>unique smallest simple squared square (smallest in that it uses the fewest squares, and simple in that no proper subset of the squares of size at least 2 forms a rectangle).</em> It was proved to be the smallest such square by Duijvestijn in 1978, but this was by exhaustive computer search, which seems a bit like cheating.</p>
<p>There is a fine <a href="http://en.wikipedia.org/wiki/Squaring_the_square">Wikipedia article </a>which contains more than you could ever wish to know about squaring the square.</p>
<h2>Challenge</h2>
<p>I wanted to use only 4 colours, without any square touching another of the same colour, and of course I knew this was possible thanks to the 4-colour theorem. But I also wanted the four large outer squares to be 'white' (in order to increase the Mondrian appeal). It took some effort and trial-and-error to find a 4-colouring with this property. Are there others?</p>
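<p>The question can be settled by brute force. Below is a sketch of a backtracking counter; note that the adjacency list is a small placeholder graph (a 5-vertex wheel), not the actual contact graph of the 21 squares, which would have to be read off the diagram, and the fixed 'white' node stands in for the four large outer squares:</p>

```python
# Placeholder contact graph -- NOT the real adjacency of the 21-square
# dissection; substitute the true graph read off the diagram.
adjacency = {
    0: [1, 2, 3], 1: [0, 2, 4], 2: [0, 1, 3, 4], 3: [0, 2, 4], 4: [1, 2, 3],
}
colours = ["white", "red", "blue", "yellow"]
fixed = {0: "white"}     # squares whose colour is forced in advance

def count_colourings(assignment, remaining):
    """Count proper colourings extending `assignment` over `remaining` nodes."""
    if not remaining:
        return 1
    node, rest = remaining[0], remaining[1:]
    total = 0
    for c in colours:
        # A colour is legal if no already-coloured neighbour uses it.
        if all(assignment.get(nb) != c for nb in adjacency[node]):
            assignment[node] = c
            total += count_colourings(assignment, rest)
            del assignment[node]   # backtrack
    return total

free = [n for n in adjacency if n not in fixed]
n_ways = count_colourings(dict(fixed), free)
```

<p>With the real 21-node contact graph and the four outer squares pinned to white, the same search would settle whether the colouring in the window is unique; even naively, the pruned search over 17 free squares is tiny.</p>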
</div></div></div>
Tue, 22 Jan 2013 18:24:28 +0000
david
6789 at http://understandinguncertainty.org
http://understandinguncertainty.org/squaring-square-glass#comments

Alcohol in pregnancy and IQ of children
http://understandinguncertainty.org/alcohol-pregnancy-and-iq-children
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>Some of the coverage of yesterday's story about drinking in pregnancy and IQ of children was not entirely accurate. The Times reported that '<em>women who drink even a couple of glasses of wine a week during pregnancy are risking a two-point drop in their child's IQ</em>', and '<em>children whose mothers drank between 1 and 6 units a week - up to three large glasses of wine - had IQs about two points lower</em>' (than mothers who did not drink). </p>
<p>But let's look at Table 3 of the paper, which is available <a href="http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0049407">here</a>.</p>
<p><img src="/sites/understandinguncertainty.org/files/alcohol-IQ-table3.jpg" width="700" height="305" alt="alcohol-IQ-table3.jpg" /></p>
<p>Here we see that children of mothers who drank had systematically higher IQs than those of mothers who didn't! But nothing can be inferred from this (except that the Times was not exactly correct).</p>
<p>This of course is the whole problem of carrying out these studies: there are many 'confounding' factors that are associated both with mothers' drinking habits and children's IQ, and this makes teasing out the underlying relationship very tricky.</p>
<p>This is where the ingenious idea of 'Mendelian Randomisation' comes in. Genes are assumed independent of confounding factors, so it is as if women had been randomly allocated to the genetic groups in the Table, and the groups should therefore be balanced for all other factors. The genes were selected as those that regulate uptake of alcohol, and among women who drink they are seen to be associated with the IQ of their offspring: babies who may have been exposed to more alcohol have, on average, lower IQs. Could the genes affect babies in ways other than through alcohol uptake? This is deemed implausible, as no such relationship is seen among non-drinkers. </p>
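<p>The logic can be made concrete with a toy simulation (all numbers invented): a hidden 'social advantage' raises both the chance of drinking and the child's IQ, while an alcohol-uptake genotype, independent of that confounder, modulates the baby's actual exposure among drinkers only:</p>

```python
import random

random.seed(0)

def simulate(n=20_000):
    rows = []
    for _ in range(n):
        u = random.gauss(0, 1)                    # unmeasured social advantage
        drinks = random.random() < (0.75 if u > 0 else 0.15)
        g = random.randint(0, 1)                  # genotype: 0 slow, 1 fast uptake
        exposure = (1 + g) if drinks else 0       # foetal alcohol exposure
        iq = 100 + 5 * u - 2 * exposure + random.gauss(0, 10)
        rows.append((drinks, g, iq))
    return rows

def mean_iq(rows, cond):
    vals = [iq for d, g, iq in rows if cond(d, g)]
    return sum(vals) / len(vals)

rows = simulate()
# Naive comparison is confounded: drinkers' children score HIGHER,
# just as in Table 3 of the paper.
naive = mean_iq(rows, lambda d, g: d) - mean_iq(rows, lambda d, g: not d)
# Genotype contrast among drinkers isolates the (negative) alcohol effect...
mr = mean_iq(rows, lambda d, g: d and g == 1) - mean_iq(rows, lambda d, g: d and g == 0)
# ...while among non-drinkers the genotype makes no difference.
null = mean_iq(rows, lambda d, g: not d and g == 1) - mean_iq(rows, lambda d, g: not d and g == 0)
```

<p>The genotype acts like a randomised treatment assignment: it shifts exposure but, by assumption, nothing else - which is exactly the 'no other pathway' condition the authors check by looking at non-drinkers.</p>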
<p>This is a very clever and careful study, to be taken very seriously. But it does not allow an estimate of the effect of drinking, and the authors are careful not to give one (even if they appear to have been rather happy to declare public health implications, which seems to be somewhat overstepping their role as epidemiologists).</p>
<p>Very crudely, if we took the lowest group as similar to non-drinkers, then the effect of moderate drinking might be estimated as around 2 points, but nobody would want to put this in an academic publication.</p>
<p>In addition, an important issue is the alcohol quantities. The 'drinkers' include those reporting between 1 and 6 units a week - quite a range - and we can also assume that this is an understatement of true consumption. So what 'moderate' drinking actually means is open to some question.</p>
<p>As usual, the NHS Behind the Headlines site had an<a href="http://www.nhs.uk/news/2012/11November/Pages/Just-one-glass-of-wine-a-week-in-pregnancy-damages-childs-IQ.aspx"> excellent discussion.</a></p>
<p><em>Added later: I have made a few edits in this blog to make it clearer that I am not questioning the study's basic conclusions, just pointing out that some of the coverage misunderstood the (somewhat subtle) findings</em></p>
</div></div></div>
Fri, 16 Nov 2012 17:54:04 +0000
david
6681 at http://understandinguncertainty.org
http://understandinguncertainty.org/alcohol-pregnancy-and-iq-children#comments

More lessons from L'Aquila
http://understandinguncertainty.org/more-lessons-laquila
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>The L’Aquila story gets even murkier.</p>
<h2>Scientists' duty</h2>
<p>Additional reports suggest complicity to manipulate public opinion. See, for example, this <a href="http://www.repubblica.it/cronaca/2012/10/25/news/terremoto_aquila_intercettazioni-45259736/?ref=HRER1-1">article in La Repubblica</a> (in Italian) with the headline quote '<em>the truth cannot be said</em>' taken from a tapped telephone call between the head of the Civil Protection Agency and one of the scientists. The article claims that the misleading statement on which the trial hinged - that the many small shocks reduced rather than increased the risk - had already been decided by officials and the scientists were simply part of a 'media operation'. </p>
<p>So, although it sounds a bit obvious, an extra lesson I draw is</p>
<ul><li>Scientific advisors owe a duty to society as a whole, must retain their independence, and should carefully avoid ‘going native’ and becoming complicit in the objectives of the agency that has requested their services.
</li>
</ul><h2>Communicating the chances of low-probability high-impact events</h2>
<p>The trial rests largely on the claim in the press conference that the swarm of small shocks reduced the risks of a large earthquake. Assuming this is not the case, and the risk was in fact increased over the normal levels, it raises the vital issue of communicating risks that are low in absolute terms but high in relative terms. The general literature on risk communication advises against the sole use of relative risks, since these are known to give an exaggerated impression of magnitude (twice ‘very small’ is still generally ‘very small’). Thomas Jordan, director of the Southern California Earthquake Center at the University of Southern California, chairs the International Commission on Earthquake Forecasting (ICEF), which wrote <a href="http://www.annalsofgeophysics.eu/index.php/annals/article/view/5350">a report </a> following the L’Aquila quake. He argues strongly that a time-series of probabilities should be provided in public communication – you can listen to Jordan being interviewed by <a href="http://www.radio3.rai.it/dl/radio3/programmi/puntata/ContentItem-e07a4299-9679-48ea-831b-5d460bc43f79.html">Italian public radio</a> (RAI) and the <a href="http://www.bbc.co.uk/programmes/p00zc0d5"> BBC World Service</a>. So my next lesson is</p>
<ul><li>When communicating the chances of low-probability high-impact events, provide estimates of absolute risks. However these need to be put in context, preferably by relating to levels of risk at other times.</li>
</ul><p>People deserve to know that the risk has increased, even if it is still low in an absolute sense (as it always will be for earthquakes), so that they can apply their own thresholds for caution.</p>
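<p>The difference between the two framings is easy to show with invented magnitudes (these are not L'Aquila estimates):</p>

```python
# Invented magnitudes for illustration only.
usual_weekly_risk = 1 / 200_000      # background chance of a damaging quake
swarm_weekly_risk = 1 / 2_000        # elevated chance during a seismic swarm

relative_increase = swarm_weekly_risk / usual_weekly_risk    # a 100-fold rise
absolute_risk_pct = 100 * swarm_weekly_risk                  # still only 0.05%
```

<p>'100 times more likely' on its own sounds alarming; '1 in 2,000 this week, against a usual 1 in 200,000' carries both the rise and the low absolute level, which is the context the lesson above asks for.</p>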
<h2>Indemnity</h2>
<p>In my previous <a href="http://understandinguncertainty.org/continuing-tragedy-l’aquila">blog</a>, I mentioned the need for indemnity against civil actions. The <a href="http://www.bis.gov.uk/assets/goscience/docs/c/11-1382-code-of-practice-scientific-advisory-committees.pdf">UK Government Chief Scientific Advisor's Code of Practice for Scientific Advice</a> says this should be available to scientific advisors: under "<em>Liabilities and indemnity of members</em>" it says</p>
<blockquote><p>"The Cabinet Office Model Code of Practice for Board Members of Advisory Non-Departmental Public Bodies (page 6) states that: “Legal proceedings by a third party against individual board members of advisory bodies are very exceptional. A board member may be personally liable if he or she makes a fraudulent or negligent statement which results in a loss to a third party; or may commit a breach of confidence under common law or criminal offence under insider dealing legislation, if he or she misuses information gained through their position. However, the Government has indicated that individual board members who have acted honestly, reasonably, in good faith and without negligence will not have to meet out of their own personal resources any personal civil liability which is incurred in execution or purported execution of their board functions. Board members who need further advice should consult the sponsor department.”</p>
<p>This should already be the position for existing advisory NDPBs. For newly established committees and for non-NDPBs, secretariats should liaise with their sponsoring department’s Public Bodies Team or Human Resources Team to ensure that an appropriate indemnity for members is in place."</p></blockquote>
<p>I interpret this as saying that, for a broad range of advisory committees, sponsoring departments should ensure an appropriate indemnity scheme is in place. The code applies very widely:</p>
<blockquote><p>"The Code was developed to apply to advisory committees providing independent scientific advice, regardless of their specific structure and lines of accountability; whether reporting to a Ministerial Department, Non-Ministerial Department or other public body, and whether an advisory NDPB or an expert scientific committee."</p></blockquote>
<h2>Twitter</h2>
<p>I also warned of the dangers of using social media in delicate situations. Subsequently a rather casual tweet of mine found its way onto a <a href="http://edition.cnn.com/2012/10/23/world/europe/italy-quake-scientists-guilty/index.html">CNN News report</a> and into Italian national media, bringing critical comments from Italian colleagues. I should listen to my own advice.</p>
</div></div></div>
Sun, 28 Oct 2012 19:18:45 +0000
david
6653 at http://understandinguncertainty.org
http://understandinguncertainty.org/more-lessons-laquila#comments

The Continuing Tragedy of L'Aquila
http://understandinguncertainty.org/continuing-tragedy-l%E2%80%99aquila
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p> As in ‘<a href="http://www.thesun.co.uk/sol/homepage/irishsun/irishsunnews/4603979/Boffins-jailed-for-not-predicting-killer-earthquake.html ">Boffins jailed for not predicting earthquake</a>’, the 6-year sentences and massive fines handed out to the Italian seismologists have been largely portrayed by the media and commentators outside Italy as an attack on science, and the prosecution ridiculed as expecting the scientists to have been able to predict the earthquake. </p>
<p>However, many have pointed out that it is all a bit more complicated than that. See, for example, <a href="http://www.nature.com/news/2011/110914/full/477264a.html ">a detailed article in Nature</a> and these blogs by <a href="http://rogerpielkejr.blogspot.co.uk/2012/10/mischaracterizations-of-laquila-lawsuit.html?utm_source=dlvr.it&utm_medium=twitter ">Roger Pielke</a> and <a href="http://tremblingearth.wordpress.com/2012/10/23/conviction-of-italian-seismologists-a-nuanced-warning/ ">Austin Elliott</a>. </p>
<p>Briefly, the seismologists appear to have agreed to attend a hasty meeting that had, possibly unknown to them, been set up by local officials with the express intention of playing down local fears. The scientists concluded that ‘they could not be confident there would be an earthquake’, which was subsequently communicated in an informal press conference as ‘confident there would not be an earthquake’, which in the eyes of some locals rendered them culpable after the subsequent events. Essentially, the seismologists appear to have been manipulated by local interests, and are now paying a ludicrous price.</p>
<p>In spite of Sir John Beddington (Government Chief Scientific Advisor) assuring us that this type of prosecution would not happen in the UK, this should be a strong warning to any scientist asked for their opinion about matters of strong public interest, as Willy Aspinall lays out in this excellent <a href="http://www.nature.com/news/2011/110914/pdf/477251a.pdf ">commentary in Nature</a>. </p>
<p>The lessons I am personally trying to learn are - </p>
<p>1. Never to give advice unless I am confident that the findings will be communicated either by myself or a trusted professional source, using a pre-determined plan and appropriate, carefully chosen language that acknowledges uncertainty and neither prematurely reassures nor induces unreasonable concern.</p>
<p>2. Not to engage in informal communication using social media on that issue.</p>
<p>3. To ensure proper indemnity arrangements are in place. Apparently these exist for official government advisors, but in my experience establishing advisors' legal position was not a high priority for the people asking for advice, and indemnity could not be taken for granted when advising agencies such as NHS Trusts (not being an NHS employee myself). Of course, even in the UK one would not be covered against criminal prosecutions such as the one in Italy.</p>
<p>The earthquake threat will always be there in many parts of Italy, and this court case has only added to the woes of the Italian public by distracting attention from lax building standards. And who in Italy will want to choose seismology as a career now?</p>
<p><em>Added as an afterthought</em><br />
There is, of course, a danger of 'defensive science', and an unwillingness to engage with important public issues. But I believe the lessons listed above should be standard professional practice, and do not represent an over-cautious approach. It would be an extra tragedy if L'Aquila led to a general reluctance to provide scientific advice.</p>
</div></div></div>
Wed, 24 Oct 2012 07:42:50 +0000
david
6636 at http://understandinguncertainty.org
http://understandinguncertainty.org/continuing-tragedy-l%E2%80%99aquila#comments

Rats and GM
http://understandinguncertainty.org/rats-and-gm
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>With others, I made some <a href="http://www.sciencemediacentre.org/pages/press_releases/12-09-19_gm_maize_rats_tumours.htm">comments for the press </a>about the recent paper (abstract, figures and tables freely available <a href="http://www.sciencedirect.com/science/article/pii/S0278691512005637 ">here</a>) on cancer in rats fed GM maize and Monsanto's Roundup pesticide.<br />
[ Full paper should also be available <a href="http://research.sustainablefoodtrust.org/wp-content/uploads/2012/09/Final-Paper.pdf">here</a>].</p>
<p>Whatever the truth about GMOs, this is not a great contribution to the debate. The paper is not well written, to say the least, with phrases such as “In females, all treated groups died 2–3 times more than controls, and more rapidly” in the abstract. The Methods section gives a whole lot of detail about some complex secondary method, but nothing on the analysis of the primary outcome data, presumably tumour incidence over time. </p>
<p>If we assume the experiment was carried out appropriately, the crucial flaw was only having 20 control rats, 10 in each group, so that it is (predictably) almost impossible to show statistically significant differences, since the control rats would have been expected to develop tumours too. In fact no formal statistical tests are carried out, and one does not have to do much maths to understand that statements about ‘30% of male control rats’ actually mean ‘3 out of 10’.</p>
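<p>This power problem can be checked directly with a Fisher exact test, written here from scratch using only the standard library; the 6-out-of-10 versus 3-out-of-10 split is an illustrative doubling of incidence, not a figure from the paper:</p>

```python
from math import comb

def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2
    def prob(x):        # hypergeometric probability of x in the top-left cell
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)
    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Sum over all tables (with the same margins) at most as probable
    # as the observed one -- the usual two-sided convention.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Tumours in 6 of 10 treated rats vs 3 of 10 controls: p is about 0.37,
# so even a doubling of incidence is indistinguishable from chance at n=10.
p = fisher_two_sided(6, 4, 3, 7)
```

<p>With groups of 10, the difference would have to be close to 10-versus-0 before this test gave a small p-value, which is exactly why the absence of formal tests in the paper matters so much.</p>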
<p>If you can’t download the full version, the figures and tables are available, so you can see the “survival” plots, which have no labelling of the curves and no statistical comparison. Figure 1 actually shows that the highest-dose male rats seem to have done even better than the controls, but then this difference would not be statistically significant either. The gruesome pictures show only treated rats, but the majority of the 20 control rats got tumours too, as apparently this strain is particularly prone to them.</p>
<p>The <a href="http://www.dailymail.co.uk/sciencetech/article-2205509/Cancer-row-GM-foods-French-study-claims-did-THIS-rats--cause-organ-damage-early-death-humans.html?ITO=1490 ">Daily Mail’s coverage</a> was what you would expect given their long-standing position on 'Franken-foods', misleadingly quoting Michael Antoniou as if he were independent when he was part of the campaigning organisation CRIIGEN (established by the lead author Seralini) that ran the trials and even helped to write the paper. They also claim the paper was “peer reviewed by independent scientists to guarantee the experiments were properly conducted and the results are valid”, when it is clear that in this case the paper never went near a decent statistical reviewer. But this is hardly the Daily Mail's fault.</p>
<p>I am grateful to the authors for publishing this paper, as it provides a fine case study for teaching a statistics class about poor design, analysis and reporting. I shall start using it immediately.</p>
</div></div></div>
Thu, 20 Sep 2012 06:55:13 +0000
david
6488 at http://understandinguncertainty.org
http://understandinguncertainty.org/rats-and-gm#comments