david's blog
http://understandinguncertainty.org/davidsblog
Is prostitution really worth £5.7 billion a year?
http://understandinguncertainty.org/prostitution-really-worth-%C2%A357-billion-year
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>The EU has demanded rapid payment of £1.7 billion from the UK because our economy has done better than predicted, and some of this is due to the prostitution market now being considered as part of our National Accounts and contributing an <a href="http://www.telegraph.co.uk/news/worldnews/europe/eu/11184605/Explainer-Why-must-Britain-pay-1.7bn-to-the-European-Union-and-can-we-stop-it-happening.html">extra £5.3 billion to GDP at 2009 prices</a>, which is 0.35% of GDP, half that of agriculture. But is this a reasonable estimate?</p>
<p>This £5.3 billion figure was assessed by the <a href="http://www.ons.gov.uk/ons/publications/re-reference-tables.html?edition=tcm%3A77-360136">Office for National Statistics in May 2014</a> based on the following assumptions, derived from <a href="http://www.ons.gov.uk/ons/rel/naa1-rd/national-accounts-articles/inclusion-of-illegal-drugs-and-prostitution-in-the-uk-national-accounts/index.html">this analysis</a>. To quote the ONS:</p>
<ul><li>Number of prostitutes in UK: 61,000</li>
<li>Average cost per visit: £67</li>
<li>Clients per prostitute per week: 25</li>
<li>Number of weeks worked per year: 52</li>
</ul><p>Multiply these up and you get £5.3 billion at 2009 prices, around £5.7 billion now.</p>
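<p>As a quick check on the arithmetic, here is a minimal sketch that simply multiplies the four ONS assumptions quoted above. The variable names are my own, and the output is at 2009 prices.</p>

```python
# ONS assumptions as quoted in the list above (2009 prices).
prostitutes = 61_000
cost_per_visit = 67        # pounds
clients_per_week = 25
weeks_per_year = 52

# Derived quantities: everything follows by multiplication.
clients_per_year = clients_per_week * weeks_per_year
visits_per_year = prostitutes * clients_per_year
total_revenue = visits_per_year * cost_per_visit
turnover_per_worker = clients_per_year * cost_per_visit

print(f"Clients per prostitute per year: {clients_per_year:,}")
print(f"Total visits per year: {visits_per_year:,}")
print(f"Total revenue: £{total_revenue / 1e9:.1f} billion")
print(f"Turnover per prostitute: £{turnover_per_worker:,}")
```

<p>The same four numbers also fix the implied workload and turnover per person, which is why questioning any one assumption changes the headline figure proportionally.</p>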
<p>This assessment has been severely questioned. Dr Brooke Magnanti, aka Belle de Jour, reckoned <a href="http://www.telegraph.co.uk/women/sex/10864898/Prostitution-adds-5bn-a-year-to-UK-economy.-Are-you-having-a-laugh.html">it might be ten times too high</a>. In contrast, others have said <a href="http://blog.import.io/?author=52e28dc1e4b0377cec9ac9d0">it should be £9 billion as it ignores male prostitution</a>. Jolyon on <a href="http://www.taxrelief4escorts.co.uk/2014/06/01/does-prostitution-really-contribute-5-3bn-to-uk-gdp/">Tax Relief 4 Escorts</a>, who claims a maths degree from Cambridge, has done a detailed critique. He points out the flaws in the survey on which the 61,000 is based, and claims the assumed workload is too high and that the cost per visit (which the ONS based on <a href="http://www.punternet.com/index.php">PunterNet</a>) seems too low: it is somewhat ironic that the ONS use an information source that a previous minister, Harriet Harman, <a href="http://www.independent.co.uk/news/uk/home-news/punter-net-prostitutes-thank-harriet-harman-for-publicity-boost-1796759.html">tried to shut down</a>.</p>
<p>My feeling is that the assumption that has the most problems is the workload. ONS are suggesting that the average person who works in prostitution has around 1,300 clients a year (25 a week for 52 weeks). This is based on Dutch experience, whereas the pattern of working in the UK is likely to be very different, with a complex industry comprising street-walkers, escorts, the informal market, those who work from fixed premises and 'independents' who advertise, for example, on <a href="http://www.adultwork.com">AdultWork</a>. Many are part-time. </p>
<p>As always, it's best to do a simple reality check. The ONS assumptions come to nearly 80,000,000 visits a year (61,000 × 25 × 52). Let's say 60,000,000 are from locals rather than foreign visitors, which is more than a million a week. There are around 20,000,000 men between 18 and 65 in the UK (taking an arbitrary upper limit), so this would mean that on average each of them buys sex three times a year. In fact the latest Natsal survey found that <a href="http://www.thelancet.com/journals/lancet/article/PIIS0140673613620358/table?tableid=tbl2&tableidtype=table_id&sectionType=red">around 4% of men between 18 and 65 reported paying for sex in the last 5 years</a>, which is about 800,000 men. If there were really more than a million visits a week, then the average man who paid for sex at any time in the last 5 years did so considerably more often than once a week. In fact the proportion who pay for sex each year will probably be less than 2%, which means that fewer than 400,000 men are taking up over a million visits each week - that's around once every 3 days for each of the 400,000. I am no expert on the behaviour of this subgroup, but this does seem rather high, to say the least: a <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2563847/">study of men who pay for sex in Scotland</a> found a mean of only 5 partners in a year.</p>
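<p>The reality check can be sketched in a few lines. The inputs are the round numbers used in the paragraph above (60 million local visits, 20 million men, 4% and 2% paying), and the function name is only illustrative.</p>

```python
def visits_per_buyer_per_week(visits_per_year, men, share_paying):
    """Average weekly visits for each man who pays, given the share who pay."""
    buyers = men * share_paying
    return (visits_per_year / 52) / buyers

local_visits_per_year = 60_000_000
men_18_to_65 = 20_000_000

weekly_visits = local_visits_per_year / 52
visits_per_man_per_year = local_visits_per_year / men_18_to_65

# 4% paid in the last 5 years; assume at most 2% pay in any one year.
rate_if_4_percent = visits_per_buyer_per_week(local_visits_per_year, men_18_to_65, 0.04)
rate_if_2_percent = visits_per_buyer_per_week(local_visits_per_year, men_18_to_65, 0.02)

print(f"Visits per week: {weekly_visits:,.0f}")
print(f"Visits per man per year (all men): {visits_per_man_per_year:.1f}")
print(f"Visits per buyer per week if 4% pay: {rate_if_4_percent:.1f}")
print(f"Visits per buyer per week if 2% pay: {rate_if_2_percent:.1f}")
print(f"Days between visits if 2% pay: {7 / rate_if_2_percent:.1f}")
```

<p>Whatever the exact share of men who pay, the implied frequency per buyer is far above the mean of 5 partners a year found in the Scottish study.</p>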
<p>The assumptions also mean that the average person working in prostitution is turning over nearly £90,000 a year (1,300 clients at £67 each), which Jolyon from Tax Relief 4 Escorts says is completely implausible, and he should know.</p>
<p>Although this is a big statistical challenge, such an important contribution to the economy deserves a more robust analysis. When better figures come out I predict the UK will be due a substantial rebate. But that won't help David Cameron now.</p>
<p><em>27th October: Some figures have been revised since first posting, but the gist stays the same.</em></p>
</div></div></div>
Sat, 25 Oct 2014 16:20:40 +0000, david
Why 'life expectancy' is a misleading summary of survival
http://understandinguncertainty.org/why-life-expectancy-misleading-summary-survival
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>It's well-known how misleading it can be to use average (mean) as a summary measure of income: the distribution is very skew, and a few very rich people can hopelessly distort the mean. So median (the value halfway along the distribution) income is generally used, and this might fairly be described as the income of an <em>average person</em>, rather than the <em>average income</em>. </p>
<p>But, like everyone else dealing with actuarial statistics, I use life expectancy (the mean number of future years) to communicate someone's survival prospects. And yet, just as for income, it is also a poor measure due to the skewness of the distribution of survival.</p>
<p>This can be clearly shown by looking at the <a href="http://www.ons.gov.uk/ons/publications/re-reference-tables.html?edition=tcm%3A77-325699 ">life tables published by the Office for National Statistics (ONS) </a>: these have a convenient column labeled $d_x$, which is the probability density for survival, expressed as the expected number of deaths at each age out of 100,000 births, assuming the current mortality rates continue. The density plots for women and men are shown below, using the life tables for 2010-2012. The distributions have a small peak for babies dying in the first year of life, and then a long left-tail for early deaths, and then a sharp peak and a rapid fall up to age 100. The ‘compression’ of mortality is clear. </p>
<div class="captionCentre"><img src="/sites/understandinguncertainty.org/files/density-female.png" width="566" height="342" alt="density-female.png" /><p class="caption">Numbers of women expected to die at each age, out of 100,000 born, assuming mortality rates stay the same as 2010-2012. The expectation is 83, median 86, the most likely value (mode) is 90. </p>
</div>
<div class="captionCentre"><img src="/sites/understandinguncertainty.org/files/density-men.png" width="566" height="342" alt="density-men.png" /><br /><p class="caption">Numbers of men expected to die at each age, out of 100,000 born, assuming mortality rates stay the same as 2010-2012. The expectation is 79, median 82, the most likely value (mode) is 86.</p>
</div>
<p>Left-skewed distributions are rather unusual, but raise similar issues to any skew distribution - the mean, median and mode can be very different. For these survival distributions it is perhaps remarkable how far the mode is from the mean: for girls born now, even assuming there are no more increases in survival, their most likely age to die is 90, seven years more than the mean of 83. For little baby boys the mode is 86, again seven years more than the mean of 79. And even the median is 3 years more than the mean. That's why I now believe that 'life expectancy' is misleading.</p>
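<p>Here is a rough sketch of how the three summaries can be read off a $d_x$-style column. The counts below are made-up illustrative numbers in coarse age steps, chosen only to have a long left tail; they are not the ONS life table, and the function name is my own.</p>

```python
def summarise(ages, deaths):
    """Return (mean, median, mode) of age at death from counts at each age."""
    total = sum(deaths)
    mean = sum(a * d for a, d in zip(ages, deaths)) / total
    # Median: first age at which the running total reaches half of all deaths.
    running = 0
    median = None
    for a, d in zip(ages, deaths):
        running += d
        if running >= total / 2:
            median = a
            break
    # Mode: the age with the largest number of deaths.
    mode = max(zip(deaths, ages))[1]
    return mean, median, mode

# Hypothetical toy counts out of 100,000 births (NOT real life-table data).
toy_ages = [0, 60, 65, 70, 75, 80, 85, 90, 95, 100]
toy_deaths = [1000, 4000, 5000, 8000, 12000, 17000, 20000, 22000, 9000, 2000]

mean, median, mode = summarise(toy_ages, toy_deaths)
print(f"mean {mean:.1f}, median {median}, mode {mode}")
```

<p>Even in this toy example the early deaths drag the mean below the median, which in turn sits below the mode, the same ordering as in the real tables.</p>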
<p>Of course these ‘period life tables’ unrealistically assume mortality will stay the same in the future, whereas life expectancy has been growing at around 3 months a year for decades, corresponding to the annual risk of death reducing at about 2% per year. The ONS also provide <a href="http://www.ons.gov.uk/ons/rel/lifetables/historic-and-projected-data-from-the-period-and-cohort-life-tables/2012-based/stb-2012-based.html ">‘cohort life tables’</a> that make various projections about whether these trends will continue in the future: the ‘central projection’ says girls born now have a life expectancy of 94, with (according to my rough calculations) a median and mode of around 100, and boys have a life expectancy of 91, with a median and mode of around 96. Under the ‘high’ projections, with the possibly implausible assumption that the increases continue at the same rate in the future, children born today will on average live more than 100 years. Good luck to them - heaven knows how long they will have to work for.</p>
</div></div></div>
Mon, 22 Sep 2014 19:14:52 +0000, david
Using expected frequencies when teaching probability
http://understandinguncertainty.org/using-expected-frequencies-when-teaching-probability
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>The July 2014 <a href="https://www.gov.uk/government/publications/national-curriculum-in-england-mathematics-programmes-of-study">Mathematics Programmes of Study: Key Stage 4 </a>(GCSE) specifies under <em>Probability</em> </p>
<blockquote><p><strong>{calculate and interpret conditional probabilities through representation using expected frequencies with two-way tables, tree diagrams and Venn diagrams}.<br /></strong></p></blockquote>
<p>- the brackets and bold type mean that this comes under <em>additional mathematical content to be taught to more highly attaining pupils</em>. </p>
<p>The use of the term ‘expected frequencies’ is novel and not widely known in mathematics education. The basic idea is very simple: instead of saying “<em>the probability of X is 0.20 (or 20%)</em>”, we would say “<em>out of 100 situations like this, we would expect X to occur 20 times</em>”.</p>
<p>‘<em>Is that all?</em>’ I hear you cry. But this simple re-expression can have a deep impact. The idea is strongly based on research in risk communication, in particular the work of Gerd Gigerenzer and others who use the term ‘natural frequencies’. Extensive research (see selected references at the bottom) has shown this representation can prevent confusion and make probability calculations easier and more intuitive.</p>
<p>The first point is that it helps clarify what the probability means. When we hear the phrase ‘<em>the probability it will rain tomorrow is 30%</em>’, what do we mean? That it will rain 30% of the time? Over 30% of the area? In fact it means that out of 100 such computer forecasts, we can expect it to rain after 30 of them. By clearly stating what the ‘denominator’ is, ambiguity is avoided. It has been shown that by using expected frequencies, people find it easier to carry out non-intuitive conditional probability calculations.</p>
<p>Expected frequency is the standard format taught to medical students for risk communication, and is used extensively in public dialogue. Examples include the QRISK program and the current leaflets for breast cancer screening.</p>
<div class="captionCentre">
<img src="/sites/understandinguncertainty.org/files/qrisk.png" width="507" height="261" alt="qrisk.png" /><p class="caption">Output from the QRISK program using expected frequencies – the most widely used tool in general practice for assessing and communicating cardiovascular risk
</p>
</div>
<div class="captionCentre">
<img src="/sites/understandinguncertainty.org/files/breast-screening.png" width="334" height="402" alt="breast-screening.png" /><p class="caption">An image from the current breast screening information leaflet from the NHS Screening Programme, showing the use of expected frequencies to communicate the chances of different events subsequent to a mammogram
</p>
</div>
<p>In teaching probability, expected frequencies can be used in their own right, or as a tool for doing more complex probability calculations. Perhaps the ideal representation is using ‘icon arrays’, as in the QRISK example, but these cannot be drawn by students and are inappropriate for small probabilities. Therefore tree representations are appropriate, although as noted in the Programme of Study, two-way tables and Venn diagrams can also be used and will be illustrated below. They can be introduced gradually, possibly using the framework shown below, in which some sample questions and a few solutions are provided.</p>
<h3>1. Basic probability. </h3>
<p>This is essentially a one-level tree. Questions can involve going either from probabilities (expressed as decimals, fractions and %’s) to expected frequencies, or vice versa. The problems can be drawn as either expected frequency or probability trees, as shown for the following questions. The actual questions could be provided in different ways, for example with some entries in a tree provided and the student asked to complete the tree.</p>
<p><em>Going from probability to expected frequency<br /></em></p>
<ul><li>Some balanced dice have probability 1/6 of coming up ‘4’. Out of 60 throws, how many ‘4’s would we expect to come up? </li>
</ul><div class="captionCentre">
<img src="/sites/understandinguncertainty.org/files/dice.png" width="605" height="230" alt="dice.png" /><p class="caption">Probability and frequency trees for dice </p>
</div>
<ul><li>80% of the school students can roll their tongues. If I pick 1000 students at random, how many do you expect will NOT be able to roll their tongues? </li>
</ul><div class="captionCentre">
<img src="/sites/understandinguncertainty.org/files/tongue.png" width="605" height="204" alt="tongue.png" /><p class="caption">Equivalent probability and frequency trees for tongue-rolling</p>
</div>
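<p>The two questions above amount to multiplying a probability by a number of cases. A minimal sketch, using exact fractions to avoid rounding (the function name is just for illustration):</p>

```python
from fractions import Fraction

def expected_frequency(probability, cases):
    """Expected number of occurrences out of `cases` similar situations."""
    return probability * cases

# Dice: probability 1/6 of a '4', out of 60 throws.
fours = expected_frequency(Fraction(1, 6), 60)

# Tongue-rolling: 80% can, so 20% cannot, out of 1000 students.
cannot_roll = expected_frequency(1 - Fraction(80, 100), 1000)

print(f"Expected number of 4s in 60 throws: {fours}")
print(f"Expected non-rollers in 1000 students: {cannot_roll}")
```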
<ul><li>There is a 0.02 probability of winning some prize with a National Lottery ticket. If I buy a ticket a week for a year, about how many winning tickets do I expect to get?</li>
<li>A doctor tells your uncle he has a 15% chance of a heart attack in the next 10 years. Out of 100 men like your uncle, how many would you expect to have a heart attack in the next 10 years?</li>
</ul><p><em>Going from expected frequencies to probabilities.<br /></em></p>
<p>In this case we need to make clear that a single case is representative of a group. </p>
<ul><li>In Dumpsville, in past years it has typically rained on 6 days in June (which has 30 days). Assuming the climate has not changed, if I plan to visit Dumpsville next June, what is the probability the day will be dry?</li>
<li>Experience has shown that out of every 100 racing cyclists, 20 will have been doping. If I pick a cyclist at random, what is the probability that he will be ‘clean’ (not doping)?</li>
<li>In a typical school with 80 Year 10 students, 64 of them will have a profile on the social media site Face-ache. What is the probability that if we pick a Year 10 student at random, they will not have a profile?</li>
</ul><h3>2. Comparisons of probabilities.<br /></h3>
<p>This involves comparison of two different situations, and can be represented using a pair of trees. It is ideal for dealing with challenging and realistic questions concerning relative and absolute risks. </p>
<p><em>Probabilities to expected frequencies<br /></em></p>
<ul><li>If I buy a ticket in Super Lottery, there is a 1% chance of winning something, while a ticket in the Duper Lottery has a 3% chance of winning a prize. If I intend to buy 100 tickets, how many more times will I win if I buy Duper tickets rather than Super tickets?</li>
<li>A newspaper headline says that eating radishes doubles your chance of getting Smith’s Disease. 1% of people who don’t eat radishes get Smith’s Disease anyway.
<ul><li>Out of 200 people not eating radishes, how many would I expect to get Smith’s disease? </li>
<li>Out of 200 people eating radishes, how many would I expect to get Smith’s disease? </li>
<li>How many people have to eat radishes, in order to get one extra case of Smith’s disease?</li>
</ul></li>
</ul><div class="captionCentre">
<img src="/sites/understandinguncertainty.org/files/radishes.png" width="605" height="443" alt="radishes.png" /><p class="caption">Probability and expected frequency trees for people who eat and do not eat radishes
</p>
</div>
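<p>The radish question contrasts relative and absolute risk. A sketch of the arithmetic, with variable names of my own choosing:</p>

```python
from fractions import Fraction

baseline_risk = Fraction(1, 100)   # 1% of non-eaters get Smith's Disease
relative_risk = 2                  # the headline: radishes 'double' the chance
group_size = 200

cases_without_radishes = baseline_risk * group_size
cases_with_radishes = baseline_risk * relative_risk * group_size

# Absolute risk increase per person, and how many must eat radishes
# for one extra case to be expected.
extra_risk = baseline_risk * relative_risk - baseline_risk
number_needed_for_one_extra_case = 1 / extra_risk

print(f"Cases in 200 non-eaters: {cases_without_radishes}")
print(f"Cases in 200 eaters: {cases_with_radishes}")
print(f"People eating radishes per extra case: {number_needed_for_one_extra_case}")
```

<p>A 'doubled' risk sounds dramatic, but here it corresponds to one extra case for every 100 people eating radishes.</p>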
<p><em>Expected frequencies to probabilities<br /></em></p>
<ul><li>Typically it rains on 6 days in June (30 days). I am told that in September there is double the chance of raining on any day. What is the chance that it will rain on a random day in September? </li>
</ul><h3>3. Conditional and marginal probabilities. </h3>
<p>This requires two-level trees, and can also bring in two-way tables and Venn diagrams. First, give the conditional probabilities and set up the expected frequency tree; then calculate the marginal expected frequencies and convert back to probabilities if wanted.</p>
<ul><li>
A weather forecast is generally right. When it forecasts ‘rain’, 90% of the time it rains. When it forecasts ‘no rain’, 70% of the time it does not rain. In a typical September they forecast rain on two-thirds of days and no rain on one-third of days.
<ul><li> How many days would you expect it to rain each September? </li>
<li>What is the probability that a random day in September is not rainy? </li>
</ul></li>
</ul><div class="captionCentre">
<img src="/sites/understandinguncertainty.org/files/rain-tree.png" width="605" height="361" alt="rain-tree.png" /><p class="caption">Probability and expected frequency trees for forecasting rain
</p>
</div>
<p>From the expected frequency tree, we expect it to rain on a total of 18+3=21 days in September, and not rain in 9. So the probability that a random day in September is not rainy is 9/30 = 0.3.</p>
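<p>The expected frequency tree for the forecasting question can be built up step by step. This is only a sketch of the arithmetic, with illustrative variable names, using exact fractions:</p>

```python
from fractions import Fraction

days = 30  # days in September

# First level of the tree: what the forecast says.
forecast_rain = days * Fraction(2, 3)
forecast_dry = days * Fraction(1, 3)

# Second level: what actually happens, given the forecast.
rain_given_forecast_rain = Fraction(90, 100)
dry_given_forecast_dry = Fraction(70, 100)

two_way_table = {
    ("forecast rain", "rain"): forecast_rain * rain_given_forecast_rain,
    ("forecast rain", "dry"): forecast_rain * (1 - rain_given_forecast_rain),
    ("forecast dry", "rain"): forecast_dry * (1 - dry_given_forecast_dry),
    ("forecast dry", "dry"): forecast_dry * dry_given_forecast_dry,
}

# Marginal expected frequencies: add up over the forecasts.
rainy_days = sum(n for (_, weather), n in two_way_table.items() if weather == "rain")
dry_days = sum(n for (_, weather), n in two_way_table.items() if weather == "dry")
probability_dry = dry_days / days

print(f"Expected rainy days: {rainy_days}, dry days: {dry_days}")
print(f"Probability a random day is dry: {probability_dry}")
```

<p>The dictionary holds the same four numbers as the leaves of the tree, so it doubles as the two-way table.</p>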
<p>To get this result directly from the probabilities is not straightforward. </p>
<p>We can also represent the expected frequencies as a two-way table or a Venn diagram. </p>
<p> <img src="/sites/understandinguncertainty.org/files/rain-table.png" width="538" height="181" alt="rain-table.png" /></p>
<p><img src="/sites/understandinguncertainty.org/files/rain-square.png" width="479" height="389" alt="rain-square.png" /></p>
<p><img src="/sites/understandinguncertainty.org/files/rain-venn.png" width="376" height="312" alt="rain-venn.png" /></p>
<ul><li> A fair coin is flipped to decide whether your cricket team is going to bat first or second – heads you bat first, tails you bat second. If you bat first, your team wins 80% of the time. If you bat second, you win 50% of the time.
<ul><li>Out of 100 games, how many do you bat first in?</li>
<li>Out of 100 games, how many do you bat first, and then win?</li>
<li>Out of 100 games, how many do you win?</li>
<li>Before you flip the coin, what is the probability of you winning the game?</li>
</ul></li>
<li>100 students are suspected of cheating in an exam. They are wired up to a lie detector that will go ping! if it thinks you are lying. The people who make the detector claim that, if you are lying, there is a 90% chance the machine will go ping! If you are genuinely not lying, there is a 10% chance the machine will get it wrong and go ping! Suppose 10 of the students have really been cheating. For how many students will the machine go ping!?
</li>
</ul><h3>4. Inverse probabilities. </h3>
<p>This is where things can get a bit tricky, but using expected frequency representations allows students to tackle some of the classic non-intuitive probability problems – essentially Bayes' theorem. If they can do these, they have learnt a subtle and valuable skill. </p>
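<p>As a sketch of the kind of calculation involved, take the lie detector and weather forecast set-ups from the previous section. The 'inverse' probability is read off by counting within the group of interest. The variable names here are only illustrative.</p>

```python
from fractions import Fraction

# Lie detector: 100 students, 10 of whom really cheated.
students = 100
cheats = 10
honest = students - cheats

pings_from_cheats = cheats * Fraction(90, 100)   # correctly detected
pings_from_honest = honest * Fraction(10, 100)   # false alarms
total_pings = pings_from_cheats + pings_from_honest

# Of the students for whom the machine pings, what share cheated?
p_cheat_given_ping = pings_from_cheats / total_pings

# Weather: 21 expected rainy days, of which 18 had rain forecast.
p_forecast_given_rain = Fraction(18, 21)

print(f"Expected pings: {total_pings}")
print(f"P(cheated | ping) = {p_cheat_given_ping}")
print(f"P(rain was forecast | rain) = {p_forecast_given_rain}")
```

<p>Even with a detector that is '90% accurate', only half of those it flags have cheated, because honest students greatly outnumber cheats.</p>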
<ul><li>Weather forecasting: of the times it rains, what proportion did the forecast get it right?</li>
<li>
<ul><li>It rains 21 times, and in 18 the rain was forecast, so the proportion is 18/21 = 6/7: i.e. when it rains, there is 6/7 chance that the rain was forecast. Try doing that without using expected frequencies!!! Alternatively this is straightforward to read off the two-way table.</li>
</ul></li>
<li>Cricket: of the times you win your match, what proportion did you bat first?</li>
<li>Lie detector question – what is the chance, if the machine goes ‘ping!’, that the suspect has been cheating?</li>
</ul><h3>5. Using frequencies when teaching probability.<br /></h3>
<p>This is outlined by Jenny Gage and myself in our <a href="http://nrich.maths.org/probability">NRich materials</a>, and in <a href="http://nrich.maths.org/content/id/9887/Gage,2012_ICME12.pdf">this paper</a>. The picture below shows part of the process of generating a two-way table by combining events represented by coloured bricks. From these empirical frequency distributions it is straightforward to go to expected frequency distributions, and hence to probabilities, using the process outlined above.</p>
<div class="captionCentre">
<img src="/sites/understandinguncertainty.org/files/cubes.jpg" width="314" height="234" alt="cubes.jpg" /><p class="caption">Results of experiments in which joint events are represented by pairs of coloured bricks</p>
</div>
<p>Using pairs of bricks to represent joint events: these can then be arranged as a two-way table, as above, or as a frequency tree.</p>
<p>Additional resources: </p>
<p>NRich materials<br /><a href="http://nrich.maths.org/probability">http://nrich.maths.org/probability</a></p>
<p>Jenny Gage paper at 12th ICME<br /><a href="http://nrich.maths.org/content/id/9887/Gage,2012_ICME12.pdf">http://nrich.maths.org/content/id/9887/Gage,2012_ICME12.pdf</a></p>
<p>Angela Fagerlin, Brian J. Zikmund-Fisher and Peter A. Ubel, Helping Patients Decide: Ten Steps to Better Risk Communication<br /><a href="http://jnci.oxfordjournals.org/content/103/19/1436.full.pdf+html">http://jnci.oxfordjournals.org/content/103/19/1436.full.pdf+html</a></p>
<p>Kurz-Milcke, E., Gigerenzer, G., & Martignon, L. (2008). Transparency in risk<br />
communication: Graphical and analog tools. Annals of the New York Academy of<br />
Sciences, 1128, 18–28.<br /><a href="http://library.mpib-berlin.mpg.de/ft/ek/EK_Transparency_2008.pdf">http://library.mpib-berlin.mpg.de/ft/ek/EK_Transparency_2008.pdf</a></p>
<p>Gigerenzer, G., & Hoffrage, U. (1995). How to Improve Bayesian Reasoning Without Instruction: Frequency Formats. Psychological Review, 102(4), 684-704.<br /><a href="http://library.mpib-berlin.mpg.de/ft/gg/GG_How_1995.pdf">http://library.mpib-berlin.mpg.de/ft/gg/GG_How_1995.pdf</a></p>
<p>Gigerenzer, G., Gaissmaier, W., Kurz-Milcke, E., Schwartz, L. M., & Woloshin, S. (2007). Helping doctors and patients make sense of health statistics. Psychological science in the public interest, 8(2), 53-96.<br /><a href="http://www.psychologicalscience.org/journals/pspi/pspi_8_2_article.pdf">http://www.psychologicalscience.org/journals/pspi/pspi_8_2_article.pdf</a></p>
<p>Use of natural frequencies and frequency trees in modern health communication – breast cancer screening leaflets<br /><a href="http://www.cancerscreening.nhs.uk/breastscreen/publications/ia-02.html">http://www.cancerscreening.nhs.uk/breastscreen/publications/ia-02.html</a></p>
</div></div></div>
Sat, 13 Sep 2014 10:55:34 +0000, david
Another tragic cluster - but how surprised should we be?
http://understandinguncertainty.org/another-tragic-cluster-how-surprised-should-we-be
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>Sadly another passenger plane crashed yesterday - the third in 8 days: the Air Algerie flight on July 24th, the TransAsia flight in Taiwan on July 23rd, and the Malaysian Airlines flight in Ukraine on July 17th. Does this mean that flying is becoming more dangerous and we should keep off planes? The following analysis may appear cold-hearted, but is not intended to diminish the impact of this tragic loss on the people and families involved.</p>
<p>The <a href="http://www.planecrashinfo.com">Plane Crash Info</a> website contains the summaries of these three accidents - this site makes powerful reading and is not for those with a fear of flying. Their <a href="http://www.planecrashinfo.com/cause.htm">Statistics</a> page is full of useful information, including a graph showing a clear decline in the rate of accidents over the last 40 years: the 9/11 events in 2001 do not even make a blip in the graph.</p>
<p>However, it shows that flying can still carry some danger. 91 commercial flights containing 18 or more passengers have crashed in the previous 10 years (2004 to 2013), a rate of one every 40 days on average. So how surprising is it that 3 should happen in a space of 8 days?</p>
<p>A similar question was asked last November, when 6 cyclists were killed in London over 2 weeks, and Jody Aberdein and I <a href="http://onlinelibrary.wiley.com/doi/10.1111/j.1740-9713.2013.00715.x/abstract">wrote a paper</a> on this: the methods are explained <a href="http://understandinguncertainty.org/when-cluster-real-cluster">here</a>. We can apply the same ideas to the 'cluster' of plane crashes, although of course this analysis is rather simplistic and ignores the undoubted variation in risk when flying in different parts of the world.</p>
<p>Consider any window of 8 days. If planes crash in an entirely unpredictable way at a rate of 91 over 10 years (3650 days), then we would expect 8 * 91/3650 = 0.2 crashes in any particular 8-day window. So assuming a Poisson distribution, the chance of at least 3 crashes in an 8-day window is around 1 in 1000 - very small indeed. So it is very surprising that there would be 3 or more crashes between July 17th and July 25th 2014.</p>
<p>But this is not the right question to ask. We should be concerned with whether such a 'cluster' is surprising over some period, say 10 years. In 10 years there are 456 non-overlapping 'windows' of 8 days, and the chance that <em>at least one</em> of these contains at least 3 crashes = 1 - the chance that <em>none</em> of them has at least three crashes = 1 - 0.999^456 = 0.41 (without rounding). And the more complex 'scan-statistic' adjustment, which allows for sliding rather than non-overlapping windows, puts this chance up to 0.59.</p>
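<p>The two steps of this calculation can be sketched as follows, assuming crashes follow a Poisson process at the 10-year average rate. This covers only the simple non-overlapping-window version; it does not attempt the scan-statistic adjustment, and the function name is my own.</p>

```python
import math

def poisson_tail(lam, k):
    """P(X >= k) for X ~ Poisson(lam), summed directly from the probabilities."""
    below = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return 1.0 - below

crashes = 91      # crashes in 2004 to 2013
days = 3650       # 10 years
window = 8        # days in each window

# Expected crashes in one window, and the chance of 3 or more in that window.
lam = window * crashes / days
p_window = poisson_tail(lam, 3)

# Chance that at least one of the non-overlapping windows has 3 or more.
n_windows = days // window
p_any = 1.0 - (1.0 - p_window) ** n_windows

print(f"Expected crashes per window: {lam:.2f}")
print(f"P(3 or more in a given window): {p_window:.5f}")
print(f"Number of windows: {n_windows}")
print(f"P(some window has 3 or more): {p_any:.2f}")
```

<p>Using the unrounded single-window probability matters: rounding it to exactly 1 in 1000 before raising to the power 456 would give a noticeably lower answer.</p>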
<p>So there is around a 6 in 10 chance that we should see such a large cluster over a 10-year period. In fact, as the graph below shows, the most likely maximum number of crashes of commercial planes with over 18 passengers in any 8-day window over 10 years is exactly ..... 3.<br /><img src="/sites/understandinguncertainty.org/files/plane-crash.png" width="673" height="498" alt="plane-crash.png" /></p>
<p>It is difficult to know how to interpret this - our emotions are rightly influenced by the awful nature of these events and the suffering they have caused. But personally, I hope it will make me no more nervous about flying than I am at the moment (and I have to admit I am not that keen to start with). </p>
<p>[Edit 10.02 July 25th: I had initially stated the adjusted probability stayed at 0.41: on checking the code I realised it changed to 0.59]</p>
</div></div></div>
Fri, 25 Jul 2014 06:01:02 +0000, david
Using metrics to assess research quality
http://understandinguncertainty.org/using-metrics-assess-research-quality
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>The Higher Education Funding Council for England (HEFCE) is carrying out an <a href="http://www.hefce.ac.uk/whatwedo/rsrch/howfundr/metrics/">independent review of the role of metrics in research assessment,</a> and is inviting views. I have submitted a (very personal) response, using HEFCE's suggested headings, which is given below in a minimally-edited version.</p>
<p>++++++++++++++++++++++++++++++</p>
<p>You will be getting a lot of detailed reasoned arguments about this topic, so I thought I would provide a more personal perspective from someone who has done very well out of metrics.</p>
<h4><strong>Identifying useful metrics for research assessment:</strong><br /></h4>
<p>I am a statistician, and so I love metrics. I follow <a href="http://scholar.google.co.uk/citations?user=oz7MFu0AAAAJ&hl=en ">my Google Scholar profile</a> with interest. By any metric, I have been extremely successful. On Google Scholar I have 64000 citations, h-index of 85, and on Web of Science I have 32000 citations, h-index of 63. I follow Altmetrics, and have over 8000 followers on Twitter. All this has done me very well in my career - I have more letters after my name than you can shake a stick at.</p>
<p>Nevertheless I am strongly against the suggestion that peer–review can in any way be replaced by bibliometrics. </p>
<h4><strong>How should metrics be used in research assessment?</strong><br /></h4>
<p>My own experience shows some of the problems. My highest-cited paper clocks in at over 14,000, and yet it has roughly 150 authors and to be honest I have forgotten what, if anything, I contributed. How would these citations be shared out? Or the WinBUGS User manuals for software: around 5000 citations that do not even appear in WoS. Looking at my own record, I can see a correlation between metrics and the quality and importance of the work, but it is not large enough to use to replace judgement.</p>
<p>Clearly metrics should be collected and should be available to peers making judgements about the quality of research work. However they are only ‘indicators’, and not direct ‘measures’ of quality.</p>
<h4><strong>‘Gaming’ and strategic use of metrics</strong>:<br /></h4>
<p>I have done very well out of metrics, and although this is not because of deliberate gaming, I can see that my particular approach to research has paid off. I have tended to go for attractive and novel, even ‘sexy’ areas of statistics (believe it or not, such things do exist). I have got into a field early, not necessarily doing the best work, but reaping citation benefits later, mainly from people who have never read the original paper.</p>
<p>I have spent much of my career working on performance indicators in health and education, where it is finally being recognised that a past move towards apparently ‘simpler’ metrics was accompanied by massive gaming and distortions of practice. The Mid-Staffs scandal could be said to have directly arisen due to an obsession with a few indicators, at the cost of reduced attention to the whole system: fortunately judgements about hospitals have now moved away from a few targets and indicators to a more holistic system. </p>
<p>There has been a disastrous confusion between ‘indicators’ and ‘measures’, and it would be a retrograde step to see this being played out in research assessment. </p>
<h4><strong>Making comparisons:</strong><br /></h4>
<p>The difficulty with making comparisons is illustrated by the Google Scholar listing for researchers under <a href="http://scholar.google.co.uk/citations?view_op=search_authors&hl=en&mauthors=label:statistics">‘Statistics’. </a></p>
<p>I am currently lying 9th in the world, although I am fully aware that some people such as David Cox or Martin Bland do not feature. It is interesting to look at the top scorers – these include people who come from areas that I would not consider ‘statistics’, e.g. particle physics, or who write tutorial articles for doctors, or who have published in boundary areas such as machine learning. No doubt all these authors are excellent (although I am unsure about the individual who seems to have other people’s publications included under their own name), but this shows the problems of delineating a ‘subject’ in an automatic way. </p>
<p>To summarise, I feel that metrics should definitely be collected, but only used as additional evidence in a professional judgement as to the quality of research output.</p>
</div></div></div>
Tue, 24 Jun 2014 10:23:25 +0000 david 7644 at http://understandinguncertainty.org http://understandinguncertainty.org/using-metrics-assess-research-quality#comments
Numbers and the common-sense bypass
http://understandinguncertainty.org/numbers-and-common-sense-bypass
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>Yesterday the <a href="http://www.thesundaytimes.co.uk/sto/news/uk_news/Science/article1422837.ece?CMP=OTH-gnws-standard-2014_06_14">Sunday Times [paywall]</a> covered a talk Anne Johnson and I had given at the Cheltenham Science Festival about the statistics of sex, and the article said </p>
<blockquote><p>more people are having sex in their teens, roughly 30% before the age of 16.</p></blockquote>
<p>Let’s leave aside whether this is an accurate statistic or not, and simply look at what happened when the Daily Mail lifted this material into <a href="http://www.dailymail.co.uk/femail/article-2658291/Amorous-Brits-make-love-2-500-times-MINUTE-includes-time-sleep.html?ITO=1490&ns_mchannel=rss&ns_campaign=1490&utm_source=twitterfeed&utm_medium=twitter">an article of its own</a>. They made a number of errors, but the cracker was when the statement by the Sunday Times got turned into the remarkable headline: </p>
<blockquote><p>30 per cent of total sexual encounters take place before 16.</p></blockquote>
<p> And just in case they change their website, here is the evidence (4th bullet point):</p>
<p><img src="/sites/understandinguncertainty.org/files/mail-sex-quote.jpg" width="520" height="500" alt="mail-sex-quote.jpg" /></p>
<p>A little reflection should show that the Mail’s statement is more than implausible. 30% of all sex occurring before 16? Just think about it. The Daily Mail clearly didn't.</p>
<p>For those that would like some evidence, the article reports my estimate, based on the <a href="http://www.natsal.ac.uk">National Survey of Sexual Attitudes and Lifestyles</a> (NATSAL), that male+female couples in Britain have sex around 900,000,000 times a year. So if 30% of this were in the under-16s, that would be about 300,000,000 times a year. There are about 1,500,000 14 and 15 year olds, that’s 750,000 potential couples, so to get to this total they would all have to be having sex 400 times a year, which is more than once a day. No wonder they don’t have time for homework. Or maybe this number is just ridiculous.</p>
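<p>The back-of-envelope check above can be written out in a few lines of Python. This is only a sketch using the figures quoted in the text; the function name is made up for illustration. Taking 30% exactly gives 270,000,000 rather than the rounded 300,000,000, and so about 360 times a year rather than 400, which is still roughly once a day.</p>

```python
def acts_per_couple_per_year(total_acts, share_percent, young_people):
    """Acts per couple per year if `share_percent` of all acts fell to `young_people`."""
    acts_in_group = total_acts * share_percent // 100   # integer arithmetic, no rounding
    potential_couples = young_people // 2
    return acts_in_group / potential_couples


# Figures quoted in the text
total_acts = 900_000_000      # estimated acts per year in Britain
young_people = 1_500_000      # approximate number of 14 and 15 year olds

exact = acts_per_couple_per_year(total_acts, 30, young_people)
rounded = 300_000_000 / (young_people // 2)   # the rounded figure used in the text

print(f"Using 30% exactly: {exact:.0f} times a year, {exact / 365:.2f} a day")
print(f"Using 300,000,000: {rounded:.0f} times a year, {rounded / 365:.2f} a day")
```

<p>Either way, the implied rate for every single 14 and 15 year old is absurd, which is all the check is meant to show.</p>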
<p>This could be just an enthusiastic sub-editor, such as the one who produced the wonderful headline below<br /><img src="/sites/understandinguncertainty.org/files/bikers-kent_0.jpg" width="600" height="400" alt="bikers-kent.jpg" /></p>
<p> - this was rapidly changed to the more reasonable “<a href="http://www.courier.co.uk/Bikers-involved-Kent-road-accidents/story-18653710-detail/story.html">Bikers involved in more than one third of serious Kent road accidents</a>”, but not before someone had grabbed the previous version.</p>
<p>But in the Mail’s case it was not just the sub-editor who wrote the headline – the journalist made the claim in the article. So how can such an idiotic statement appear in a national newspaper? </p>
<p>This is not intended to be a standard ‘aren’t the media hopelessly innumerate’ bash, fun though those are. I am genuinely interested in how intelligent people can write such statements without seeming to engage their common sense (which I generously assume they have). </p>
<p>Perhaps the first thing to note is that the two errors above are both of an identical logical nature – the so-called ‘transposed conditional’. Let's consider some pairs of statements, the first of which is reasonable, the second is in error:</p>
<ul><li>30% of people, when they were aged under 16, had sex</li>
<li>30% of sex happens with people aged under 16</li>
</ul><ul><li>One third of fatal accidents involved motorcyclists</li>
<li>One third of motorcyclists have fatal accidents</li>
</ul><ul><li>90% of women with breast cancer get a positive mammography</li>
<li>90% of women with a positive mammography have breast cancer</li>
</ul><p>(In fact, <a href="http://www.informedchoiceaboutcancerscreening.org/wp-content/uploads/2013/05/Breast-screening-leaflet_8August2013_Final_ready-for-print-version.pdf"> the current breast screening leaflets</a> point out that fortunately only around 25% of women with a positive mammography have breast cancer).</p>
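<p>The gap between the two mammography statements can be sketched with Bayes' theorem. The 90% figure is the one quoted above; the prevalence and false-positive rate below are illustrative assumptions of mine, not the leaflet's numbers, so the output should not be read as reproducing the 25%.</p>

```python
def positive_predictive_value(prevalence, sensitivity, false_positive_rate):
    """P(disease | positive test), computed with Bayes' theorem."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)


sensitivity = 0.90           # from the text: P(positive | cancer)
prevalence = 0.01            # ASSUMPTION: illustrative only
false_positive_rate = 0.03   # ASSUMPTION: illustrative only

ppv = positive_predictive_value(prevalence, sensitivity, false_positive_rate)

print(f"P(positive | cancer) = {sensitivity:.0%}")
print(f"P(cancer | positive) = {ppv:.0%}")
```

<p>With these made-up inputs the two conditional probabilities differ by a factor of about four, simply because women without cancer vastly outnumber those with it.</p>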
<p>In more abstract terms, what happens is that the “proportion of A that are also B”, is reported as “the proportion of B that are also A”. This is also known as the <a href="https://en.wikipedia.org/wiki/Prosecutor's_fallacy">Prosecutor’s Fallacy</a>, as it is a mistake made in legal cases. It is extremely dangerous to mix up the statements</p>
<ul><li>The probability of the evidence, if the suspect is innocent, is 1 in 1,000,000</li>
<li>The probability of the suspect being innocent, given this evidence, is 1 in 1,000,000</li>
</ul><p>and yet this mistake has happened repeatedly, if implicitly. </p>
<p><em>[See additional comment at the bottom of this article, added June 18th]<br /></em></p>
<p>One argument is that it is simple innumeracy: the so-called ‘deficit model’ explanation, which could be counteracted by better education in the mechanics of mathematics. But I am sure that most of us who make these kinds of mistakes (and I am not excluding anyone here, including me) are functionally numerate, and could even have a stab at working out a 15% tip. </p>
<p>Another argument is that this is really not an issue with numeracy, but a simple error in logic. And yet it might be reasonable to assume that a journalist, or a judge, would not confuse the following two statements </p>
<ul><li>All dogs are furry mammals with 4 legs</li>
<li>All furry mammals with 4 legs are dogs.</li>
</ul><p>So maybe it is something in the middle: an inability to combine reason with numbers, some kind of paralysis that comes when confronted with numerical arguments that means that ordinary common sense is bypassed. I see this when tutoring young people for GCSE maths: intelligent kids who when asked to do some maths couched as a pseudo-real-world problem, (“<em>Fred travels at 50 mph for 30 minutes, how far does he go?</em>”) go into a mental panic, start using formulae at random, and come up, like Baldrick doing mental arithmetic, with some absurd answer (“<em>1500 miles</em>”). And yet if I asked them the same problem in the real-real-world, and did not say it was maths, they would be able to get the answer by using some basic reasoning (“<em>25 miles</em>”). The kind of maths teaching promoted by Tim Gowers on <a href="http://gowers.wordpress.com/2012/06/08/how-should-mathematics-be-taught-to-non-mathematicians/">how maths should be taught to non-mathematicians</a> seeks to avoid this ‘<em>find the formula and plug in the numbers</em>’ style, and with luck the <a href="https://www.gov.uk/government/publications/16-to-18-core-maths-qualifications">Core Maths curriculum</a> will feature more reasoning with practical situations.</p>
<p>One consequence of this inability to take a sensible critical attitude to numbers is that opinions are pushed to the extremes: numbers are to be either accepted, and even fetishised, as some sort of God-given truth, or rejected out of hand as ‘just statistics’. Possibly in the same breath. Just listen to the Today programme or Question Time.</p>
<p>Of course there are other areas in which common sense is bypassed - when we may be only too willing to suspend our normal powers of criticism and warmly embrace delusion. These include claims for alternative therapies, arguments by populist politicians, optimistic prognoses for desperately ill loved-ones, or bigging up England’s performance in the World Cup. Sadly, in all these cases some realism may be more appropriate.</p>
<h3>Additional comment added June 18th<br /></h3>
<p>An equivalent way to view this error is in terms of the 'wrong denominator': is the 30% a proportion of people, or of all sexual activity? Gerd Gigerenzer emphasises that these mistakes are due to not being clear about the 'reference class', i.e. 30% of what? Ambiguity can be avoided by always making the class clear by saying, for example, "Out of every 100 people reaching 16, 30 have already had sex".</p>
</div></div></div>
Mon, 16 Jun 2014 07:18:43 +0000 david 7621 at http://understandinguncertainty.org http://understandinguncertainty.org/numbers-and-common-sense-bypass#comments
A heuristic for sorting science stories in the news
http://understandinguncertainty.org/heuristic-sorting-science-stories-news-0
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p> Dominic Lawson's <a href="http://www.thesundaytimes.co.uk/sto/comment/columns/dominiclawson/article1417214.ece ">article in the Sunday Times today</a>[paywall] quotes me as having the rather cynical heuristic: "<em>the very fact that a piece of health research appears in the papers indicates that it is nonsense</em>." I stand by this, but after a bit more consideration I would like to suggest a slightly more refined version for dealing with science stories in the news, particularly medical ones.</p>
<blockquote><p>"Ask yourself: if the study had come up with a negative result, would I be hearing about it? If NO, then don't bother to read or listen to the story"
</p></blockquote>
<p>The immediate impulse behind Lawson's article was a spate of studies claiming associations between ordinary daily habits and future bad outcomes: <a href="http://www.telegraph.co.uk/health/healthnews/10862512/Three-slices-of-white-bread-a-day-linked-to-obesity.html">eating a lot of white bread with becoming obese</a>, <a href="http://www.bbc.co.uk/news/health-27603587">being cynical with getting dementia</a>, <a href="http://www.bbc.co.uk/news/health-27617615">light bedrooms with obesity (again)</a>. All these stories associate mundane exposures with later developing dread outcomes,<em> i.e.</em> the classic '<a href="http://www.dailymail.co.uk/health/article-2019170/Can-cat-cancer-Parasite-bellies-linked-brain-tumours.html">cats cause cancer</a>' type. My argument is that, since we would not be reading about a study in which these associations had <em>not</em> been found, we should take no notice of these claims.</p>
<p>Why my cynicism? There has been a lot of public discussion of potential biases in the published scientific literature – see for example, commentaries in the <a href="http://www.economist.com/news/briefing/21588057-scientists-think-science-self-correcting-alarming-degree-it-not-trouble ">Economist</a> and <a href="http://www.forbes.com/sites/henrymiller/2014/01/08/the-trouble-with-scientific-research-today-a-lot-thats-published-is-junk/ ">Forbes magazine</a>. The general idea is that by the time research has been selected to be submitted, and then selected for publication, there is a good chance the results are false positives: for a good review of the evidence for this see <em><a href="http://simplystatistics.org/2013/12/16/a-summary-of-the-evidence-that-most-published-research-is-false/ ">‘A summary of the evidence that most published research is false’</a></em>. There is also an excellent <a href="http://deevybee.blogspot.co.uk/2014/01/why-does-so-much-research-go-unpublished.html ">blog by Dorothy Bishop</a> on why so much research goes unpublished.</p>
<p>The point of this blog is to argue that such selection bias is as nothing compared to the hurdles overcome by stories that are not only published, but <em>publicised</em>. For a study to be publicised, it must have</p>
<p>• Been considered worthwhile to write up and submit to a journal or other outlet<br />
• Been accepted for publication by the referees and editors<br />
• Been considered ‘newsworthy’ enough to deserve a press release<br />
• Been sexy enough to attract a journalist’s interest<br />
• Got past an editor of a newspaper or newsroom.</p>
<p>Anything that gets through all these hurdles stands a huge chance of being a freak finding. In fact, if the coverage is on the radio, I recommend sticking your fingers in your ears and loudly saying ‘la-la-la’ to yourself.</p>
<p>The crucial idea is that since there is an unknown amount of evidence that I am not hearing about and that would contradict this story, there is no point in paying attention to whatever it is claiming. It is like watching a video of a football team scoring goals, and then suddenly realising that you are only being shown the 'successes' and not the ones they let in: the evidence just shows that they are capable of scoring, but not whether they score more than they concede. So, if you're interested in assessing the quality of the team, stop watching the video [of course if you just enjoy the spectacle, carry on].</p>
<p>The heuristic is even more appropriate when you hear or read of any survey by any organisation, particularly charities.</p>
<p>This all may seem rather cynical, and keep in mind that I am a grumpy old git (although now trying to avoid cynicism, as I have no wish to become demented). But just think of the time you can save.</p>
<p>[Added 2nd June 2014: I should have made clear that I am only talking about <em>single</em> studies: proper reviews of the totality of evidence should be listened to. So this is not an excuse to ignore evidence connecting smoking and lung cancer.]</p>
<p>PS A recent study argues that <a href="http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0085355 ">newspapers preferentially cover medical research with weaker methodology</a>. However I must apply my own heuristic to this: would I have heard about it if the researchers had found out the opposite? And you should ask yourself, would I be telling you about it?</p>
<p>PPS I have been struggling to find a suitable name for this heuristic, perhaps with some literary or classical allusion to someone who was misled by only being told selected items of information. Perhaps the ‘Siddhartha’ heuristic? Siddhārtha Gautama was a prince who was only told good news, and protected from seeing suffering and death. But he finally realised that he was not seeing the world as it really was, and so he left his palace to first take on the life of a wandering ascetic, and eventually to become the Buddha. </p>
</div></div></div>
Sun, 01 Jun 2014 11:23:42 +0000 david 7593 at http://understandinguncertainty.org http://understandinguncertainty.org/heuristic-sorting-science-stories-news-0#comments
It's cherry-picking time: more poorly reported science being peddled to journalists
http://understandinguncertainty.org/its-cherry-picking-time-more-poorly-reported-science-being-peddled-journalists
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>Yesterday the <a href="http://www.dailymail.co.uk/health/article-2582774/TV-computer-games-wreck-family-life-leave-child-obese-study-warns.html">Daily Mail</a> trumpeted “<em>For every hour of screen time, the risk of family life being disrupted and children having poorer emotional wellbeing may be doubled</em>”, while the Daily Telegraph said that <em>"for every hour each day a child spent in front of a screen, the chance of becoming depressed, anxious or being bullied rose by up to 100 per cent”</em>. These dramatic conclusions come from a <a href="http://archpedi.jamanetwork.com/article.aspx?articleid=1844044">study whose abstract states</a> – </p>
<blockquote><p>Across associations, the likelihood of adverse outcomes in children ranged from a 1.2- to 2.0-fold increase for emotional problems and poorer family functioning for each additional hour of television viewing or e-game/computer use depending on the outcome examined.</p></blockquote>
<p>Unfortunately this is, purely and simply, wrong. And the articles in the papers are misleading too - although note the use of ‘<em>may be doubled</em>’ from the Mail, and ‘<em>up to 100 per cent</em>’ from the Telegraph, which appears to allow them to cherry-pick their evidence as much as they want. So, leaving aside design and analysis issues such as needlessly breaking outcome scales into ‘high’ and ‘low’, what is wrong with the reporting in the paper? </p>
<p>Table 3 of the paper is reproduced below. It shows 96 estimates with 95% confidence intervals. The authors focus on reporting the results that are ‘significantly high’, that is whose 95% intervals lie above 1, of which there are 11, shown in bold and scattered rather haphazardly across the Table.</p>
<p><img src="/sites/understandinguncertainty.org/files/screentime.jpg" width="875" height="792" alt="screentime.jpg" /></p>
<p>But there also appear to be 2 ‘significantly low’ odds ratios, whose 95% intervals lie below 1, and 83 odds ratios which are not significantly different from 1. In fact the odds ratios are spread around 1: apart from one odds ratio of 2 (which is very imprecise, with an interval from 1 to 4), they all lie between 0.7 and 1.3. </p>
<p>Out of 96 95%-intervals, we would expect around 5 to exclude 1 by chance alone, even if there were no effect. In fact there were 13, suggesting the possibility of a small overall effect, but nowhere near the ‘doubling’ claimed. One would also have to believe that all the many confounding factors that would simultaneously influence TV watching and later behaviour had been fully accounted for. Which is extremely doubtful. Maybe watching lots of TV when young does contribute to later problems – it seems quite plausible – but this study does not show it.</p>
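<p>The ‘around 5 by chance alone’ figure, and how unusual 13 would be, can be sketched as follows. This treats the 96 intervals as independent, which is an assumption made purely for illustration: the estimates come from the same children and related outcomes, so they will be correlated and the true tail probability is likely to be larger than this calculation suggests.</p>

```python
from math import comb


def binomial_tail(n, p, k):
    """P(X >= k) when X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_intervals = 96            # estimates in Table 3
miss_rate = 0.05            # chance a 95% interval excludes 1 when there is no effect
observed_excluding_1 = 13   # 11 'significantly high' plus 2 'significantly low'

expected_by_chance = n_intervals * miss_rate
# ASSUMPTION: independence between intervals, which almost certainly does not hold
tail_probability = binomial_tail(n_intervals, miss_rate, observed_excluding_1)

print(f"Expected to exclude 1 by chance: {expected_by_chance:.1f}")
print(f"P(13 or more | no effect, independence): {tail_probability:.4f}")
```

<p>Under these simplified assumptions 13 is more than chance would usually produce, which fits with a possible small overall effect but says nothing about its size.</p>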
<p>The crucial insight is that the estimated odds ratios for adverse outcomes range between 0.7 and 2, and not 1.2 and 2 as claimed by the authors in their abstract. Focusing on only the ‘significant’ positive results is, either deliberately or through ignorance, very poor and deeply misleading science. It also shows dismal refereeing. In the text the authors acknowledge that "Few associations were evident", but this does not make it through to the abstract.</p>
<p>Journalists would not have noticed this paper unless it had been press-released by the academic journal. So when you read about some poor science, don’t jump to blame the journalists: it could well be because of the efforts of some scientists, institutions and journals to promote coverage of their activities, regardless of their true quality and importance. Sadly, this behaviour harms scientific credibility.</p>
<p>Postscript<br />
There is also a potential technical problem in that the authors interpret an odds ratio of 2 as doubling the ‘likelihood’. But if an outcome measure has a base-rate of around 50%, or odds of 1:1, an odds ratio of 2 multiplies the odds up to 2:1, or 67%. So the risk goes up from 50% to 67%, but is not doubled. In fact in the case of the 'emotional problems' scale the baseline risk is around 11%, and so an odds ratio of 2 does roughly double the risk.</p>
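<p>The conversion from an odds ratio to a change in risk is easy to sketch. The function name is my own; the two baseline risks are the ones mentioned in the postscript.</p>

```python
def risk_after_odds_ratio(baseline_risk, odds_ratio):
    """Risk obtained by multiplying the baseline odds by `odds_ratio`."""
    baseline_odds = baseline_risk / (1 - baseline_risk)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1 + new_odds)


for baseline in (0.50, 0.11):
    new_risk = risk_after_odds_ratio(baseline, 2.0)
    print(
        f"baseline risk {baseline:.0%} -> {new_risk:.0%} "
        f"(relative risk {new_risk / baseline:.2f})"
    )
```

<p>At a 50% baseline the risk rises to about two-thirds, a relative risk of about 1.33; at an 11% baseline it rises to about 20%, a relative risk of about 1.8. The rarer the outcome, the closer the odds ratio comes to the relative risk.</p>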
</div></div></div>
Tue, 18 Mar 2014 21:29:05 +0000 david 7508 at http://understandinguncertainty.org http://understandinguncertainty.org/its-cherry-picking-time-more-poorly-reported-science-being-peddled-journalists#comments
More deaths due to climate change? Or maybe not.
http://understandinguncertainty.org/more-deaths-due-climate-change-or-maybe-not
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>Coverage of a <a href="http://jech.bmj.com/content/early/2014/01/08/jech-2013-202449.short?g=w_jech_ahead_tab">paper</a> just published by Journal of Epidemiology and Community Health included dramatic headlines such as the Guardian's <a href="http://www.theguardian.com/environment/2014/feb/04/heat-related-deaths-climate-change">Heat-related deaths in the UK will rise 257% by 2050 because of climate change</a>. But a closer look at the numbers in the paper paints a rather different picture.</p>
<p>Figure 4 of the paper shows the number of deaths expected per 100,000 people in each category, and how the authors estimate this will change into the 2080s. </p>
<p><img src="/sites/understandinguncertainty.org/files/climate-temp-deaths_0.png" width="660" height="264" alt="climate-temp-deaths_0.png" /></p>
<p>But the vertical axes for the two plots are different, and they should perhaps have been drawn like this.</p>
<p><img src="/sites/understandinguncertainty.org/files/climate-temp-R.jpeg" width="673" height="498" alt="climate-temp-R.jpeg" /></p>
<p>Or even added in a 'combined plot' [added 6th February 2014]</p>
<p><img src="/sites/understandinguncertainty.org/files/climate-deaths.jpeg" width="673" height="498" alt="climate-deaths.jpeg" /></p>
<p>This clearly reveals that, in terms of rate per 100,000, the decline in cold-related death rate easily outweighs the increase in the heat-related death rate. So overall, for any individual in the UK, the risk of a temperature-related death is expected to fall steadily due to climate change. Bring it on! </p>
<p>But since there are going to be more old people in the future, the absolute number of deaths is going to increase - and this number was emphasised by the authors and got the headlines.</p>
<p>The abstract of the paper includes the phrase <em>"The increased number of future temperature-related deaths was partly driven by projected population growth and ageing."</em> According to the projections in the paper, if the population make-up did not change, the overall mortality risk would go down. So it would have been more accurate to say <em>"The increased number of future temperature-related deaths was <strong>wholly</strong> driven by projected population growth and ageing."</em></p>
<p>But that is clearly not the message that the authors wanted to convey. It is unfortunate that this kind of presentation gives ammunition to those who say that the effects of climate change are being exaggerated.</p>
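<p>How a falling rate can sit alongside a rising count is simple arithmetic, sketched below. The rates and population sizes are entirely hypothetical numbers chosen to make the point; they are not taken from the paper.</p>

```python
def expected_deaths(rate_per_100k, population):
    """Absolute number of deaths implied by a rate per 100,000 people."""
    return rate_per_100k * population / 100_000


# HYPOTHETICAL figures for an at-risk (older) population, for illustration only
deaths_now = expected_deaths(rate_per_100k=50, population=10_000_000)
deaths_later = expected_deaths(rate_per_100k=40, population=15_000_000)

print(f"Now:   rate 50 per 100,000 -> {deaths_now:,.0f} deaths")
print(f"Later: rate 40 per 100,000 -> {deaths_later:,.0f} deaths")
```

<p>In this made-up case each individual's risk falls by a fifth while the total number of deaths rises by a fifth, purely because there are more people at risk.</p>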
</div></div></div>
Tue, 04 Feb 2014 11:08:52 +0000 david 7444 at http://understandinguncertainty.org http://understandinguncertainty.org/more-deaths-due-climate-change-or-maybe-not#comments
How surprising was the cluster of cycle deaths in London?
http://understandinguncertainty.org/how-surprising-was-cluster-cycle-deaths-london
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p><a href="http://www.bbc.co.uk/programmes/b03mjcbk">More or Less</a> recently featured Jody Aberdein talking about the cluster of 6 cycle deaths in London over a 2 week period. </p>
<p>The paper with the details of the analysis can, for a while, be freely obtained from <a href="Http://onlinelibrary.wiley.com/doi/10.1111/j.1740-9713.2013.00715.pdf">Significance magazine</a>.</p>
<p>Details of the statistical methods are given <a href="http://understandinguncertainty.org/when-cluster-real-cluster">here</a> - these are necessarily quite complex due to the need to allow for all possible 2 week periods. </p>
</div></div></div>
Sat, 04 Jan 2014 06:43:41 +0000 david 7382 at http://understandinguncertainty.org http://understandinguncertainty.org/how-surprising-was-cluster-cycle-deaths-london#comments
PISA statistical methods - more detailed comments
http://understandinguncertainty.org/pisa-statistical-methods-more-detailed-comments
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>In the Radio 4 documentary <em><a href="http://www.bbc.co.uk/iplayer/episode/b03j9mx2/PISA_Global_Education_Tables_Tested/ ">PISA - Global Education Tables Tested</a></em>, broadcast on November 25th, a comment is made that the statistical issues are a bit complex to go into. Here is a brief summary of my personal concerns: to get an idea of the feelings about PISA statistical methods, see for example an <a href="http://www.tes.co.uk/article.aspx?storycode=6344672">article</a> in the Times Educational Supplement, and the <a href="http://www.tes.co.uk/article.aspx?storycode=6345213">response</a> by OECD. </p>
<p>The PISA methodology is complex and rather opaque, in spite of the substantial amount of material published in the technical reports. Briefly:</p>
<ol><li>Individual students only answer a minority of questions.
</li>
<li>Multiple ‘plausible values’ are then generated for all students assuming a particular statistical model, essentially estimating what might have happened if the student had answered all the questions.
</li>
<li>These ‘plausible values’ are then treated as if they are the results of complete surveys, and form the basis of national scores (and their uncertainties) and hence rankings in league tables.
</li>
<li>But the statistical model used to generate the ‘plausible scores’ is demonstrably inadequate – it does not fit the observed data.
</li>
<li>This means the variability in the plausible scores is underestimated, which in turn means the uncertainty in the national scores is underestimated, and hence the rankings are even less reliable than claimed.
</li>
</ol><p>Here's a little more detail on these steps.</p>
<h3>1. Individual students only answer a minority of questions.<br /></h3>
<p>Svend Kreiner has <a href="http://www.tes.co.uk/article.aspx?storycode=6344672">calculated</a> that in 2006, about half of the participating students did not answer any reading questions at all, while <em>"another 40 per cent of participating students were tested on just 14 of the 28 reading questions used in the assessment. So only approximately 10 per cent of the students who took part in Pisa were tested on all 28 reading questions."</em></p>
<h3>2. Multiple ‘plausible values’ are then generated for all students assuming a particular statistical model<br /></h3>
<p>A simple Rasch model (<a href="http://www.oecd.org/edu/school/programmeforinternationalstudentassessmentpisa/pisa2009technicalreport.htm ">PISA Technical Report</a>, Chapter 9) is assumed, and five values for each student are generated at random from the 'posterior' distribution given the information available on that student. So for the half of students in 2006 who did not answer any reading questions, five 'plausible' reading scores are generated on the basis of their responses on other subjects.</p>
<h3>3. These ‘plausible values’ are then treated as if they are the results of surveys with complete data on all students<br /></h3>
<p>The Technical Report is not clear about how the final country scores are derived, but the <a href="http://www.oecd.org/pisa/pisaproducts/pisadataanalysismanualspssandsassecondedition.htm ">Data Analysis manual</a> makes clear that these are based on the five plausible values generated for each student: they then use standard methods to inflate the sampling error to allow for the use of 'imputed' data.</p>
<blockquote><p>“Secondly, PISA uses imputation methods, denoted plausible values, for reporting student performance. From a theoretical point of view, any analysis that involves student performance estimates should be analysed five times and results should be aggregated to obtain: (i) the final estimate; and (ii) the imputation error that will be combined with the sampling error in order to reflect the test unreliability on the standard error.</p>
<p>All results published in the OECD initial and thematic reports have been computed accordingly to these methodologies, which means that the reporting of a country mean estimate and its respective standard error requires the computation of 405 means as described in detail in the next sections.”</p></blockquote>
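<p>The aggregation described in this quote looks like the standard multiple-imputation combining rule (Rubin's rules), and a minimal sketch of that rule is given below. I am assuming this is the rule PISA applies; the function name and the five country-mean estimates are hypothetical, and the sampling variances are simply taken as given rather than computed from survey weights.</p>

```python
from statistics import mean, variance


def combine_plausible_values(estimates, sampling_variances):
    """Combine one estimate per plausible value into a final estimate and standard error."""
    m = len(estimates)
    final_estimate = mean(estimates)
    sampling_variance = mean(sampling_variances)
    imputation_variance = variance(estimates)   # between-value variance, divisor m - 1
    total_variance = sampling_variance + (1 + 1 / m) * imputation_variance
    return final_estimate, total_variance ** 0.5


# HYPOTHETICAL country means, one per plausible value, with their sampling variances
estimates = [500.0, 502.0, 499.0, 501.0, 498.0]
sampling_variances = [9.0, 9.0, 9.0, 9.0, 9.0]

final_estimate, standard_error = combine_plausible_values(estimates, sampling_variances)
print(f"Final estimate {final_estimate:.1f}, standard error {standard_error:.2f}")
```

<p>The imputation term inflates the standard error to reflect that the scores were generated rather than observed, but it can only reflect the variability that the imputation model itself produces.</p>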
<p>There does seem to be some confusion in the PISA team about this - in my interview with Andreas Schleicher, I explicitly asked whether the country scores were based on the 'plausible values', and he appeared to deny that this was the case.</p>
<h3>4. The statistical model used to generate the ‘plausible scores’ is demonstrably inadequate.<br /></h3>
<p>Analysis using imputed ('plausible') data is not inherently unsound, provided (as PISA do) the extra sampling error is taken into account. But the vital issue is that the adjustment for imputation is only valid if the model used to generate the plausible values can be considered 'true', in the sense that the generated values are reasonably 'plausible' assessments of what that student would have scored had they answered the questions. </p>
<p>A simple Rasch model is assumed by PISA, in which questions are assumed to have a common level of difficulty across all countries - questions with clear differences are weeded out as “dodgy”. But in a <a href="http://link.springer.com/article/10.1007%2Fs11336-013-9347-z">paper in Psychometrika</a>, Kreiner has shown the existence of substantial “Differential Item Functioning” (DIF) - i.e. questions have different difficulty in different countries, and concludes that <em>“The evidence against the Rasch model is overwhelming.”</em></p>
<p>The existence of DIF is acknowledged by <a href="http://www.oecd.org/pisa/47681954.pdf ">Adams</a> (who heads the OECD analysis team), who says <em>“The sample sizes in PISA are such that the fit of any scaling model, particularly a simple model like the Rasch model, will be rejected. PISA has taken the view that it is unreasonable to adopt a slavish devotion to tests of statistical significance concerning fit to a scaling model.”</em> Kreiner disagrees, and argues that the effects are both statistically significant and practically important.</p>
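<p>To make the idea of DIF concrete, here is a minimal sketch of the basic Rasch model, in which the chance of a correct answer depends only on the difference between a student's ability and the question's difficulty. The abilities, difficulties and country labels are hypothetical; PISA's actual scaling model is more elaborate than this.</p>

```python
from math import exp


def rasch_probability(ability, difficulty):
    """P(correct answer) under the Rasch model: logistic in (ability - difficulty)."""
    return 1 / (1 + exp(-(ability - difficulty)))


ability = 0.0              # HYPOTHETICAL student ability, the same in both countries
common_difficulty = 0.0    # the Rasch assumption: one difficulty for every country

# HYPOTHETICAL DIF: the same question is easier in one country than in the other
difficulty_by_country = {"Country A": -0.5, "Country B": 0.5}

print(f"Assumed by the model: {rasch_probability(ability, common_difficulty):.2f}")
for country, difficulty in difficulty_by_country.items():
    print(f"{country}: {rasch_probability(ability, difficulty):.2f}")
```

<p>Two students of identical ability then have different chances of getting the question right depending on where they live, which is exactly what a common-difficulty model cannot represent.</p>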
<h3>5. This means the variability in the plausible scores is underestimated<br /></h3>
<p>The crucial issue, in my view, is that since these 'plausible values' are generated from an over-simplified model, they will not represent plausible values as if the student really had answered all the questions. <a href="http://link.springer.com/article/10.1007/s11336-013-9347-z ">Kreiner</a> says <em>“The effect of using plausible values generated by a flawed model is unknown”.</em></p>
<p><em>[The next para was in the original blog, but I have revised my opinion since - see note below]</em> I would be more confident than this, and would expect that the 'plausible values' will be ‘under-dispersed’, i.e. not show a reasonable variability. Hence the uncertainty about all the derived statistics, such as mean country scores, will be under-estimated, although the extent of this under-estimation is unknown. It is notable that PISA acknowledge the uncertainty about their rankings (although this is not very prominent in their main <a href="http://www.oecd.org/pisa/46643496.pdf">communications</a>), but the extra variability due to the use of potentially inappropriate plausible values will inevitably mean that the rankings would be even less reliable than claimed. That is the reason for my scepticism about PISA's detailed rankings.</p>
<h3>Note added 30th November:<br /></h3>
<p>I acknowledge that plausible values derived from an incorrect model should, if analysed assuming that model, lead to exactly the same conclusions as if they had not been generated in the first place (and, say, a standard maximum likelihood analysis carried out). Which could make one ask - why generate plausible values in the first place? But in this case it is convenient for PISA to have ‘complete response’ data to apply their complex survey weighting schemes for their final analyses. </p>
<p>But this is the issue: it is unclear what effect generating a substantial amount of imputed data from a simplistic model will have, when those imputed data are then fed through additional analyses. So after more reflection I am not so confident that the PISA methods lead to an under-estimate of the uncertainty associated with the country scores: instead I agree with Svend Kreiner’s view that it is not possible to predict the effect of basing subsequent detailed analysis on plausible values from a flawed model.</p>
</div></div></div>
Mon, 25 Nov 2013 17:30:05 +0000
david
7301 at http://understandinguncertainty.org
http://understandinguncertainty.org/pisa-statistical-methods-more-detailed-comments#comments
Complaint about the Press Complaints Commission
http://understandinguncertainty.org/complaint-about-press-complaints-commission
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>What a strange organisation the Press Complaints Commission (PCC) is. They say that a press article is inaccurate, but consider it reasonable that the inaccurate headline remains uncorrected.</p>
<h3>Brief Timeline<br /></h3>
<ul><li>12th July 2013. <a href="http://www.nhs.uk/NHSEngland/bruce-keogh-review/Pages/published-reports.aspx">Keogh report on 14 hospitals</a> is due out. Professor Sir Brian Jarman provides data and <a href="https://www.dropbox.com/s/9o4caplpvcp1e6r/My%20email%20to%20Laura%20Donnelly%20%28data%20attachments%20removed%29.doc">briefs journalists</a> on above-average deaths in hospitals being investigated. He emphasizes that such deaths cannot be interpreted as ‘avoidable’.
</li>
<li>13th July. The <a href="http://www.telegraph.co.uk/health/heal-our-hospitals/10178296/13000-died-needlessly-at-14-worst-NHS-trusts.html">Sunday Telegraph leads with headline</a> ‘13,000 died needlessly at 14 worst NHS trusts’, in conflict with both Jarman’s advice, and what is said in the article itself.
</li>
<li>16th July 2013. Keogh review published, and explicitly states that <em>“It is clinically meaningless and academically reckless to use such statistical measures to quantify actual numbers of avoidable deaths.”</em> Numerous criticisms of Telegraph coverage follow, including an article by me in the <a href="http://press.psprings.co.uk/bmj/august/needless.pdf">British Medical Journal</a>. A number of complaints are made to the PCC.
</li>
<li>1st November 2013. PCC finally announces that <em>“By attributing the number of “needless” deaths to a calculation made by Sir Brian Jarman, the newspaper had failed to take care not to publish inaccurate information in breach of Clause 1 (i). As such, a correction – published promptly and with due prominence – was required in accordance with the terms of Clause 1 (ii).”</em> This does not appear publicly. The Telegraph publishes a ‘clarification’, but the headline remains.
</li>
<li>4th November. I complain to the PCC that the misleading Telegraph headline remains on their article, but am told that <em>“Taken in context with the article as a whole, and in light of the additional footnote, the Commission did not consider that a significantly misleading impression of the investigation’s findings had been created by the headline.”</em> I find it very difficult to understand how they can come to this bizarre and illogical conclusion.
</li>
</ul><p>Another complainant has taken this to the independent reviewer of the PCC. But I am deeply unimpressed by the PCC’s feeble response to this ‘inaccurate’ (to be extremely generous) article. </p>
<p>Presumably the PCC will soon be abolished, and we can only hope that post-Leveson there will be a more effective body. But I haven’t put the bunting out yet.</p>
<p>PS<br />
18th November. Another grossly misleading headline, this time in the <a href="http://www.dailymail.co.uk/news/article-2509629/Decade-Labour-saw-50-000-die-hospital.html#ixzz2lO5x9HBp">Daily Mail</a>: <em>“Decade of Labour 'saw 50,000 too many die in hospital'”</em>. They put the inaccurate statement in quotes, as if someone had actually claimed this. But nobody said it - this quote is purely a product of the imagination of the sub-editors.</p>
</div></div></div>
Sun, 24 Nov 2013 12:57:30 +0000
david
7293 at http://understandinguncertainty.org
http://understandinguncertainty.org/complaint-about-press-complaints-commission#comments
Press Complaints Commission decide '13,000 needless deaths' story was inaccurate
http://understandinguncertainty.org/press-complaints-commission-decide-13000-needless-deaths-story-was-inaccurate
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>I was one of a number of complainants to the Press Complaints Commission about the Sunday Telegraph story headlined <a href="http://www.telegraph.co.uk/health/heal-our-hospitals/10178296/13000-died-needlessly-at-14-worst-NHS-trusts.html"><em>13,000 died needlessly at 14 worst NHS trusts</em></a>, as the Telegraph journalists had been explicitly told by the originator of the figures, Professor Brian Jarman, that this was an inappropriate interpretation. My objections were expressed in an <a href="http://understandinguncertainty.org/files/2013bmj-needless.pdf">article in the British Medical Journal</a>.</p>
<p>The Press Complaints Commission has now told me that <em>“The Commission decided that the Sunday Telegraph had published significantly misleading information; however it had offered to take sufficient action to remedy the breach of the Code as required under the terms of Clause 1 (ii).”</em></p>
<p>This means that there is no official adjudication and no publication of the decision – this seems strange, and so I have reproduced below (with permission of the PCC) their decision.</p>
<p>The crucial finding was <em>“By attributing the number of “needless” deaths to a calculation made by Sir Brian Jarman, the newspaper had failed to take care not to publish inaccurate information in breach of Clause 1 (i). As such, a correction – published promptly and with due prominence – was required in accordance with the terms of Clause 1 (ii).”</em></p>
<p>Having published this inaccurate information, the Sunday Telegraph published some clarifications. <a href="http://www.telegraph.co.uk/health/heal-our-hospitals/10178296/13000-died-needlessly-at-14-worst-NHS-trusts.html">The online version</a> does now have the 'clarification' at the bottom, but as the image below shows, still had [on November 4th] the inaccurate headline. Extraordinary.</p>
<p> <img src="/sites/understandinguncertainty.org/files/13000-headline.png" width="652" height="205" alt="13000-headline.png" /></p>
<blockquote><h3>Commission’s decision in the case of<br />
Groves v The Daily Telegraph/ The Sunday Telegraph</h3>
<p>The complainant considered that the newspapers had inaccurately reported that an investigation, overseen by Sir Bruce Keogh, had revealed that thousands of patients had died “needlessly” at 14 NHS hospital trusts. </p>
<p>Clause 1 (i) (Accuracy) of the Editors’ Code of Practice states that “the press must take care not to publish inaccurate, misleading or distorted information”. Clause 1 (ii) makes clear that “a significant inaccuracy, misleading statement or distortion once recognised must be corrected promptly and with due prominence”.</p>
<p>The Commission noted that Sir Bruce Keogh’s investigation into 14 NHS hospital trusts had been preceded by investigations into the Mid Staffordshire NHS Foundation Trust. These investigations – overseen by the Healthcare Commission and, more recently, by Robert Francis QC – had been launched due to the trust’s above-average Hospital Standard Mortality Ratio (HSMR), mortality statistics calculated by Sir Brian Jarman, Director of the Dr Foster Intelligence Unit. In light of Robert Francis’s conclusions at Mid Staffordshire, the Health Secretary and the Prime Minister instructed Sir Bruce to carry out a review of an additional 14 hospital trusts with “persistently high mortality rates”.</p>
<p>The Commission noted the complainant’s concern that the Daily Telegraph had misleadingly stated that “an investigation found that thousands of patients died needlessly because of poor care” and the Sunday Telegraph had inaccurately said “in total Sir Brian [Jarman] calculated that up to 13,000 patients died needlessly”. Indeed, in reference to HSMR and SHMI (Summary Hospital-level Mortality) statistics, Sir Bruce Keogh had said in his report that “it is clinically meaningless and academically reckless to use such statistical measures to quantify actual numbers of avoidable deaths”. He had also quoted Robert Francis QC, who had said “it is in my view misleading and a potential misuse of the figures to extrapolate from them a conclusion that any particular number, or range of numbers, of deaths were caused or contributed to by inadequate care”. </p>
<p>However, as also stated in the Keogh report, the Health Secretary and the Prime Minister had instructed Sir Bruce to carry out this review with the rationale that “high mortality rates at Mid Staffordshire NHS Foundation Trust were associated with failures in all three dimensions of quality – clinical effectiveness, patient experience, and safety – as well as failures in professionalism, leadership and governance”. Although the “excess deaths” had not been described as “needless” by Sir Bruce Keogh or Sir Brian Jarman, the newspapers had been entitled to their interpretation of the investigation’s results. </p>
<p>The Commission noted that when presenting complex statistical information to non-specialist readers, newspapers will inevitably have to summarise information. The Code does not require the publication of exhaustive information. However, the Commission made clear that it is essential that newspapers interpret such statistical information accurately, and in a manner which is not misleading. In this instance, it was for the Commission to consider whether, in the context of each article as a whole, the newspapers had made clear that the quoted numbers related to statistical analysis of above-average death rates; they did not reflect the outcome of a study into the causes of individual deaths.</p>
<p>However, in the Sunday Telegraph’s article, the newspaper had stated that Sir Brian Jarman had “calculated that up to 13,000 patients died needlessly”. In fact, Sir Brian had not calculated the number of “needless” deaths; rather, he had calculated the number of deaths over and above what would have been expected. Indeed, as previously noted, Sir Bruce Keogh had warned against using HSMR statistics “to quantify actual numbers of avoidable deaths”. By attributing the number of “needless” deaths to a calculation made by Sir Brian Jarman, the newspaper had failed to take care not to publish inaccurate information in breach of Clause 1 (i). As such, a correction – published promptly and with due prominence – was required in accordance with the terms of Clause 1 (ii).</p>
<p>The newspaper had offered to amend the online version of its article so that the sentence “[I]n total Sir Brian calculated that up to 13,000 patients died needlessly in that period” was replaced by “[I]n total Sir Brian calculated that up to 13,000 more patients died in that period than would have been statistically expected”. It had also offered to append the following note:</p>
<p>Clarification<br />
We have been asked to make clear that, contrary to an earlier version of this report, Sir Brian Jarman’s findings reflected the number by which mortality figures exceeded what would have been statistically expected. He made no finding as to the causes of any deaths or whether they were “needless”.</p>
<p>In addition, the newspaper had offered to publish the following correction on page two of the newspaper:</p>
<p>Clarification<br />
Following our July 14 report “13,000 died needlessly at 14 worst NHS trusts” we have been asked to make clear that Sir Brian Jarman’s findings reflected the number by which mortality figures exceeded what would have been statistically expected. He made no finding as to the causes of any deaths or whether they were “needless”.</p>
<p>The Commission noted that the complainant considered that the newspaper should also amend the article’s headline and the reference to “up to 1,200” patients dying needlessly at Stafford Hospital. He had also requested that the newspaper refrains from using the word “needless” in relation to HSMR statistics in future. However, the Commission reiterated that the newspaper had been entitled to its interpretation of the results of both the Keogh and Francis investigations. Furthermore, the first line of the piece had made clear that the 13,000 deaths related to “excess deaths” since 2005. Taken in context with the article as a whole, and in light of the additional footnote, the Commission did not consider that a significantly misleading impression of the investigation’s findings had been created by the headline. The suggested amendment and correction had addressed the key point: Sir Brian Jarman’s calculation did not reflect the outcome of a study into the causes of individual deaths. As such, the Commission was satisfied that the newspaper had offered to take sufficient action to meet its obligations under Clause 1 (ii), and it instructed the newspaper to amend the article and to publish the correction without delay in order to set the record straight.</p>
<p>The Commission then turned to consider the Daily Telegraph article, headlined “NHS inquiry: Shaming of health service as care crisis is laid bare”. In this instance, the newspaper had not given a specific number of “needless deaths”. It had said that “an investigation found that thousands of patients died needlessly because of poor care”. It had also stated that the selected hospitals had been those with the “highest recent mortality rates”. Furthermore, the newspaper had taken care to refer to the mortality statistics as “excess deaths” and it had quoted Health Secretary Jeremy Hunt as having said “no statistics are perfect but mortality rates suggest that since 2005 thousands more people may have died than would normally be expected at the 14 trusts reviewed”. In addition, the newspaper had provided anecdotal evidence of the poor care that had been identified: “some risks to patients so severe that [inspectors] were forced to step in immediately”; “decisions were taken urgently to close operating theatres, [and to] suspend unsafe ‘out of hours’ services for critically ill patients”. In the print version, this piece had also been presented alongside the findings related to the individual trusts and had clearly identified the number of “excess deaths” attributed to each one. In this instance, the Commission was satisfied that the newspaper had not given the significantly misleading impression that the Keogh investigation had examined the causes of individual deaths. The newspaper had provided adequate statistical context for its assertion regarding the numbers of “needless” deaths and therefore the basis for the newspaper’s interpretation of the relationship between mortality statistics and the level of care provided by the 14 NHS hospital trusts had been clear. As such, no correction was required and this piece did not raise a breach of the Code.</p>
</blockquote>
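<p>The distinction the Commission draws, between deaths "over and above what would have been expected" and "needless" deaths, comes down to simple arithmetic. The sketch below is only an illustration: the function names and the figures are hypothetical, and the expected number of deaths is taken as given, whereas in practice it comes from a risk-adjustment model. Nothing in the calculation looks at the cause of any individual death.</p>

```python
def hsmr(observed_deaths, expected_deaths):
    """Standardised mortality ratio on the usual scale where 100 means 'as expected'."""
    return 100 * observed_deaths / expected_deaths


def excess_deaths(observed_deaths, expected_deaths):
    """Observed minus expected deaths: a statistical difference, not a count of avoidable deaths."""
    return observed_deaths - expected_deaths


# Hypothetical trust: 1,100 deaths observed where the model expected 1,000.
observed = 1100
expected = 1000

ratio = hsmr(observed, expected)            # 110.0, i.e. 10% above expectation
excess = excess_deaths(observed, expected)  # 100 more deaths than expected

print(f"HSMR = {ratio:.0f}, excess deaths = {excess}")
```

<p>An excess of 100 here says only that the total was higher than the model predicted; it cannot say which deaths, if any, were avoidable.</p>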
</div></div></div><div class="field field-name-upload field-type-file field-label-hidden"><div class="field-items"><div class="field-item even"><table class="sticky-enabled">
<thead><tr><th>Attachment</th><th>Size</th> </tr></thead>
<tbody>
<tr class="odd"><td><span class="file"><img class="file-icon" alt="" title="application/pdf" src="/modules/file/icons/application-pdf.png" /> <a href="http://understandinguncertainty.org/sites/understandinguncertainty.org/files/2013bmj-needless.pdf" type="application/pdf; length=207688">2013bmj-needless.pdf</a></span></td><td>202.82 KB</td> </tr>
</tbody>
</table>
</div></div></div>
Mon, 04 Nov 2013 18:32:53 +0000
david
7254 at http://understandinguncertainty.org
http://understandinguncertainty.org/press-complaints-commission-decide-13000-needless-deaths-story-was-inaccurate#comments
New content for GCSE Maths announced
http://understandinguncertainty.org/new-content-gcse-maths-announced
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>Following the consultation <a href="http://understandinguncertainty.org/probability-and-stats-gcse-maths">discussed previously on this blog</a>, the Department for Education has announced the <a href="https://www.gov.uk/government/publications/gcse-mathematics-subject-content-and-assessment-objectives">revised content for GCSE Mathematics</a>.</p>
<p>Compared to the current content, the most notable changes are (a) separation of probability and statistics, (b) removal of the data-cycle, (c) increased material.</p>
<p>The proposed content for probability is as follows:</p>
<blockquote><p>
<strong>Probability</strong> </p>
<ol>
<li>record describe and analyse the frequency of outcomes of probability experiments using tables and frequency trees </li>
<li>apply ideas of randomness, fairness and equally likely events to calculate expected outcomes of multiple future experiments </li>
<li>relate relative expected frequencies to theoretical probability, using appropriate language and the 0 - 1 probability scale </li>
<li> apply the property that the probabilities of an exhaustive set of outcomes sum to one; apply the property that the probabilities of an exhaustive set of mutually exclusive events sum to one </li>
<li> understand that empirical unbiased samples tend towards theoretical probability distributions, with increasing sample size </li>
<li> enumerate sets and combinations of sets systematically, using tables, grids, Venn diagrams and tree diagrams </li>
<li> construct theoretical possibility spaces for single and combined experiments with equally likely outcomes and use these to calculate theoretical probabilities </li>
<li> calculate the probability of independent and dependent combined events, including using tree diagrams and other representations, and know the underlying assumptions </li>
<li> calculate and interpret conditional probabilities through representation using expected frequencies with two-way tables, tree diagrams and Venn diagrams. </li>
</ol></blockquote>
<p>From my personal perspective, it's good to see reference to '<em>frequency trees</em>', '<em>expected outcomes</em>' and '<em>expected frequencies</em>', since hopefully this will encourage the teaching of probability through expected frequencies. It's a shame that two <a href="http://understandinguncertainty.org/probability-and-stats-gcse-maths">suggestions in the consultation</a> were dropped: '<em>interpret risk through assigning values to outcomes (e.g. games, insurance)</em>' and '<em>calculate the expected outcome of a decision and relate to long-run average outcomes</em>'. But you can't have everything.</p>
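<p>To show what teaching probability through expected frequencies can look like, here is a minimal sketch of a frequency tree in Python. The function name and all the numbers are hypothetical, chosen only for illustration: an imagined group of 1,000 people, 1% of whom have some condition, with a test that is positive for 90% of those who have it and 10% of those who do not.</p>

```python
from fractions import Fraction


def expected_frequency_tree(population, p_condition, p_pos_if_condition, p_pos_if_not):
    """Turn probabilities into expected counts for an imagined population."""
    with_condition = population * p_condition
    without_condition = population - with_condition
    return {
        "with_condition": with_condition,
        "without_condition": without_condition,
        "true_positives": with_condition * p_pos_if_condition,
        "false_positives": without_condition * p_pos_if_not,
    }


# Hypothetical figures, not taken from any real test.
tree = expected_frequency_tree(
    population=1000,
    p_condition=Fraction(1, 100),
    p_pos_if_condition=Fraction(9, 10),
    p_pos_if_not=Fraction(1, 10),
)

# Conditional probability read straight off the expected counts:
# of everyone expected to test positive, what fraction has the condition?
all_positives = tree["true_positives"] + tree["false_positives"]
p_condition_given_positive = tree["true_positives"] / all_positives

print(tree["true_positives"], "true positives out of", all_positives, "positives")
print("P(condition | positive) =", p_condition_given_positive)
```

<p>Out of 1,000 people we expect 9 true positives and 99 false positives, so only 9 in 108 (1 in 12) of those testing positive have the condition. The conditional probability comes from counting, without needing the formula.</p>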
<p>For statistics it's </p>
<blockquote><p>
<strong>Statistics</strong> </p>
<ol>
<li>infer properties of populations or distributions from a sample, whilst knowing the limitations of sampling </li>
<li>interpret and construct tables, charts and diagrams, including frequency tables, bar charts, pie charts and pictograms for categorical data, vertical line charts for ungrouped discrete numerical data, tables and line graphs for time series data and know their appropriate use </li>
<li> construct and interpret diagrams for grouped discrete data and continuous data, i.e. histograms with equal and unequal class intervals and cumulative frequency graphs, and know their appropriate use</li>
<li> interpret, analyse and compare the distributions of data sets from univariate empirical distributions through:
<ul><li>appropriate graphical representation involving discrete, continuous and grouped data, including box plots </li>
<li>appropriate measures of central tendency (median, mean, mode and modal class) and spread (range, including consideration of outliers, quartiles and inter-quartile range) </li>
</ul></li>
<li> apply statistics to describe a population </li>
<li> use and interpret scatter graphs of bivariate data; recognise correlation and know that it does not indicate causation; draw estimated lines of best fit; make predictions; interpolate and extrapolate apparent trends whilst knowing the dangers of so doing</li>
</ol></blockquote>
<p>Compared to the consultation, box-plots and unequal-interval histograms have gone in, and fitting a straight line has come out. </p>
</div></div></div>
Sat, 02 Nov 2013 11:32:52 +0000
david
7253 at http://understandinguncertainty.org
http://understandinguncertainty.org/new-content-gcse-maths-announced#comments
Probability and stats feature strongly in 'Core maths' proposals for 16-18 year olds
http://understandinguncertainty.org/probability-and-stats-feature-strongly-core-maths-proposals-16-18-year-olds
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>The government is pushing ahead with proposals for a maths qualification to be taken by 16-18 year-olds who got at least a grade C in Maths GCSE but are not doing maths A level.</p>
<p><a href="https://www.gov.uk/government/news/new-maths-qualifications-to-boost-numbers-studying-maths-to-age-18">Further details</a> were released on October 8th by the Department for Education, coinciding with the release of a report by the <a href="http://www.acme-uk.org/news/news-items-repository/2013/10/expert-panel-presents-guidelines-for-new-core-mathematics-qualifications-(2)">Advisory Committee on Mathematics Education (ACME)</a> from its 'expert panel on core mathematics'.</p>
<p>This <a href="http://www.acme-uk.org/media/13699/final%2007october2013,%20expert%20panel%20on%20core%20mathematics%20report.pdf">report</a> includes the 'indicative content' contained in the table below:</p>
<p><img src="/sites/understandinguncertainty.org/files/core-content.png" width="627" height="429" alt="core-content.png" /></p>
<p>The importance of probability and statistics is clear. Notable aspects include a focus on rough estimates, absolute and relative risk, natural frequencies, expectations, interpreting risk statements and critiquing quantitative evidence. In fact, just what we try to cover on this site!</p>
<h3>Statement of interest</h3>
<p>I am on the advisory board of the <a href="http://mei.org.uk/files/pdf/Mathematical%20_Problem_Solving_curriculum_press_release_311012.pdf">MEI project</a> to develop a problem-solving curriculum and materials for this group of students.</p>
</div></div></div>
Mon, 14 Oct 2013 14:14:23 +0000
david
7236 at http://understandinguncertainty.org
http://understandinguncertainty.org/probability-and-stats-feature-strongly-core-maths-proposals-16-18-year-olds#comments
September 19th is Huntrodds day!
http://understandinguncertainty.org/september-19th-huntrodds-day
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>When on holiday at Whitby we took a photo of this extraordinary memorial to Mr and Mrs Huntrodds.</p>
<p>As you can read, they were both born on September 19th 1600, married on September 19th, had 12 children and then both died within 5 hours of each other on their joint 80th birthday on September 19th 1680. Now that's an impressive 'coincidence' - if it can be considered that. After all, they presumably chose to marry, and when to marry, so the really odd thing is when they died. But was there plague in Whitby in 1680? Did they have an accident? They must have been local characters, having the same birthday and being so old.</p>
<p><img src="/sites/understandinguncertainty.org/files/huntrodds_0.jpeg" width="640" height="480" alt="huntrodds_0.jpeg" /></p>
<p>A modern equivalent is the wonderful couple Joyce and Ron Pulsford who <a href="http://www.littlehamptongazette.co.uk/news/top-stories/latest/it-s-lucky-eight-for-pagham-couple-1-1492783">were both 80 on 08.08.08</a>. But they survived their birthday.</p>
<p>Of course this brings up the old question of whether people are more likely to die on their birthday, <a href="http://www.bbc.co.uk/news/world-18626157">which I have previously queried</a>. Hugh Aldersey-Williams recently pointed out this quote to me from the "17th century physician, philosopher, writer and mythbuster Sir Thomas Browne", who in his <a href="http://ebooks.adelaide.edu.au/b/browne/thomas/friend/">'Letter to a Friend' said</a>:</p>
<blockquote><p>Nothing is more common with Infants than to dye on the day of their Nativity, to behold the worldly Hours and but the Fractions thereof; and even to perish before their Nativity in the hidden World of the Womb, and before their good Angel is conceived to undertake them. But in Persons who out-live many Years, and when there are no less than three hundred sixty five days to determine their Lives in every Year; that the first day should make the last, that the Tail of the Snake should return into its Mouth precisely at that time, and they should wind up upon the day of their Nativity, is indeed a remarkable Coincidence, which tho Astrology hath taken witty pains to salve, yet hath it been very wary in making Predictions of it.</p></blockquote>
<p>Note the alchemical references to the <a href="http://en.wikipedia.org/wiki/Ouroboros">Ouroboros</a>. </p>
<p>So maybe the preponderance of deaths on birthdays is simply due to registrations of babies who die soon after birth? But even though Sir Thomas thought it a 'remarkable Coincidence' if an adult did die on their birthday, this is <a href="http://en.wikipedia.org/wiki/Thomas_Browne">exactly what he did</a> on 19th October 1682, his 77th birthday. And just 2 years after the Huntrodds died.</p>
</div></div></div>
Sat, 21 Sep 2013 17:01:36 +0000
david
7210 at http://understandinguncertainty.org
http://understandinguncertainty.org/september-19th-huntrodds-day#comments
Probability and stats in GCSE Maths
http://understandinguncertainty.org/probability-and-stats-gcse-maths
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>The current consultation on <a href="https://www.education.gov.uk/consultations/index.cfm?action=consultationDetails&consultationId=1911&external=no&menu=1">GCSE subject content and assessment objectives</a> for Mathematics GCSE features major changes for probability and statistics. </p>
<p>I encourage everyone with an interest to respond (before 20th August): here is my personal take on the topic.</p>
<p>The proposals are as follows:</p>
<blockquote><h3>Probability</h3>
<ul>
<li>record and describe the frequency of outcomes of probability experiments using tables and frequency trees </li>
<li>apply ideas of randomness, fairness and equally likely events to calculate expected outcomes of multiple future experiments </li>
<li>relate relative expected frequencies to theoretical probability, using appropriate language and the 0-1 scale </li>
<li>apply the property that the probabilities of an exhaustive set of mutually exclusive outcomes sum to one </li>
<li>enumerate sets and combinations of sets systematically, using tables, grids, tree diagrams and Venn diagrams </li>
<li>construct theoretical possibility spaces for single and combined events with equally likely and mutually exclusive outcomes and use these to calculate theoretical probabilities </li>
<li>calculate the probability of independent and dependent combined events, including tree diagrams and other representations and know the underlying assumptions </li>
<li>calculate and interpret conditional probabilities through representation using two-way tables, tree diagrams, Venn diagrams and by using the formula </li>
<li>understand that empirical samples tend towards theoretical probability distributions, with increasing sample size and with lack of bias </li>
<li>interpret risk through assigning values to outcomes (e.g. games, insurance) </li>
<li>calculate the expected outcome of a decision and relate to long-run average outcomes. </li>
</ul><h3>Statistics</h3>
<ul>
<li>apply statistics to describe a population or a large data set, inferring properties of populations or distributions from a sample, whilst knowing the limitations of sampling </li>
<li>construct and interpret appropriate charts and diagrams, including bar charts, pie charts and pictograms for categorical data, and vertical line charts for ungrouped discrete numerical data </li>
<li>construct and interpret diagrams for grouped discrete data and continuous data, i.e. histograms with equal class intervals and cumulative frequency graphs </li>
<li>interpret, analyse and compare univariate empirical distributions through:
<ul>
<li>appropriate graphical representation involving discrete, continuous and grouped data </li>
<li>appropriate measures of central tendency, spread and cumulative frequency (median, mean, range, quartiles and inter-quartile range, mode and modal class) </li>
</ul></li>
<li>describe relationships in bivariate data: sketch trend lines through scatter plots; calculate lines of best fit; make predictions; interpolate and extrapolate trends.
</li>
</ul></blockquote>
<p>In addition, it is proposed to provide the following in the formulae sheet:</p>
<blockquote><h3>Probability</h3>
<p>Where $P(A)$ is the probability of outcome $A$ and $P(B)$ is the probability of outcome $B$:<br />
$$ P (A \hbox{ or } B) = P(A )+ P(B ) - P(A \hbox{ and } B )$$<br />
$$P(A \hbox{ and } B ) = P(A \hbox{ given } B ) P(B )$$
</p></blockquote>
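<p>As a quick check of how these two formulae behave, the sketch below enumerates the outcomes of a single fair die. The choice of events is an illustrative assumption, not part of the proposals: $A$ is "the roll is even" and $B$ is "the roll is greater than 3".</p>

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die.
outcomes = set(range(1, 7))

# Illustrative events (assumptions for this example only).
A = {x for x in outcomes if x % 2 == 0}  # even: {2, 4, 6}
B = {x for x in outcomes if x > 3}       # greater than 3: {4, 5, 6}


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))


def prob_given(event, condition):
    """P(event given condition), by counting within the condition."""
    return Fraction(len(event & condition), len(condition))


# P(A or B) = P(A) + P(B) - P(A and B)
lhs_or = prob(A | B)
rhs_or = prob(A) + prob(B) - prob(A & B)

# P(A and B) = P(A given B) P(B)
lhs_and = prob(A & B)
rhs_and = prob_given(A, B) * prob(B)

print("P(A or B)  =", lhs_or, "=", rhs_or)
print("P(A and B) =", lhs_and, "=", rhs_and)
```

<p>Both sides agree: $P(A \hbox{ or } B) = 2/3$ and $P(A \hbox{ and } B) = 1/3$. Counting outcomes directly gives the same answers as the formulae.</p>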
<p>Compared to the current curriculum (shown at the bottom of this blog), the new proposals:</p>
<ul><li>Split probability and statistics</li>
<li>In probability
<ul><li>Emphasise multiple representations</li>
<li>Include additional attention to conditional probabilities</li>
<li>Include risk and expectation</li>
</ul></li>
<li>In statistics
<ul><li>Drop histograms with unequal intervals</li>
<li>Drop the ‘data cycle’ (although they mention ‘limitations of sampling’)</li>
<li>Include calculating a line of best fit</li>
</ul></li>
</ul><p>Perhaps the most controversial element is the non-inclusion of the ‘data-cycle’ (or 'statistics cycle') of problem analysis, data collection, data presentation and data analysis. There has been a long argument within the statistics community about whether this belongs in GCSE Mathematics: the 2004 Smith Inquiry into post-14 maths education, <a href="http://www.mathsinquiry.org.uk/report/">Making Mathematics Count</a>, recommended:</p>
<blockquote><p>The Inquiry recommends that there be a radical re-look at this issue and that much of the teaching and learning of Statistics and Data Handling would be better removed from the mathematics timetable and integrated with the teaching and learning of other disciplines (eg biology or geography). The time restored to the mathematics timetable should be used for acquiring greater mastery of core mathematical concepts and operations.</p></blockquote>
<p>Indeed, the proposed <a href="https://www.gov.uk/government/consultations/gcse-subject-content-and-assessment-objectives">Science GCSE subject content and assessment objectives</a> now include:</p>
<blockquote><ul><li>apply the cycle of collecting, presenting and analysing data, including:
<ul><li>present observations and data using appropriate methods</li>
<li>carry out and represent mathematical and statistical analysis</li>
<li>represent random distributions of results and estimations of uncertainty</li>
<li>interpret observations and data, including identifying patterns and trends, make inferences and draw conclusions</li>
<li>present reasoned explanations including of data in relation to hypotheses</li>
<li>evaluate data</li>
<li>use an appropriate number of significant figures in calculations</li>
</ul></li>
<li>communicate the scientific rationale for investigations, methods used, findings and reasoned conclusions through written and electronic reports and presentations.</li>
</ul></blockquote>
<p>However, the Royal Statistical Society's recently commissioned <a href="http://www.rss.org.uk/site/cms/contentCategoryView.asp?category=86">Porkess Report</a> said:</p>
<blockquote><ul>
<li><strong>Recommendation 5: </strong>School and college mathematics departments should ensure they have the expertise to be the authorities on statistics within their institutions. Mathematics departments should be centres of excellence for statistics, providing guidance on correct usage and good practice.</li>
<li><strong>Recommendation 6:</strong> Under present conditions, statistics is best placed in the mathematics curriculum.</li>
</ul></blockquote>
<p>Essentially the view is that if this vital element were not in Mathematics, it would either not be taught at all or be taught badly.</p>
<p>This is tricky. My personal view is that the ‘data cycle’ is absolutely vital, but that it is better placed within understanding of the ‘scientific method’ than within core mathematics. I feel that GCSE Mathematics should provide the tools for analysis that can be used in empirical investigations, but techniques for carrying out those experiments should not be part of the assessment criteria. Obviously there is opportunity for cross-subject activity, say with Geography or Science, featuring experimental design, data-collection, analysis, presentation and interpretation of real-world numerical evidence: it is inevitably tempting to look to a different type of qualification that took a broader cross-disciplinary perspective, but we appear stuck with the rigid subject demarcations of GCSEs.</p>
<p>At A-level the link between probability and formal statistical inference can be revealed in all its glory. And if a post-16, non-A-level maths qualification is developed, then this could also include real-world investigation into the appropriate interpretation of numerical evidence.</p>
<h3>The current specification</h3>
<p>This is given by the Ofqual <a href="http://www2.ofqual.gov.uk/downloads/category/192-gcse-subject-criteria">GCSE Subject Criteria for Mathematics</a>:</p>
<blockquote><h3>Statistics and probability<br /></h3>
<ul>
<li>understand and use statistical problem solving process/handling data cycle; </li>
<li>identify possible sources of bias; </li>
<li>design an experiment or survey; </li>
<li>design data-collection sheets, distinguishing between different types of data; </li>
<li>extract data from printed tables and lists; </li>
<li>design and use two-way tables for discrete and grouped data; </li>
<li>produce charts and diagrams for various data types; </li>
<li>calculate median, mean, range, quartiles and inter-quartile range, mode and modal class; </li>
<li>interpret a wide range of graphs and diagrams and draw conclusions; </li>
<li>look at data to find patterns and exceptions; </li>
<li>recognise correlation and draw and/or use lines of best fit by eye, understanding what these represent; </li>
<li>compare distributions and make inferences; </li>
<li>understand and use the vocabulary of probability and the probability scale; </li>
<li>understand and use estimates or measures of probability from theoretical models (including equally likely outcomes), or from relative frequency; </li>
<li>list all outcomes for single events, and for two successive events, in a systematic way and derive related probabilities; </li>
<li>identify different mutually exclusive outcomes and know that the sum of the probabilities of all these outcomes is 1; </li>
<li>know when to add or multiply two probabilities: if A and B are mutually exclusive, then the probability of A or B occurring is P(A) + P(B), whereas if A and B are independent events, the probability of A and B occurring is P(A) . P(B); </li>
<li>use tree diagrams to represent outcomes of compound events, recognising when events are independent; </li>
<li>compare experimental data and theoretical probabilities; </li>
<li>understand that if they repeat an experiment, they may – and usually will – get different outcomes, and that increasing sample size generally leads to better estimates of probability and population characteristics. </li>
</ul></blockquote>
<h3>Conflict of Interest<br /></h3>
<p>I am one of the <a href="https://media.education.gov.uk/assets/files/pdf/l/lists%20of%20commentators%20-%20final.pdf">many people consulted</a> by the Department for Education.</p>
</div></div></div>
Sat, 03 Aug 2013 16:23:43 +0000 david 7159 at http://understandinguncertainty.org http://understandinguncertainty.org/probability-and-stats-gcse-maths#comments
Fatality risk on Boris-bikes?
http://understandinguncertainty.org/fatality-risk-boris-bikes
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>I was very saddened by the <a href="http://www.bbc.co.uk/news/uk-england-london-23207691">death on Friday of a Boris-bike rider</a> in Whitechapel High Street, particularly as I am a frequent and enthusiastic user of the scheme. But as a statistician, I also immediately wondered how surprised I should be that this was the first fatality on the bikes. My conclusion, using a very rapid and crude analysis, is that it does not suggest that Boris-bikes carry a higher risk than average cycling, and if anything we have been fortunate that it has taken this long for the first fatality. Of course this does not lessen the tragedy of the event.</p>
<p><a href="http://www.tfl.gov.uk/roadusers/cycling/20389.aspx ">Transport for London</a> report that between December 2010 and 31st May 2013 there were around 22,000,000 Barclays Cycle Hire (the official name) trips in London. There were 750,000 trips in May, so let’s assume that by July 7th there were around 23,000,000 trips. These journeys were an average of 20 minutes during the week and 28 minutes at the weekend, so conservatively we could assume 1.5 miles for each trip, giving a total of at least 34,000,000 miles cycled on Boris bikes since the opening of the scheme to non-members.</p>
<p>The Department for Transport reports that in 2011 there were 22 cyclist deaths per billion km (620,000,000 miles), which works out as one cycling fatality expected every 620,000,000/22 = 28,000,000 miles [see page 234 of <a href="https://www.gov.uk/government/publications/reported-road-casualties-great-britain-annual-report-2011 ">this report</a>, eventually found through the shambolic chaos of the government statistics web-links]. Of course Boris-bike users are not average: they are probably at somewhat higher risk because they ride in London and include inexperienced tourists, but this is offset by their not being very old or very young, and by their riding extremely heavy and slow bikes. They also rarely wear cycle helmets, but I am not getting into that <a href="http://www.bmj.com/content/346/bmj.f3817?ijkey=I5vHBog6FhaaLzX&keytype=ref ">tricky area</a>.</p>
<p>If we very crudely assume these factors cancel out and Boris-bike trips are of average risk, then to have a fatal accident after 34,000,000 miles is, unfortunately, not surprising. In fact, very roughly, there is perhaps a less than 30% chance that it would have taken this long.</p>
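<p>The ‘less than 30%’ figure can be reproduced with a minimal sketch, assuming fatalities arrive as a Poisson process at the national average rate per mile. That model is an assumption made here for illustration, and the variable names are hypothetical; the input figures are the ones quoted above.</p>

```python
import math

# Figures quoted in the post
trips = 23_000_000            # estimated trips up to 7 July 2013
miles_per_trip = 1.5          # conservative assumed distance per trip
deaths_per_billion_km = 22    # 2011 cyclist deaths per billion km
miles_per_billion_km = 620_000_000

miles_cycled = trips * miles_per_trip
miles_per_fatality = miles_per_billion_km / deaths_per_billion_km

# Assumption: fatalities follow a Poisson process at the average rate,
# so the chance of no fatality so far is exp(-expected number).
expected_fatalities = miles_cycled / miles_per_fatality
p_no_fatality = math.exp(-expected_fatalities)

print(f"Miles cycled: {miles_cycled:,.0f}")
print(f"Miles per expected fatality: {miles_per_fatality:,.0f}")
print(f"Expected fatalities so far: {expected_fatalities:.2f}")
print(f"Chance of none so far: {p_no_fatality:.1%}")
```

<p>Under these assumptions just over one fatality would have been expected by now, so the chance of having seen none comes out a little below 30%.</p>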
<p>So I am not very surprised to hear of this tragic accident, but do feel shocked that it happened on a so-called cycle ‘superhighway’. My personal opinion, as someone who has negotiated that particular stretch of road with some trepidation, is that far more needs to be done to make cycle-friendly and protective routes in London.</p>
</div></div></div>
Sun, 07 Jul 2013 08:47:55 +0000 david 7130 at http://understandinguncertainty.org http://understandinguncertainty.org/fatality-risk-boris-bikes#comments
Speed cameras, regression-to-the-mean, and the Daily Mail (again)
http://understandinguncertainty.org/speed-cameras-regression-mean-and-daily-mail-again
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>It was interesting to hear ‘regression-to-the-mean’ being discussed on the Today programme this morning, even if the quality of the debate wasn’t great. The issue was the effectiveness of speed cameras, which tend to get installed after a spate of accidents. Since bad luck does not last, accidents tend to fall after such a ‘blip’, and this fall is generally attributed to the speed camera, whereas it would have happened anyway: this is what is meant by ‘regression-to-the-mean’.</p>
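<p>A small simulation can make regression-to-the-mean concrete. This is only an illustrative sketch, not the RAC Foundation's method: it assumes every site has the same true accident rate, that yearly counts are Poisson, and that ‘cameras’ go to any site with a bad year. The rate, the threshold and the function names are all invented for the example.</p>

```python
import math
import random


def poisson_draw(rng, mean):
    """Draw one Poisson-distributed count using Knuth's multiplication method."""
    limit = math.exp(-mean)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= limit:
            return count
        count += 1


def simulate_sites(n_sites=20000, rate=5.0, threshold=9, seed=1):
    """Return mean accidents before and after for sites selected after a bad year.

    Every site has the same underlying rate and nothing changes between the
    two years, so any fall at the selected sites is pure regression-to-the-mean.
    """
    rng = random.Random(seed)
    before_selected, after_selected = [], []
    for _ in range(n_sites):
        before = poisson_draw(rng, rate)
        after = poisson_draw(rng, rate)
        if before >= threshold:  # a 'camera' is installed after a bad year
            before_selected.append(before)
            after_selected.append(after)
    n_selected = len(before_selected)
    return (sum(before_selected) / n_selected,
            sum(after_selected) / n_selected,
            n_selected)


mean_before, mean_after, n_selected = simulate_sites()
print(f"Sites selected: {n_selected}")
print(f"Mean accidents in the selection year: {mean_before:.2f}")
print(f"Mean accidents in the following year: {mean_after:.2f}")
```

<p>The selected sites show a large apparent ‘improvement’ even though the simulated cameras do nothing at all, which is why the pre-installation years cannot simply be used as the baseline.</p>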
<p>The <a href="http://www.racfoundation.org/assets/rac_foundation/content/downloadables/speed_camera_data-allsop-may2013.pdf ">report from the RAC Foundation</a> tried to deal with this by essentially ignoring the 3 years before the camera was installed, and so comparing the post-installation accidents with those more than 3 years beforehand, while simultaneously allowing for overall changes in accidents over time. Unfortunately the report is not very clearly written: it is more concerned with how to approach and analyse the (limited) data than with providing definitive results. Although they helpfully provide the equations for the models being fitted, there is no executive summary and you have to search quite hard to find the crucial number flagged up for the media: the estimated 27% reduction in accidents causing fatal or severe injuries (page 32).</p>
<p>I thought the analysis seemed quite reasonable until I noticed that on page 3 it defines a baseline year as</p>
<blockquote><p>‘more than three full years before the camera was established <em>or the year during which it was established</em>’</p></blockquote>
<p>It seems very strange to include the transitional year as a baseline – surely it could just be excluded? Later on the report says that if the start-month was January or December, the year in which the camera was installed was treated as a ‘camera’ or ‘within 3-year pre-camera’ year respectively, but I am suspicious that for the remaining 10 months this could mean that some randomly high accident rates could still be included in the baseline.</p>
<p>However, what is really shocking is the grossly misleading coverage by the Daily Mail, with the headline,</p>
<blockquote><p><em><br /><a href="http://www.dailymail.co.uk/news/article-2337208/Speed-cameras-increase-risk-fatal-crashes-New-RAC-investigation-raises-doubts-usefulness.html ">Speed cameras 'increase risk of serious or fatal crashes': New RAC investigation raises doubts over their usefulness</a></em>.</p></blockquote>
<p>This is a blatant misrepresentation of the report and its findings, focusing solely on the 21 cameras where an increase was estimated, and ignoring the 530 where it wasn’t, as clearly shown in the table the Daily Mail so helpfully reproduce! They should be ashamed of themselves.</p>
</div></div></div>
Fri, 07 Jun 2013 10:55:08 +0000 david 6948 at http://understandinguncertainty.org http://understandinguncertainty.org/speed-cameras-regression-mean-and-daily-mail-again#comments
How can 2% become 20%?
http://understandinguncertainty.org/how-can-2-become-20
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>The <a href="http://www.dailymail.co.uk/health/article-2335397/Statins-weaken-muscles-joints-Cholesterol-drug-raises-risk-problems-20-cent.html ">Daily Mail headline</a> below is unequivocal – statins cause a 20% increase in muscle problems.</p>
<p> <img src="/sites/understandinguncertainty.org/files/statin-muscles-mail.jpg" width="651" height="372" alt="statin-muscles-mail.jpg" /></p>
<p>Unfortunately, the ‘20%’ is factually incorrect - the study on which this story is based claims that taking statins increased the risk of muscle problems from 85% to 87%. And even that claim is highly dubious. How can the Daily Mail get it so wrong? </p>
<p>The ‘20%’ is a basic statistical error promoted by a misleading <a href="http://archinte.jamanetwork.com/article.aspx?articleid=1691918 ">abstract</a> and <a href="http://media.jamanetwork.com/news-item/musculoskeletal-conditions-injuries-may-be-associated-with-statin-use/ ">press release</a> from JAMA Internal Medicine – associated with the Journal of the American Medical Association, a (supposedly) reputable source. The authors estimated an ‘odds ratio’ of 1.19 for musculoskeletal problems, which the Daily Mail interpreted as a 20% increased risk. I’m afraid we need to get a bit technical now. An odds ratio is a standard measure that statisticians and epidemiologists (yes, them again) use to measure an association between an exposure (here statins) and an event (muscle problems). It is defined as the odds of the event given the exposure, divided by the odds without the exposure. The crucial thing is the use of odds, not risk, where odds is the probability of the event divided by the probability of the event not occurring (why statisticians should use this bizarre measure is another story – see for example this <a href="http://en.wikipedia.org/wiki/Odds_ratio">Wikipedia description</a>).</p>
<p>Table 4 of the paper (not reported in the abstract) reports risks with and without statins of 87% vs 85%, which translate to odds of 0.87/0.13 = 6.7 and 0.85/0.15 = 5.7. The odds ratio is therefore 6.7/5.7 = 1.18 (their figure of 1.19 involved some adjustment for other factors). Alternatively, the risk ratio was 0.87/0.85 = 1.02, a 2% relative change, while the difference in absolute risks was 0.87 – 0.85 = 0.02, or 2 percentage points. The <a href="http://www.abpi.org.uk/our-work/library/guidelines/Pages/default.aspx">Code of Practice for the British Pharmaceutical Industry</a> has banned the reporting of relative risk without also giving the change in absolute risk. Why this is still considered acceptable within epidemiological papers is beyond me.</p>
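<p>The three measures are easy to confuse, so here is a short sketch that computes each from the two unadjusted risks in Table 4. The function and variable names are illustrative only, and the calculation ignores the adjustment for other factors that produced the paper's 1.19.</p>

```python
def odds(probability):
    """Convert a probability into odds: p / (1 - p)."""
    return probability / (1 - probability)


risk_with_statins = 0.87
risk_without_statins = 0.85

odds_ratio = odds(risk_with_statins) / odds(risk_without_statins)
risk_ratio = risk_with_statins / risk_without_statins
risk_difference = risk_with_statins - risk_without_statins

print(f"Odds with statins:    {odds(risk_with_statins):.1f}")
print(f"Odds without statins: {odds(risk_without_statins):.1f}")
print(f"Odds ratio:           {odds_ratio:.2f}")
print(f"Risk ratio:           {risk_ratio:.2f}")
print(f"Absolute difference:  {risk_difference:.0%}")
```

<p>When the event is this common the odds ratio is far from the risk ratio, which is exactly how a 2% change can be reported as ‘20%’.</p>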
<p>And such a tiny difference, in a very common problem, could be due to all sorts of confounding factors that were not allowed for. In particular, people on statins are likely to visit their doctor more, who may then investigate other symptoms, as the authors admit in their discussion. So they have not shown this difference was due to statins.</p>
<p>It is difficult to know who is most to blame here – the authors for producing a misleading abstract without the key information, JAMA Internal Medicine, or the Daily Mail. Personally, I feel that JAMA Internal Medicine is most responsible, for not properly refereeing the paper, and producing a press release that invited misunderstanding and distortion. </p>
</div></div></div>
Tue, 04 Jun 2013 13:57:16 +0000 david 6943 at http://understandinguncertainty.org http://understandinguncertainty.org/how-can-2-become-20#comments
Court of Appeal bans Bayesian probability (and Sherlock Holmes)
http://understandinguncertainty.org/court-appeal-bans-bayesian-probability-and-sherlock-holmes
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><blockquote><p>..when you have eliminated the impossible, whatever remains, however improbable, must be the truth<br />
(Sherlock Holmes in The Sign of the Four, ch. 6, 1890)</p></blockquote>
<p>In a <a href="http://www.bailii.org/ew/cases/EWCA/Civ/2013/15.html">recent judgement</a> the English Court of Appeal has not only rejected the Sherlock Holmes doctrine shown above, but also denied that probability can be used as an expression of uncertainty for events that have either happened or not.</p>
<p>The case was a civil dispute about the cause of a fire, and concerned an appeal against a decision in the High Court by Judge Edwards-Stuart. Edwards-Stuart had essentially concluded that the fire had been started by a discarded cigarette, even though this seemed an unlikely event in itself, because the other two explanations were even more implausible. The Court of Appeal rejected this approach, although it still supported the overall judgement and disallowed the appeal. Commentaries on this case have appeared <a href="http://www.lexology.com/library/detail.aspx?g=471d7904-20d2-4fdb-a061-8d49b10de60d&l=7HVPC65">here</a> and <a href="http://www.12kbw.co.uk/cases/commentary/id/161/">here</a>.</p>
<p>But it's the quotations from the judgement that are so interesting:</p>
<blockquote><p>Sometimes the "balance of probability" standard is expressed mathematically as "50 + % probability", but this can carry with it a danger of pseudo-mathematics, as the argument in this case demonstrated. When judging whether a case for believing that an event was caused in a particular way is stronger than the case for not so believing, the process is not scientific (although it may obviously include evaluation of scientific evidence) and to express the probability of some event having happened in percentage terms is illusory.
</p></blockquote>
<p>The idea that you can assign probabilities to events that have already occurred, but where we are ignorant of the result, forms the basis for the Bayesian view of probability. Put very broadly, the 'classical' view of probability is in terms of genuine unpredictability about future events, popularly known as 'chance' or 'aleatory uncertainty'. The Bayesian interpretation allows probability also to be used to express our uncertainty due to our ignorance, known as 'epistemic uncertainty', and popularly expressed as betting odds. Of course there are all gradations, from pure chance (think radioactive decay) to processes assumed to be pure chance (lottery draws), to future events whose odds depend on a mixture of genuine unpredictability and ignorance of the facts (whether Oscar Pistorius will be convicted of murder), to pure epistemic uncertainty (whether Oscar Pistorius knowingly shot his girlfriend).</p>
<p>The judges went on to say:</p>
<blockquote><p>The chances of something happening in the future may be expressed in terms of percentage. Epidemiological evidence may enable doctors to say that on average smokers increase their risk of lung cancer by X%. But you cannot properly say that there is a 25 per cent chance that something has happened: Hotson v East Berkshire Health Authority [1987] AC 750. Either it has or it has not.
</p></blockquote>
<p>So according to this judgement, it would apparently not be reasonable in a court to talk about the probability of Kate and William's baby being a girl, since that is already decided as true or false (but see note added below). This seems extraordinary.</p>
<p>Part of the problem may be the judges' use of the word 'chance' to describe epistemic uncertainty about whether something has happened or not - this would be unusual usage now (even though Thomas Bayes used 'chance' in this sense). If they had used the term 'probability' perhaps their quote above would seem more clearly unreasonable. </p>
<p>Anyway, I teach the Bayesian approach to post-graduate students attending my 'Applied Bayesian Statistics' course at Cambridge, and so I must now tell them that the entire philosophy behind their course has been declared illegal in the Court of Appeal. I hope they don't mind.</p>
<p>(Note added 1st March 2013: <a href="http://sports.williamhill.com/bet/en-gb/betting/e/2586242/Name%2dof%2dWilliam%2d%26%2dKate%2ds%2dfirst%2dbaby.html">William Hill </a> are currently offering 1000-1 against <em>Chardonnay</em> as the name of the potential future monarch).</p>
</div></div></div>
Mon, 25 Feb 2013 09:26:09 +0000 david 6817 at http://understandinguncertainty.org http://understandinguncertainty.org/court-appeal-bans-bayesian-probability-and-sherlock-holmes#comments
What's more dangerous - the bute or the burger?
http://understandinguncertainty.org/whats-more-dangerous-bute-or-burger
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>There is reasonable public outrage at possible criminal conspiracies to adulterate meat products with horsemeat, and additional concerns raised about the presence of the anti-inflammatory known as bute.</p>
<p>While not in any way questioning this concern about adulteration with a chemical compound, it is helpful to get a sense of magnitude. When bute was given as a human medicine, it was reported to be associated with a serious adverse reaction in 1 in 30,000 (over a whole course of treatment), but at a dose giving concentrations at least 4,000 times that arising from eating a diet of horse meat - see the excellent information from the <a href="http://www.sciencemediacentre.org/expert-reaction-to-the-continuing-horsemeat-story-and-bute-phenylbutazone/ ">Science Media Centre</a>.</p>
<p>So making all sorts of heroic assumptions about there being a linear-no-threshold response, we might very roughly assign a pro-rata risk of a serious event as 1 in 100,000,000 per burger.</p>
<p>Compare that with the risk from the meat itself. There is good evidence that red meat consumption is associated with an <a href="http://www.nhs.uk/Livewell/Goodfood/Pages/red-meat.aspx">increased risk of bowel cancer</a>, and specifically a <a href="http://archinte.jamanetwork.com/article.aspx?articleid=1134845">large recent study from Harvard</a> associated a daily habit of 80g (3.5 oz) of red meat with an increased all-cause mortality rate of 13% - I recently showed in this <a href="http://www.natap.org/2012/newsUpdates/bmj.e8223.pdf">British Medical Journal paper</a> that this was as if, pro-rata, each portion of red meat was associated with ½ hour loss in life-expectancy, around one millionth of a young adult’s future life.</p>
<p>So my rough guess is that for a burger made out of horse-meat containing bute - or indeed any kind of red meat - the burger itself carries around 100 times the apparent risk of the bute. Even taking into account that the bute reaction would occur more quickly than any harm from the red meat, this is still a notable disparity.</p>
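<p>The back-of-envelope comparison can be laid out as a short sketch. It follows the post's own heroic assumptions (linear-no-threshold scaling, and treating a fraction of life lost as comparable to a probability of a serious event). The 57 years of remaining life for a young adult is an assumption introduced here purely to illustrate the ‘one millionth’ figure.</p>

```python
# Bute: reported serious adverse reaction rate at medicinal doses
medicinal_risk = 1 / 30_000
dose_ratio = 4_000  # medicinal concentration relative to eating horse meat

# Assumption: risk scales linearly with dose, with no threshold
bute_risk = medicinal_risk / dose_ratio

# Red meat: about half an hour of life expectancy per portion
hours_lost_per_portion = 0.5
remaining_years = 57  # assumed remaining life of a young adult (illustrative)
remaining_hours = remaining_years * 365.25 * 24
meat_fraction_of_life = hours_lost_per_portion / remaining_hours

ratio = meat_fraction_of_life / bute_risk

print(f"Bute risk: about 1 in {1 / bute_risk:,.0f}")
print(f"Red meat: about 1 in {1 / meat_fraction_of_life:,.0f} of future life")
print(f"Meat relative to bute: about {ratio:.0f} times")
```

<p>The unrounded figures give 1 in 120,000,000 for the bute and a ratio of roughly 120, consistent with the rounded ‘1 in 100,000,000’ and ‘around 100 times’ quoted above.</p>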
<p>Of course I know very well that people, including myself, feel very differently about risks that are chosen as part of daily life and appear ‘natural’ from those that are imposed by outside (probably criminal) agencies and involve unnatural substances. I fully respect those feelings, but I still believe some perspective is valuable.</p>
</div></div></div>
Fri, 15 Feb 2013 08:28:12 +0000 david 6811 at http://understandinguncertainty.org http://understandinguncertainty.org/whats-more-dangerous-bute-or-burger#comments
Squaring the square, in glass
http://understandinguncertainty.org/squaring-square-glass
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>Here is my latest stained glass effort, seen on a snowy day. </p>
<p><img src="/sites/understandinguncertainty.org/files/trinity-glass2-small.jpg" width="600" height="450" alt="trinity-glass2-small.jpg" /></p>
<p>It is a 'square of squares', where all the constituent squares are of different sizes. Here are the dimensions - </p>
<p><img src="/sites/understandinguncertainty.org/files/sqsqbig.png" width="400" height="400" alt="sqsqbig.png" /></p>
<p>It is copied from the logo of the <a href="https://www.srcf.ucam.org/tms/about-the-tms/the-squared-square/#2">Trinity Mathematical Society</a>, who point out that it is the <em>unique smallest simple squared square (smallest in that it uses the fewest squares, and simple in that no proper subset of the squares of size at least 2 forms a rectangle).</em> It was proved to be the smallest such square by Duijvestijn in 1978, but this was by exhaustive computer search, which seems a bit like cheating.</p>
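<p>As a quick sanity check on the dimensions, the areas of the constituent squares must add up to the area of the whole. The side lengths below are not given in the text of this post; they are assumed from the commonly published description of Duijvestijn's 21-square tiling of a square of side 112, so treat them as an assumption to be checked against the picture.</p>

```python
# Assumed side lengths of the 21 constituent squares (see note above)
sides = [50, 35, 27, 8, 19, 15, 17, 11, 6, 24,
         29, 25, 9, 2, 7, 18, 16, 42, 4, 37, 33]
outer_side = 112

total_area = sum(side * side for side in sides)

print(f"Number of squares: {len(sides)}")
print(f"All sizes different: {len(set(sides)) == len(sides)}")
print(f"Sum of areas: {total_area}")
print(f"Outer area:   {outer_side ** 2}")
```

<p>Matching areas are necessary but not sufficient: the check confirms that the list is consistent with a tiling, not that the squares actually fit together.</p>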
<p>There is a fine <a href="http://en.wikipedia.org/wiki/Squaring_the_square">Wikipedia page</a>, which contains more than you could ever wish to know about squaring-the-square.</p>
<h2>Challenge</h2>
<p>I wanted to only use 4 colours without any square touching another of the same colour, and of course I knew this is possible due to the 4-colour theorem. But I wanted the four large outer squares to be 'white' (in order to increase the Mondrian appeal). It took some effort and trial-and-error to find a 4-colouring with this property. Are there others?</p>
</div></div></div>
Tue, 22 Jan 2013 18:24:28 +0000 david 6789 at http://understandinguncertainty.org http://understandinguncertainty.org/squaring-square-glass#comments
Alcohol in pregnancy and IQ of children
http://understandinguncertainty.org/alcohol-pregnancy-and-iq-children
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>Some of the coverage of yesterday's story about drinking in pregnancy and IQ of children was not entirely accurate. The Times reported that '<em>women who drink even a couple of glasses of wine a week during pregnancy are risking a two-point drop in their child's IQ</em>', and '<em>children whose mothers drank between 1 and 6 units a week - up to three large glasses of wine - had IQs about two points lower</em>' (than those whose mothers did not drink).</p>
<p>But let's look at Table 3 of the paper, which is available <a href="http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0049407">here</a>.</p>
<p><img src="/sites/understandinguncertainty.org/files/alcohol-IQ-table3.jpg" width="700" height="305" alt="alcohol-IQ-table3.jpg" /></p>
<p>Here we see that children of mothers who drank had systematically higher IQs than children of those who didn't! But nothing can be inferred from this (except that the Times was not exactly correct).</p>
<p>This of course is the whole problem of carrying out these studies: there are many 'confounding' factors that are associated both with mothers' drinking habits and children's IQ, and this makes teasing out the underlying relationship very tricky.</p>
<p>This is where the ingenious idea of 'Mendelian Randomisation' comes in. Genes are assumed independent of confounding factors, so it is as if women have been randomly allocated to the genetic groups in the Table, and so they should be balanced for all other factors. The genes were selected as those that regulate uptake of alcohol, and they are seen to be associated with the IQ of offspring among women who drink: babies who may be exposed to greater alcohol have on average lower IQs. Could the genes affect babies in other ways than through uptake of alcohol? This is deemed implausible, as there is no relationship seen in non-drinkers.</p>
<p>This is a very clever and careful study, to be taken very seriously. But it does not allow an estimate of the effect of drinking, and the authors are careful not to give one (even if they appear to have been rather happy to declare public health implications, which seems to be somewhat overstepping their role as epidemiologists).</p>
<p>Very crudely, if we took the lowest group as similar to non-drinkers, then the effect of moderate drinking might be estimated as around 2 points, but nobody would want to put this in an academic publication.</p>
<p>In addition, an important issue is the alcohol quantities. The 'drinkers' include those reporting between 1 and 6 units a week - quite a range - and we can also assume that this is an understatement of true consumption. So what 'moderate' drinking actually means is open to some question.</p>
<p>As usual, the NHS Behind the Headlines site had an <a href="http://www.nhs.uk/news/2012/11November/Pages/Just-one-glass-of-wine-a-week-in-pregnancy-damages-childs-IQ.aspx">excellent discussion</a>.</p>
<p><em>Added later: I have made a few edits in this blog to make it clearer that I am not questioning the study's basic conclusions, just pointing out that some of the coverage misunderstood the (somewhat subtle) findings</em></p>
</div></div></div>
Fri, 16 Nov 2012 17:54:04 +0000 david 6681 at http://understandinguncertainty.org http://understandinguncertainty.org/alcohol-pregnancy-and-iq-children#comments
More lessons from L'Aquila
http://understandinguncertainty.org/more-lessons-laquila
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>The L’Aquila story gets even murkier.</p>
<h2>Scientists' duty</h2>
<p>Additional reports suggest complicity to manipulate public opinion. See, for example, this <a href="http://www.repubblica.it/cronaca/2012/10/25/news/terremoto_aquila_intercettazioni-45259736/?ref=HRER1-1">article in La Repubblica</a> (in Italian) with the headline quote '<em>the truth cannot be said</em>' taken from a tapped telephone call between the head of the Civil Protection Agency and one of the scientists. The article claims that the misleading statement on which the trial hinged - that the many small shocks reduced rather than increased the risk - had already been decided by officials and the scientists were simply part of a 'media operation'. </p>
<p>So, although it sounds a bit obvious, an extra lesson I draw is</p>
<ul><li>Scientific advisors owe a duty to society as a whole, must retain their independence, and should carefully avoid ‘going native’ and becoming complicit in the objectives of the agency that has requested their services.
</li>
</ul><h2>Communicating the chances of low-probability high-impact events<br /></h2>
<p>The trial rests largely on the claim in the press conference that the swarm of small shocks reduced the risks of a large earthquake. Assuming this is not the case, and the risk was in fact increased over the normal levels, it raises the vital issue of communicating about events with a low absolute risk but a high relative risk. The general literature on risk communication advises against the sole use of relative risks, since these are known to give an exaggerated impression of magnitude (twice ‘very small’ is still generally ‘very small’). Thomas Jordan, director of the Southern California Earthquake Center at the University of Southern California, chairs the International Commission on Earthquake Forecasting (ICEF), which wrote <a href="http://www.annalsofgeophysics.eu/index.php/annals/article/view/5350">a report</a> following the L’Aquila quake. He argues strongly that a time-series of probabilities should be provided in public communication – you can listen to Jordan being interviewed by <a href="http://www.radio3.rai.it/dl/radio3/programmi/puntata/ContentItem-e07a4299-9679-48ea-831b-5d460bc43f79.html">Italian public radio</a> (RAI) and the <a href="http://www.bbc.co.uk/programmes/p00zc0d5">BBC World Service</a>. So my next lesson is</p>
<ul><li>When communicating the chances of low-probability high-impact events, provide estimates of absolute risks. However these need to be put in context, preferably by relating to levels of risk at other times.</li>
</ul><p>People deserve to know that the risk has increased, even if it is still low in an absolute sense (as it always will be for earthquakes), so that they can apply their own thresholds for caution.</p>
<h2>Indemnity</h2>
<p>In my previous <a href="http://understandinguncertainty.org/continuing-tragedy-l’aquila">blog</a>, I mentioned the need to acquire indemnity against civil actions. The <a href="http://www.bis.gov.uk/assets/goscience/docs/c/11-1382-code-of-practice-scientific-advisory-committees.pdf">UK Government Chief Scientific Advisor's Code of Practice for Scientific Advice</a> says this should be available to scientific advisors: under "<em>Liabilities and indemnity of members</em>" it says</p>
<blockquote><p>"The Cabinet Office Model Code of Practice for Board Members of Advisory Non-Departmental Public Bodies (page 6) states that: “Legal proceedings by a third party against individual board members of advisory bodies are very exceptional. A board member may be personally liable if he or she makes a fraudulent or negligent statement which results in a loss to a third party; or may commit a breach of confidence under common law or criminal offence under insider dealing legislation, if he or she misuses information gained through their position. However, the Government has indicated that individual board members who have acted honestly, reasonably, in good faith and without negligence will not have to meet out of their own personal resources any personal civil liability which is incurred in execution or purported execution of their board functions. Board members who need further advice should consult the sponsor department.”</p>
<p>This should already be the position for existing advisory NDPBs. For newly established committees and for non-NDPBs, secretariats should liaise with their sponsoring department’s Public Bodies Team or Human Resources Team to ensure that an appropriate indemnity for members is in place."</p></blockquote>
<p>I interpret this as saying that, for a broad range of advisory committees, sponsoring departments should ensure an appropriate indemnity scheme is in place. The code applies very widely:</p>
<blockquote><p>"The Code was developed to apply to advisory committees providing independent scientific advice, regardless of their specific structure and lines of accountability; whether reporting to a Ministerial Department, Non-Ministerial Department or other public body, and whether an advisory NDPB or an expert scientific committee."</p></blockquote>
<h2>Twitter</h2>
<p>I also warned of the dangers of using social media in delicate situations. Subsequently a rather casual tweet of mine found its way onto a <a href="http://edition.cnn.com/2012/10/23/world/europe/italy-quake-scientists-guilty/index.html">CNN News report</a> and into Italian national media, bringing critical comments from Italian colleagues. I should listen to my own advice.</p>
</div></div></div>Sun, 28 Oct 2012 19:18:45 +0000 david 6653 at http://understandinguncertainty.org http://understandinguncertainty.org/more-lessons-laquila#comments
The Continuing Tragedy of L’Aquila
http://understandinguncertainty.org/continuing-tragedy-l%E2%80%99aquila
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p> As in ‘<a href="http://www.thesun.co.uk/sol/homepage/irishsun/irishsunnews/4603979/Boffins-jailed-for-not-predicting-killer-earthquake.html ">Boffins jailed for not predicting earthquake</a>’, the 6-year sentences and massive fines handed out to the Italian seismologists have been largely portrayed by the media and commentators outside Italy as an attack on science, and the prosecution ridiculed as expecting the scientists to have been able to predict the earthquake. </p>
<p>However, many have pointed out that it is all a bit more complicated than that. See, for example, <a href="http://www.nature.com/news/2011/110914/full/477264a.html ">a detailed article in Nature</a> and these blogs by <a href="http://rogerpielkejr.blogspot.co.uk/2012/10/mischaracterizations-of-laquila-lawsuit.html?utm_source=dlvr.it&utm_medium=twitter ">Roger Pielke</a> and <a href="http://tremblingearth.wordpress.com/2012/10/23/conviction-of-italian-seismologists-a-nuanced-warning/ ">Austin Elliott</a>. </p>
<p>Briefly, the seismologists appear to have agreed to attend a hasty meeting that had, possibly unknown to them, been set up by local officials with the express intention of playing down local fears. The scientists concluded that ‘they could not be confident there would be an earthquake’, which was subsequently communicated in an informal press conference as ‘confident there would not be an earthquake’, which in the eyes of some locals rendered them culpable after the subsequent events. Essentially, the seismologists appear to have been manipulated by local interests, and are now paying a ludicrous price.</p>
<p>In spite of Sir John Beddington (Government Chief Scientific Advisor) assuring us that this type of prosecution would not happen in the UK, this should be a strong warning to any scientist asked for their opinion about matters of strong public interest, as Willy Aspinall lays out in this excellent <a href="http://www.nature.com/news/2011/110914/pdf/477251a.pdf ">commentary in Nature</a>. </p>
<p>The lessons I am personally trying to learn are - </p>
<p>1. Never to give advice unless I am confident that the findings will be communicated either by myself or a trusted professional source, using a pre-determined plan and appropriate, carefully chosen language that acknowledges uncertainty and does not either prematurely reassure or induce unreasonable concern.</p>
<p>2. Not to engage in informal communication using social media on that issue.</p>
<p>3. To ensure proper indemnity arrangements are in place. Apparently these exist for official government advisors, but in my experience establishing advisors' legal position was not a high priority for the people asking for advice. And indemnity could not be taken for granted when advising agencies such as NHS Trusts (not being an NHS employee). Of course, even in the UK one would not be covered for criminal prosecutions such as the one in Italy.</p>
<p>The earthquake threat will always be there in many parts of Italy, and this court case has only added to the woes of the Italian public by distracting attention from lax building standards. And who in Italy will want to choose seismology as a career now?</p>
<p><em>Added as an afterthought<br /></em><br />
There is, of course, a danger of 'defensive science', and an unwillingness to engage with important public issues. But I believe the lessons listed above should be standard professional practice, and do not represent an over-cautious approach. It would be an extra tragedy if L'Aquila led to a general reluctance to provide scientific advice.</p>
</div></div></div>Wed, 24 Oct 2012 07:42:50 +0000 david 6636 at http://understandinguncertainty.org http://understandinguncertainty.org/continuing-tragedy-l%E2%80%99aquila#comments
Rats and GM
http://understandinguncertainty.org/rats-and-gm
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>With others, I made some <a href="http://www.sciencemediacentre.org/pages/press_releases/12-09-19_gm_maize_rats_tumours.htm">comments for the press </a>about the recent paper (abstract, figures and tables freely available <a href="http://www.sciencedirect.com/science/article/pii/S0278691512005637 ">here</a>) on cancer in rats fed GM maize and Monsanto's Roundup pesticide.<br />
[ Full paper should also be available <a href="http://research.sustainablefoodtrust.org/wp-content/uploads/2012/09/Final-Paper.pdf">here</a>].</p>
<p>Whatever the truth about GMOs, this is not a great contribution to the debate. The paper is not well written, to say the least, with phrases such as “In females, all treated groups died 2–3 times more than controls, and more rapidly” in the abstract. The Methods section gives a whole lot of detail about some complex secondary method, but nothing on the analysis of the primary outcome data, presumably tumour incidence over time. </p>
<p>If we assume the experiment was carried out appropriately, the crucial flaw was only having 20 control rats, 10 in each group, so that it is (predictably) almost impossible to show statistically significant differences, since the control rats would have been expected to develop tumours too. In fact no formal statistical tests are carried out, and one does not have to do much maths to understand that statements about ‘30% of male control rats’ actually mean ‘3 out of 10’.</p>
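<p>To get a feel for how little ten rats per group can show, here is a minimal sketch of a two-sided Fisher exact test written with the Python standard library. The ‘3 out of 10’ control figure comes from the paper as described above; the ‘6 out of 10’ treated figure is a hypothetical comparison invented for illustration, not a result from the study, and the paper itself reports no such test.</p>

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c

    def ways(k):
        # Number of tables with k in the top-left cell and the margins fixed.
        return comb(row1, k) * comb(row2, col1 - k)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    observed = ways(a)
    # Add up every table that is no more probable than the observed one.
    extreme = sum(ways(k) for k in range(lowest, highest + 1) if ways(k) <= observed)
    return extreme / comb(row1 + row2, col1)


# Rows: controls, treated. Columns: tumour, no tumour.
# 3 of 10 controls (from the text) versus a hypothetical 6 of 10 treated rats.
p_value = fisher_exact_two_sided(3, 7, 6, 4)
print(f"Two-sided P-value: {p_value:.2f}")
```

<p>Even a doubling of the tumour rate, from 3 in 10 to 6 in 10, gives a P-value of about 0.37 in this sketch, which is nowhere near conventional statistical significance.</p>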
<p>If you can’t download the full version, the figures and tables are available, so you can see the “survival” plots with no labeling of the curves or statistical comparison. Figure 1 actually shows that the highest dose male rats seem to have done even better than the controls, but then this difference would not be statistically significant either. The gruesome pictures only show treated rats, but the majority of the 20 control rats got tumours too, as apparently this strain is particularly prone to them.</p>
<p>The <a href="http://www.dailymail.co.uk/sciencetech/article-2205509/Cancer-row-GM-foods-French-study-claims-did-THIS-rats--cause-organ-damage-early-death-humans.html?ITO=1490 ">Daily Mail’s coverage</a> was what you would expect given their old stand on 'Franken-foods', misleadingly quoting Michael Antoniou as if he were independent when he was part of the campaigning organisation CRIIGEN (established by the lead author Seralini) that ran the trials and even helped to write the paper. They also claim the paper was “peer reviewed by independent scientists to guarantee the experiments were properly conducted and the results are valid”, when in this case it is clear that this never went near a decent statistical reviewer. But this is hardly the Daily Mail's fault.</p>
<p>I am grateful to the authors for publishing this paper, as it provides a fine case study for teaching a statistics class about poor design, analysis and reporting. I shall start using it immediately.</p>
</div></div></div>Thu, 20 Sep 2012 06:55:13 +0000 david 6488 at http://understandinguncertainty.org http://understandinguncertainty.org/rats-and-gm#comments
10 best practice guidelines for reporting science & health stories
http://understandinguncertainty.org/10-best-practice-guidelines-reporting-science-health-stories
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>These <a href="http://www.levesoninquiry.org.uk/wp-content/uploads/2012/07/Second-submission-to-Inquiry-from-Guidelines-for-science-and-health-reporting-31.05.12.pdf ">guidelines</a> were submitted by the <a href="http://www.sciencemediacentre.org/pages/">Science Media Centre</a> to the <a href="http://www.levesoninquiry.org.uk/">Leveson Inquiry</a> into the press. </p>
<p>They were produced as a consequence of Fiona Fox's appearance before the Inquiry - <a href="http://www.levesoninquiry.org.uk/evidence/?witness=fiona-fox">her submission and transcripts are here</a>. Fiona told Lord Leveson <em>"We have some fantastic science journalists in this country and I believe that if you put them in a room with very eminent scientists and members of the public that it would take them a couple of hours to come up with these basic guidelines for science coverage."</em> Lord Leveson took her at her word, and asked for the guidelines, and here they are.</p>
<blockquote><p>The following guidelines, drawn up in consultation with scientists, science reporters, editors and sub editors, are intended for use by newsrooms to ensure that the reporting of science and health stories is balanced and accurate. They are not intended as a prescriptive checklist and of course shorter articles or NIBs will not be able to cover every point. Above and beyond specific guidelines, familiarity with the technicalities and common pitfalls in science and health reporting is invaluable and every newsroom should aim to employ specialist science and health correspondents. Wherever possible the advice and skills of these specialists should be sought and respected on major, relevant stories; the guidelines below will be especially useful for editors and general reporters who are less familiar with how science works.</p>
<ul><li>State the source of the story - e.g. interview, conference, journal article, a survey from a charity or trade body, etc. - ideally with enough information for readers to look it up or a web link.
</li>
<li>
Specify the size and nature of the study - e.g. who/what were the subjects, how long did it last, what was tested or was it an observation? If space, mention the major limitations.
</li>
<li>
When reporting a link between two things, indicate whether or not there is evidence that one causes the other.
</li>
<li>
Give a sense of the stage of the research - e.g. cells in a laboratory or trials in humans - and a realistic time-frame for any new treatment or technology.
</li>
<li>
On health risks, include the absolute risk whenever it is available in the press release or the research paper - i.e. if ’cupcakes double cancer risk’ state the outright risk of that cancer, with and without cupcakes.
</li>
<li>
Especially on a story with public health implications, try to frame a new finding in the context of other evidence - e.g. does it reinforce or conflict with previous studies? If it attracts serious scientific concerns, they should not be ignored.
</li>
<li>
If space, quote both the researchers themselves and external sources with appropriate expertise. Be wary of scientists and press releases over-claiming for studies.
</li>
<li>
Distinguish between findings and interpretation or extrapolation; don’t suggest health advice if none has been offered.
</li>
<li>
Remember patients: don’t call something a ’cure’ that is not a cure.
</li>
<li>
Headlines should not mislead the reader about a story’s contents and quotation marks should not be used to dress up overstatement.
</li>
</ul></blockquote>
<p> I think they are rather good (but I would say that, wouldn't I, as I was consulted on the first draft). It will be interesting to see whether they are eventually endorsed by Leveson, or whether Editors voluntarily sign up to them.</p>
<p>Incidentally, Fiona clearly spoke rapidly. There were numerous requests to her to slow down and Lord Leveson himself observed <em>"I'm just concerned that smoke seems to be emanating from the shorthand writer."</em></p>
</div></div></div>Wed, 18 Jul 2012 14:59:05 +0000 david 6448 at http://understandinguncertainty.org http://understandinguncertainty.org/10-best-practice-guidelines-reporting-science-health-stories#comments
Explaining 5-sigma for the Higgs: how well did they do?
http://understandinguncertainty.org/explaining-5-sigma-higgs-how-well-did-they-do
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>Warning, this is for statistical pedants only.</p>
<p>To recap, the results on the Higgs are communicated in terms of the numbers of sigmas, which has been calculated by the teams from what is generally (outside the world of CERN) termed a P-value: the probability of observing such an extreme result, were there not really anything going on. 5-sigmas corresponds to around a 1 in 3,500,000 chance. This tiny probability is applied to the data, but the common misinterpretation is to apply it to the explanation, and to say that there is only 1 in 3,500,000 probability that the results were just a statistical fluke, or some similar phrase. This distinction may seem pedantic, but as covered in numerous articles and blogs (see for example <a href="http://blog.carlislerainey.com/2012/07/07/innumeracy-and-higgs-boson/ ">Carlisle Rainey</a>), it is important. </p>
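<p>For those who like to see the distinction in numbers, the sketch below applies Bayes' theorem to turn a P-value into a probability for the explanation. Everything except the 1 in 3,500,000 figure is an assumption made up for illustration: the prior beliefs in ‘nothing going on’ and the chance of seeing such a result if the effect is real are hypothetical, and this is not how the CERN teams analyse their data.</p>

```python
def probability_of_fluke(p_value, prior_no_effect, power):
    """Posterior probability of 'no real effect' given an extreme result."""
    fluke = p_value * prior_no_effect      # extreme result AND nothing going on
    real = power * (1 - prior_no_effect)   # extreme result AND a real effect
    return fluke / (fluke + real)


P_VALUE = 1 / 3_500_000  # chance of so extreme a result if nothing is going on
ASSUMED_POWER = 0.5      # hypothetical chance of such a result if the effect is real

for prior in (0.5, 0.999, 0.999999):  # hypothetical prior beliefs in 'no effect'
    posterior = probability_of_fluke(P_VALUE, prior, ASSUMED_POWER)
    print(f"prior {prior}: probability it is a fluke = {posterior:.2e}")
```

<p>The P-value stays fixed while the probability that the result is a fluke moves with the prior, which is why the two should not be quoted as if they were the same thing.</p>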
<p>The reports from the CERN teams were very clear: <a href="http://cms.web.cern.ch/news/observation-new-particle-mass-125-gev ">the CMS team said </a></p>
<blockquote><p>“CMS observes an excess of events at a mass of approximately 125 GeV with a statistical significance of five standard deviations (5 sigma) above background expectations. The probability of the background alone fluctuating up by this amount or more is about one in three million.”
</p></blockquote>
<p>while <a href="http://www.atlas.ch/news/2012/latest-results-from-higgs-search.html ">the ATLAS group reported</a></p>
<blockquote><p>“A statistical combination of these channels and others puts the significance of the signal at 5 sigma, meaning that only one experiment in three million would see an apparent signal this strong in a universe without a Higgs.”
</p></blockquote>
<p>However the <a href="http://press.web.cern.ch/press/PressReleases/Releases2012/PR17.12E.html ">CERN Press release </a>does not give any help with the interpretation, and just says</p>
<blockquote><p>“We observe in our data clear signs of a new particle, at the level of 5 sigma, in the mass region around 126 GeV."
</p></blockquote>
<p>How did everyone else do?</p>
<p>The BBC did very well. Tom Feilden got it dead right on the Today programme, and on the <a href="http://www.bbc.co.uk/news/world-18702455 ">BBC website </a>Paul Rincon said </p>
<blockquote><p>“They claimed that by combining two data sets, they had attained a confidence level just at the "five-sigma" point - about a one-in-3.5 million chance that the signal they see would appear if there were no Higgs particle.”</p></blockquote>
<p>In the explanation they say</p>
<blockquote><p>“The number of sigmas measures how unlikely it is to get a certain experimental result as a matter of chance rather than due to a real effect”
</p></blockquote>
<p>which is ambiguous, but would be improved by a comma after 'result'.</p>
<p>The <a href="http://online.wsj.com/article/SB10001424052702303962304577509213491189098.html?mod=WSJ_article_comments#articleTabs%3Darticle ">Numbers Guy (Carl Bialik) in the Wall Street Journal</a> provides a nice explanation of the issue, saying of the '1 in 3.5 million chance'</p>
<blockquote><p>That is not the probability that the Higgs boson doesn't exist. It is, rather, the inverse: If the particle doesn't exist, one in 3.5 million is the chance an experiment just like the one announced this week would nevertheless come up with a result appearing to confirm it does exist.
</p></blockquote>
<p>although the additional statement is not so good:</p>
<blockquote><p>In other words, one in 3.5 million is the likelihood of finding a false positive—a fluke produced by random statistical fluctuation
</p></blockquote>
<p>which puts the probability on the explanation ('fluke') rather than the data.</p>
<p>As far as I can see, every other news source gets the interpretation wrong - see also examples in <a href="http://blog.carlislerainey.com/2012/07/07/innumeracy-and-higgs-boson/ ">Carlisle Rainey's</a> blog. The <a href="http://www.nytimes.com/2012/07/05/science/cern-physicists-may-have-discovered-higgs-boson-particle.html?pagewanted=all ">New York Times</a></p>
<blockquote><p>Both groups said that the likelihood that their signal was a result of a chance fluctuation was less than one chance in 3.5 million, “five sigma,” which is the gold standard in physics for a discovery.
</p></blockquote>
<p>The <a href="http://www.telegraph.co.uk/science/large-hadron-collider/9371873/Cern-announcement-after-50-years-the-Higgs-hunt-could-be-over.html ">Daily Telegraph</a> reported</p>
<blockquote><p>Dr James Gillies, Cern’s communications director, says that talk of a discovery is “premature” and that any event would need to reach the “five sigma” level, an expression of statistical significance used by physicists, meaning it is 99.99997 per cent likely to be genuine rather than a fluke.
</p></blockquote>
<p>which I hope was not a quote from Cern’s communications director.</p>
<p>The <a href="http://www.independent.co.uk/news/science/eureka-cern-announces-discovery-of-higgs-boson-god-particle-7907677.html ">Independent</a> was typical</p>
<blockquote><p>meaning that there is less than a one in a million chance that their results are a statistical fluke.
</p></blockquote>
<p>but I expected better from <a href="http://www.newscientist.com/article/dn22014-celebrations-as-higgs-boson-is-finally-discovered.html ">New Scientist</a>, with their</p>
<blockquote><p>There's a 5-in-10 million chance that this is a fluke.
</p></blockquote>
<p><a href="http://www.livescience.com/21395-higgs-god-particle-lhc-numbers.html ">Live Science</a> had</p>
<blockquote><p>The level of significance called sigma found for the new particle in the ATLAS experiment. A 5 sigma means there is only a 1 in 3.5 million chance the signal isn't real.
</p></blockquote>
<p>while <a href="http://www.forbes.com/sites/netapp/2012/07/05/discovery-of-the-higgs-boson-particle-is-victory-for-international-scientific-cooperation/ ">Forbes Magazine</a> reported</p>
<blockquote><p>The chances are less than 1 in a million that it’s not the Higgs boson.
</p></blockquote>
<p>The BBC has shown it is not too tricky to get it right: it is a shame that people don't seem to care.</p>
</div></div></div>Sun, 08 Jul 2012 13:17:50 +0000 david 6440 at http://understandinguncertainty.org http://understandinguncertainty.org/explaining-5-sigma-higgs-how-well-did-they-do#comments
Higgs: is it one-sided or two-sided?
http://understandinguncertainty.org/higgs-it-one-sided-or-two-sided
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>Announcements about the Higgs Boson are invariably framed in terms of the number of sigmas, with 5-sigmas needed for a ‘discovery’. Media outlets helpfully explain what this means by translating 5-sigmas to a probability, which is almost invariably misreported as a probability of the hypothesis that it is all just statistical error e.g. <em>“meaning that it has just a 0.00006% probability that the result is due to chance”</em> <a href="http://www.nature.com/news/physicists-find-new-particle-but-is-it-the-higgs-1.10932 ">[Nature]</a> (see bottom of this blog for comments about the misinterpretation).</p>
<p>But the <a href="http://www.telegraph.co.uk/science/large-hadron-collider/9371873/Cern-announcement-after-50-years-the-Higgs-hunt-could-be-over.html ">Daily Telegraph </a>says that 5 sigma is equivalent "<em>to meaning it is 99.99997 per cent likely to be genuine rather than a fluke</em>" - this is a P-value of 0.00003%. So is 5-sigmas equivalent to 0.00003% (1 in 3,500,000) or 0.00006% (1 in 1,750,000)? </p>
<p>This reflects whether one is quoting a probability of a Normal observation being more than 5-sigmas away from the expected value in the direction of interest (one-sided), or in either direction (two-sided). The two-sided P-value is twice the one-sided, and therefore looks less interesting. The Telegraph is using a one-sided value and Nature a two-sided one: who is right? </p>
<p>It’s best to go back to a paper from CERN, e.g. the <a href="http://cdsweb.cern.ch/record/1421964/files/science.pdf?version=1">ATLAS team announcing their previous results</a>. There they say that <em>"The significance of an excess is quantified by the probability (p0) that a background-only experiment is more signal-like than that observed."</em><br />
which is excellent and clear. The global P-value is calculated through a sophisticated method that allows for the multiple tests that have been done (the 'look-elsewhere' effect), and the sigma interpretation given afterwards using the graphs in Fig 3 of the paper. The translation is clearly equivalent to a one-sided test - for example they quote 1.4% as being equivalent to 2.2 sigma. And so Nature is wrong: 5-sigmas should be interpreted as a 1 in 3,500,000 chance that such results would happen, if it were all just a statistical fluke.</p>
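<p>The conversion between sigmas and tail probabilities is easy to check for yourself. The sketch below assumes a standard Normal distribution and uses only the Python standard library; the function names are my own, and it ignores the 'look-elsewhere' adjustments that go into the teams' global P-values.</p>

```python
from math import erfc, sqrt


def one_sided_p(sigmas):
    """Probability that a standard Normal observation exceeds `sigmas`."""
    return 0.5 * erfc(sigmas / sqrt(2))


def two_sided_p(sigmas):
    """Probability of being more than `sigmas` away in either direction."""
    return erfc(sigmas / sqrt(2))


for z in (2.2, 5.0):
    print(f"{z} sigma: one-sided 1 in {1 / one_sided_p(z):,.0f}, "
          f"two-sided 1 in {1 / two_sided_p(z):,.0f}")
```

<p>The one-sided value for 2.2 sigma comes out at about 1.4%, matching the ATLAS translation, and the one-sided value for 5 sigma is roughly 1 in 3,500,000; the two-sided figures are twice as large.</p>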
<p>This is all rather bizarre: the correct (2-sided) P-value is calculated by the scientists, which they translate into sigmas (using a 1-sided interpretation), but the sigma is then translated back by journalists to a P-value, often wrongly.</p>
<h3>What is a P-value anyway?</h3>
<p>As discussed <a href="http://understandinguncertainty.org/why-it’s-important-be-pedantic-about-sigmas-and-commas">previously</a>, the P-values are almost invariably interpreted incorrectly. The probability, or P-value, refers to the probability of getting such an extreme result, were there really nothing special going on. The probability should be applied to the data, not the hypothesis. This may seem pedantic, but people have been convicted of murder (Sally Clark) because of this mistake being made in court. This <a href="http://www.science20.com/quantum_diaries_survivor/fundamental_glossary_higgs_broadcast-85365 ">quantumy blog </a>gets it right and has got more explanation. The <a href="http://www.bbc.co.uk/news/science-environment-17269647 ">BBC website </a>now has a reasonably good, if slightly ambiguous, definition </p>
<blockquote><p>“The number of sigmas measures how unlikely it is to get a certain experimental result as a matter of chance rather than due to a real effect”</p></blockquote>
<p> but would be much much better if there were a comma after the word ‘result’. </p>
</div></div></div>Tue, 03 Jul 2012 18:08:38 +0000 david 6437 at http://understandinguncertainty.org http://understandinguncertainty.org/higgs-it-one-sided-or-two-sided#comments