Probability and stats in GCSE Maths

As of the 23rd May 2022 this website is archived and will receive no further updates.

understandinguncertainty.org was produced by the Winton programme for the public understanding of risk based in the Statistical Laboratory in the University of Cambridge. The aim was to help improve the way that uncertainty and risk are discussed in society, and show how probability and statistics can be both useful and entertaining.

Many of the animations were produced using Flash and will no longer work.

The current consultation on GCSE subject content and assessment objectives for Mathematics GCSE features major changes for probability and statistics.

I encourage everyone with an interest to respond (before 20th August): here is my personal take on the topic.

The proposals are as follows:

Probability

  • record and describe the frequency of outcomes of probability experiments using tables and frequency trees
  • apply ideas of randomness, fairness and equally likely events to calculate expected outcomes of multiple future experiments
  • relate relative expected frequencies to theoretical probability, using appropriate language and the 0-1 scale
  • apply the property that the probabilities of an exhaustive set of mutually exclusive outcomes sum to one
  • enumerate sets and combinations of sets systematically, using tables, grids, tree diagrams and Venn diagrams
  • construct theoretical possibility spaces for single and combined events with equally likely and mutually exclusive outcomes and use these to calculate theoretical probabilities
  • calculate the probability of independent and dependent combined events, including tree diagrams and other representations and know the underlying assumptions
  • calculate and interpret conditional probabilities through representation using two-way tables, tree diagrams, Venn diagrams and by using the formula
  • understand that empirical samples tend towards theoretical probability distributions, with increasing sample size and with lack of bias
  • interpret risk through assigning values to outcomes (e.g. games, insurance)
  • calculate the expected outcome of a decision and relate to long-run average outcomes.

Statistics

  • apply statistics to describe a population or a large data set, inferring properties of populations or distributions from a sample, whilst knowing the limitations of sampling
  • construct and interpret appropriate charts and diagrams, including bar charts, pie charts and pictograms for categorical data, and vertical line charts for ungrouped discrete numerical data
  • construct and interpret diagrams for grouped discrete data and continuous data, i.e. histograms with equal class intervals and cumulative frequency graphs
  • interpret, analyse and compare univariate empirical distributions through:
    • appropriate graphical representation involving discrete, continuous and grouped data
    • appropriate measures of central tendency, spread and cumulative frequency (median, mean, range, quartiles and inter-quartile range, mode and modal class)
  • describe relationships in bivariate data: sketch trend lines through scatter plots; calculate lines of best fit; make predictions; interpolate and extrapolate trends.

In addition, it is proposed to provide the following in the formulae sheet:

Probability

Where $P(A)$ is the probability of outcome $A$ and $P(B)$ is the probability of outcome $B$:
$$ P (A \hbox{ or } B) = P(A )+ P(B ) - P(A \hbox{ and } B )$$
$$P(A \hbox{ and } B ) = P(A \hbox{ given } B ) P(B )$$
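
As a rough illustration of how these two formulae behave, here is a minimal sketch that checks them by counting outcomes for two fair dice. The events A and B, and the helper names `prob` and `prob_given`, are my own choices for the example and are not part of the proposals.

```python
from fractions import Fraction
from itertools import product

# Possibility space: the 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def is_a(o):
    """Event A (illustrative): the first die shows a six."""
    return o[0] == 6

def is_b(o):
    """Event B (illustrative): the total is at least 10."""
    return o[0] + o[1] >= 10

def prob(event):
    """Probability of an event by counting equally likely outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def prob_given(event, condition):
    """Conditional probability, counted only within outcomes meeting the condition."""
    restricted = [o for o in outcomes if condition(o)]
    return Fraction(sum(1 for o in restricted if event(o)), len(restricted))

p_a = prob(is_a)
p_b = prob(is_b)
p_a_and_b = prob(lambda o: is_a(o) and is_b(o))
p_a_or_b = prob(lambda o: is_a(o) or is_b(o))
p_a_given_b = prob_given(is_a, is_b)

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
print(p_a_or_b, "=", p_a + p_b - p_a_and_b)

# Multiplication rule: P(A and B) = P(A given B) P(B)
print(p_a_and_b, "=", p_a_given_b * p_b)
```

The conditional probability is counted directly within the outcomes where B occurs, rather than defined through the formula, so the second check is a genuine comparison and not a tautology.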

Compared to the current curriculum (shown at the bottom of this blog), the new proposals:

  • Split probability and statistics
  • In probability
    • Emphasise multiple representations
    • Include additional attention to conditional probabilities
    • Include risk and expectation
  • In statistics
    • Drop histograms with unequal intervals
    • Drop the ‘data cycle’ (although they mention ‘limitations of sampling’)
    • Include calculating a line of best fit

Perhaps the most controversial element is the non-inclusion of the ‘data cycle’ (or ‘statistics cycle’) of problem analysis, data collection, data presentation and data analysis. There has been a long argument within the statistics community about whether this belongs in GCSE Mathematics. The 2004 Smith Inquiry into post-14 maths education, Making Mathematics Count, recommended:

The Inquiry recommends that there be a radical re-look at this issue and that much of the teaching and learning of Statistics and Data Handling would be better removed from the mathematics timetable and integrated with the teaching and learning of other disciplines (eg biology or geography). The time restored to the mathematics timetable should be used for acquiring greater mastery of core mathematical concepts and operations.

Indeed, the proposed Science GCSE subject content and assessment objectives now include:

  • apply the cycle of collecting, presenting and analysing data, including:
    • present observations and data using appropriate methods
    • carry out and represent mathematical and statistical analysis
    • represent random distributions of results and estimations of uncertainty
    • interpret observations and data, including identifying patterns and trends, make inferences and draw conclusions
    • present reasoned explanations including of data in relation to hypotheses
    • evaluate data
    • use an appropriate number of significant figures in calculations
  • communicate the scientific rationale for investigations, methods used, findings and reasoned conclusions through written and electronic reports and presentations.

However, the Royal Statistical Society's recently commissioned Porkess Report said:

  • Recommendation 5: School and college mathematics departments should ensure they have the expertise to be the authorities on statistics within their institutions. Mathematics departments should be centres of excellence for statistics, providing guidance on correct usage and good practice.
  • Recommendation 6: Under present conditions, statistics is best placed in the mathematics curriculum.

Essentially, the view is that if this vital element were not in Mathematics, it would either not be taught or be taught badly.

This is tricky. My personal view is that the ‘data cycle’ is absolutely vital, but that it is better placed within an understanding of the ‘scientific method’ than within core mathematics. I feel that GCSE Mathematics should provide the tools for analysis that can be used in empirical investigations, but techniques for carrying out those experiments should not be part of the assessment criteria. Obviously there is opportunity for cross-subject activity, say with Geography or Science, featuring experimental design, data collection, analysis, presentation and interpretation of real-world numerical evidence. It is inevitably tempting to look to a different type of qualification that takes a broader cross-disciplinary perspective, but we appear stuck with the rigid subject demarcations of GCSEs.

At A-level the link between probability and formal statistical inference can be revealed in all its glory. And if a post-16, non-A-level maths qualification is developed, then this could also include real-world investigation into the appropriate interpretation of numerical evidence.

The current specification

This is given by the Ofqual GCSE Subject Criteria for Mathematics.

Statistics and probability

  • understand and use statistical problem solving process/handling data cycle;
  • identify possible sources of bias;
  • design an experiment or survey;
  • design data-collection sheets, distinguishing between different types of data;
  • extract data from printed tables and lists;
  • design and use two-way tables for discrete and grouped data;
  • produce charts and diagrams for various data types;
  • calculate median, mean, range, quartiles and inter-quartile range, mode and modal class;
  • interpret a wide range of graphs and diagrams and draw conclusions;
  • look at data to find patterns and exceptions;
  • recognise correlation and draw and/or use lines of best fit by eye, understanding what these represent;
  • compare distributions and make inferences;
  • understand and use the vocabulary of probability and the probability scale;
  • understand and use estimates or measures of probability from theoretical models (including equally likely outcomes), or from relative frequency;
  • list all outcomes for single events, and for two successive events, in a systematic way and derive related probabilities;
  • identify different mutually exclusive outcomes and know that the sum of the probabilities of all these outcomes is 1;
  • know when to add or multiply two probabilities: if A and B are mutually exclusive, then the probability of A or B occurring is P(A) + P(B), whereas if A and B are independent events, the probability of A and B occurring is P(A) × P(B);
  • use tree diagrams to represent outcomes of compound events, recognising when events are independent;
  • compare experimental data and theoretical probabilities;
  • understand that if they repeat an experiment, they may – and usually will – get different outcomes, and that increasing sample size generally leads to better estimates of probability and population characteristics.
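
The last item in the list above, on repeated experiments and sample size, can be illustrated with a short simulation. This is only a sketch under my own assumptions: a fair six-sided die, the probability of a six as the quantity being estimated, and an arbitrary seed so that the run is repeatable. The function name `estimate_p_six` is made up for the example.

```python
import random

def estimate_p_six(n_rolls, rng):
    """Estimate P(six) as the relative frequency of sixes in n_rolls rolls."""
    sixes = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == 6)
    return sixes / n_rolls

# Arbitrary seed (an assumption for repeatability, not a special value).
rng = random.Random(2013)
true_p = 1 / 6

for n in (10, 100, 1_000, 10_000, 100_000):
    est = estimate_p_six(n, rng)
    print(f"{n:>7} rolls: estimate {est:.4f}, error {abs(est - true_p):.4f}")
```

Changing the seed gives different estimates each time, and the error does not have to shrink at every step, but it generally becomes smaller as the number of rolls grows.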

Conflict of Interest

I am one of the many people consulted by the Department for Education.