Monkeys and Shakespeare

As of 23rd May 2022 this website is archived and will receive no further updates. It was produced by the Winton programme for the public understanding of risk, based in the Statistical Laboratory at the University of Cambridge. The aim was to help improve the way that uncertainty and risk are discussed in society, and to show how probability and statistics can be both useful and entertaining.

Many of the animations were produced using Flash and will no longer work.

I was lucky enough to get included in the Horizon programme on Infinity last night, talking about the old monkey-Shakespeare issue. Of course most of my rambling contribution was (rightly) cut, so here are a few background details for anyone interested.

  • We assumed the monkey was typing lower-case letters and some punctuation, and that each of $34$ characters was equally likely to be picked. So each character typed has a $1$ in $34$ chance of being the right one. There are around $5,000,000$ characters in Shakespeare, and so the chance that they are all right, from beginning to end, is $1$ in $34^{5,000,000} \approx 10^{7,600,000}$, which we shall call $t$ for ‘tiny’. So each time the monkey starts typing, there is a chance $t$ of completing Shakespeare. This number is small but not zero.

  • We are ‘certain’ of this event occurring in exactly the same sense that we are ‘certain’ that if we repeatedly flip a fair coin, it will eventually come up heads. This will take on average 2 flips, just as the time to get the first six when throwing a die is expected to be 6 throws. So we expect to see Shakespeare after an average of $1/t$ tries by the monkey, which of course will take a long time, in fact well after the universe has come to an end. The probability that this event has not occurred by a time $T$ decreases to 0 as $T$ increases to infinity (after $n$ independent tries it is $(1-t)^n$). This follows from the Law of Large Numbers: the Strong version of that Law means that the event will happen with probability 1, in that the only way that the monkey does not end up typing Shakespeare is if something with probability 0 occurs.
  • The chance of winning the lottery is around 1 in $14,000,000 \approx 10^{7.1}$. So the tiny chance $t$ is equivalent to winning around 1,000,000 lotteries in a row, since $10^{7,600,000}$ is around $10^{7.1}$ to the power 1,000,000. If you bought one lottery ticket a week, this is like winning every week for 20,000 years (I said 27,000 on the programme due to some over-rapid back-of-envelope calculations). Of course in real life people might start getting suspicious.
  • To get 17 characters right, e.g. "to_be_or_not_to_b", would require an event with chance 1 in $34^{17} \approx 1.1 \times 10^{26}$. Our program types 50 characters a second, so we would expect to wait around $2 \times 10^{24}$ seconds before this occurred, which is about $7 \times 10^{16}$ years, roughly five million times the 13.8 billion years since the Big Bang signalled the start of our Universe.
  • The nice Monkey Simulator program was written by Aaron Russell and is available from this website.
  • Around 30% of the characters in Shakespeare are the letter ‘e’. We could have given the monkey a head-start with this information, but filling in the other 70% is still quite tricky.

  • Just to show that theory does not always meet practice, a wonderful arts project in Paignton Zoo put a computer in a monkeys’ enclosure to see how they got on. The monkeys typed 5 pages, mainly the letter 's', and then used the keyboard as a toilet. So rather a limited output of classic English literature. This just shows the problems that turn up when maths meets the real world.
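
For anyone who wants to redo the back-of-envelope sums in the list above, here is a minimal sketch in Python. It assumes the figures quoted in the post (34 equally likely characters, about 5,000,000 characters of Shakespeare, lottery odds of 1 in 14,000,000, 50 characters typed a second). The function names are made up for this illustration and are not taken from the Monkey Simulator. Everything is done with logarithms, since $34^{5,000,000}$ is far too large to print, and the waiting-time function is only a rough estimate that ignores any overlap effects within the target phrase.

```python
import math
import random

ALPHABET = 34               # equally likely characters, as assumed in the post
TEXT_LENGTH = 5_000_000     # rough character count for Shakespeare, from the post
LOTTERY_ODDS = 14_000_000   # "1 in 14,000,000", from the post
TYPING_RATE = 50            # characters per second, from the post


def log10_odds(n_chars, alphabet=ALPHABET):
    """Return log10 of alphabet**n_chars, the N in '1 in N' for n_chars right in a row."""
    return n_chars * math.log10(alphabet)


def lotteries_in_a_row(n_chars, alphabet=ALPHABET, lottery_odds=LOTTERY_ODDS):
    """Number of consecutive lottery wins that has roughly the same chance."""
    return log10_odds(n_chars, alphabet) / math.log10(lottery_odds)


def rough_wait_seconds(n_chars, alphabet=ALPHABET, rate=TYPING_RATE):
    """Rough expected wait: about alphabet**n_chars characters typed at `rate` per second."""
    return alphabet ** n_chars / rate


def mean_tries_until_success(p, trials=100_000, seed=0):
    """Simulate the average number of tries until an event of chance p first occurs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        tries = 1
        while rng.random() >= p:   # a failure happens with probability 1 - p
            tries += 1
        total += tries
    return total / trials


if __name__ == "__main__":
    print("log10 of the odds against Shakespeare:", log10_odds(TEXT_LENGTH))
    print("equivalent lotteries in a row:", lotteries_in_a_row(TEXT_LENGTH))
    print("rough wait for 17 characters (seconds):", rough_wait_seconds(17))
    print("average flips until first head:", mean_tries_until_success(1 / 2))
    print("average throws until first six:", mean_tries_until_success(1 / 6))
```

The exponent comes out at about 7.66 million, in line with the rounded $10^{7,600,000}$, and the simulated averages should sit close to the $1/p$ values of 2 flips and 6 throws.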


The Horizon blurb says: "In an infinite universe, there are infinitely many copies of the Earth and infinitely many copies of you." This is sometimes thought to be a consequence of Bayesianism. Discuss? Dave M

Give an infinite number of monkeys an infinite amount of typewriter time and one will eventually produce the complete works of Shakespeare. Someone has added, “now thanks to the internet we know this to be false.”

Bernard Koopman, one of my mentors, put it thus: Suppose we wanted to replicate a single line of Shakespeare, say 100 characters, by setting a million monkeys randomly tapping away at a million characters per second. With a choice of 34 characters, there are 34^100 (or about 10^153) different 100-character strings. The million monkeys are producing 10^12 characters/second, so 10^12 new random lines are generated every second (if we assume that each new character appends to the 99 prior ones). So it will take 10^(153-12) or 10^141 seconds to complete all possible 100-character lines. If we estimate the universe to be at most 10^14 years old, it has been around for 3600 x 24 x 365.25 x 10^14 seconds, or ~3 x 10^21 seconds -- let's be generous and call it 10^22 seconds. So every 100-character line will be replicated by these monkeys in only 10^(141-22) or 10^119 lifetimes of the universe!
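
This kind of estimate is easiest to follow in powers of ten. Here is a minimal sketch in Python, assuming the number of distinct lines is the alphabet size raised to the line length (34^100), a combined output of 10^12 characters per second, and the generous 10^22 seconds for the lifetime of the universe. The function names are hypothetical, chosen only for this illustration.

```python
import math


def seconds_in_years(years):
    """Convert years to seconds using 365.25-day years."""
    return 3600 * 24 * 365.25 * years


def log10_universe_lifetimes(line_length=100, alphabet=34,
                             chars_per_second=1e12,
                             log10_universe_seconds=22):
    """log10 of the number of universe lifetimes needed to run through every line."""
    # Number of distinct lines is alphabet**line_length; work with its log10.
    log10_lines = line_length * math.log10(alphabet)
    # One new line per character typed, so divide by the typing rate.
    log10_seconds = log10_lines - math.log10(chars_per_second)
    return log10_seconds - log10_universe_seconds


if __name__ == "__main__":
    print("seconds in 10^14 years:", seconds_in_years(1e14))
    print("log10 of universe lifetimes needed:", log10_universe_lifetimes())
```

The result is an exponent a little over 119, so the monkeys need something like 10^119 lifetimes of the universe under these assumptions.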