Categories
physics

Monte Carlo Simulation: An Introduction With An Objection To The Imperial COVID Model

Introduction

We have just far spent £337bn on the COVID response in the UK. This is reported by the FT at: https://www.ft.com/content/f0c7ab6d-33ba-4777-8dd8-0960a78a556a The virus itself generated much of this spend, but much of it was generated merely by the lockdown.  The results of the Imperial model were a primary motivation for lockdown. That model was a Monte Carlo Simulation.  I explain here briefly what a Monte Carlo simulation is and bring out one objection (among many) to the Imperial approach.

What is Monte Carlo Simulation?

Monte Carlo simulations exist to address a class of problems which do not have an analytical answer.  Imagine I am in the pub and my home is 40 paces away.  If I walk at two paces a second, I will arrive home in 20s.  That’s an analytical question which has an exact answer.  Here is a non-analytical question.  I have drunk a bottle of tequila in the pub.  The probability that I take a pace forward is only 50%; there is an equal probability that I take a pace backward.  This does not appear to be analytical.  You can’t say immediately what the probability is that I get home.

This is where Monte Carlo Simulation comes in.  What you can do is simulate my journey home from the pub and see if I get home.  (It’s called Monte Carlo because the method is based on random numbers, as we will see next.)

Sample Python Code

A very simple Monte Carlo

Here’s a really simple Python script called Randwalk that simulates the walk home.  It’s called that because this is a random walk simulation.  This sort of thing might be familiar to you from Brownian motion.

You can see that all it does is roll a dice 100 times and check to see if the dice shows three or less.   That’s the 50/50 part.  If the dice does show three or less, I take a step back.  If the dice shows more than three, I take a step forward.  This is repeated 1000 times, meaning I take 1000 steps in this version.

This entire code consists of a conditional statement in a loop.  It’s extremely simple.

Output from Simple Python

We can then plot the output and we will see something like the below.

Output from a random walk simulation

As you can see, a jagged pattern is generated.  On this occasion, plenty of my steps were unfortunately in the wrong direction, and I never got home.  But that won’t happen every time.  I was luckier in the run below, or maybe I drank less tequila.

Random walk output in a better scenario

As you can see, here I was more than 40 paces north in this simulation.  So I got home.  (I haven’t bothered to clean up the code to stop running when I “arrive home” in order to keep it simple, but that could easily be done.)

Using Monte Carlo on the Random Walk Problem

So now we see how we can use Monte Carlo Simulation to answer the original question.  What we need to do is run scenarios like the above a large number of times and see how often I get home.

A more complex piece of Python to loop over scenarios

Here is some only slightly more complicated Python called SimpMC which does that.

This just puts the whole of the previous code in another loop.  So we do the simulation multiple times — that variable is itot = 10 in the code above.  We then calculate the fraction of scenarios in which I get home.

Monte Carlo Results

This generates an answer of 0.2. But it is different every time I run it.  Sometimes I get 0.3 and sometimes 0.4.  That is happening because I have inadequate statistics.  So let’s set the run number to 100.

Now I get: 0.14, 0.17, 0.21, 0.19, 0.15.  Better but still not stable.  Let’s set the run number to 1000.

Now I get: 0.195, 0.191, 0.208, 0.192, 0.205.  That’s starting to get there.  I am clearly converging on a probability estimate here.  If I ran overnight, I would get a good answer.

Why is this an Objection to the Imperial Model

Finally to the objection to the Imperial model.  Their code was unstable on multiple cores.  Their response to this was “it’s a stochastic model so it will give different answers each time.”  That response does not fly, as I will now explain.

Saying it is a stochastic model just means it uses this random number Monte Carlo approach.  However — that does not mean it should produce different outcomes when run on multiple cores.  It should not be unstable at all.  The reported instability in the Imperial model is 80,000 deaths. This means that merely the error bar in the Imperial result is larger than the current total number of COVID deaths! — and that should not happen.  To claim otherwise is to mix up the randomnesses.  (I just made that word up but that seems fine.)

For sure, we saw randomness in the randwalk code — but that was just one run.  When we did lots of runs in the SimpMC code, we started t0 converge.  We got the same result every time in other words when we did enough runs.  The Imperial model produces different results each time you run a large number of scenarios through it with the same parameters.  That is equivalent to me getting different answers on the 1000 run version of SimpMC.  If that happens, it doesn’t mean I wrote a stochastic model.  It means I wrote a buggy model.  Imperial have potentially cost us a lot of money here.