In my last article, I wrote about the spread of COVID-19 without really modeling the structure of the process. I provided some visualization tools and interactive widgets to give an overview of the phenomenon over time.

Here, I’m going to focus on modeling techniques that can be used to understand the diffusion not only of diseases, but also of other contagious phenomena (the spread of ideas, fake news, viral marketing…).

The model I’ll be focusing on is the so-called SIR model, which stands for Susceptible, Infected and Recovered. SIR is a compartmental model: the population is divided into compartments, with the assumption that every individual in the same compartment has the same characteristics. The origins of compartmental models trace back to the early 20th century, with an important early work being that of Kermack and McKendrick in 1927.

The basic structure of the model, where we do not consider any birth/death process, can be sketched as a one-way flow between compartments: S → I (at rate Beta) → R (at rate Gamma).

So we have a group of susceptible individuals who can get sick with a transmission rate Beta, defined as the rate of contact with infected individuals times the probability of disease transmission per contact. Once infected, individuals recover at a rate Gamma, given by the inverse of the duration of the illness. The three compartments (susceptible, infected and recovered individuals) add up to the total population N: S+I+R = N.

We can characterize the dynamics of the disease by setting up a system of differential equations, one per compartment. Each equation gives the rate of change of a compartment, and the solution of the system returns, at each point in time, the number of individuals in that compartment.
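Written out in the standard form for a closed population (assumed here because it is the form consistent with the Beta/Gamma threshold used in the rest of this article), the system reads:

```latex
\begin{aligned}
\frac{dS}{dt} &= -\beta \, \frac{S I}{N} \\
\frac{dI}{dt} &= \beta \, \frac{S I}{N} - \gamma I \\
\frac{dR}{dt} &= \gamma I
\end{aligned}
```

The three right-hand sides sum to zero, which is just another way of saying that S+I+R = N stays constant.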

There are 3 things to notice here:

  • The number of susceptible individuals can only decrease;
  • The number of infected individuals can either increase or decrease, depending on the values of Beta and Gamma and on the initial conditions of S, I and R;
  • The number of recovered individuals can only increase.
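To make these three observations concrete, here is a minimal Python sketch of the right-hand side of the system in its standard form (no births or deaths). The function name `sir_derivatives` and the numbers in the example call are illustrative choices of mine, not estimates.

```python
def sir_derivatives(s, i, r, beta, gamma):
    """Rates of change (dS/dt, dI/dt, dR/dt) of the standard SIR model."""
    n = s + i + r                       # total population, constant over time
    new_infections = beta * s * i / n   # flow from S to I
    recoveries = gamma * i              # flow from I to R
    return -new_infections, new_infections - recoveries, recoveries


# Illustrative values only: 1000 people, 10 of them infected.
ds, di, dr = sir_derivatives(s=990.0, i=10.0, r=0.0, beta=0.5, gamma=0.25)
print(ds, di, dr)
```

Since `new_infections` and `recoveries` are never negative, dS/dt can never be positive and dR/dt can never be negative; only the sign of dI/dt depends on how the two flows compare.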

What determines whether a virus spreads or dies out? Looking at the second equation, we can determine the conditions under which the right-hand side is positive, so that the first derivative of I is positive (hence, the number of infected is increasing). Assuming an initially fully susceptible population (S=N), the number of infected grows whenever Beta/Gamma is greater than 1.
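Spelling that step out, assuming the standard form of the second equation and an outbreak that starts with S approximately equal to N:

```latex
\frac{dI}{dt} = \beta \, \frac{S I}{N} - \gamma I
\;\overset{S = N}{=}\; (\beta - \gamma)\, I ,
\qquad
(\beta - \gamma)\, I > 0 \iff \frac{\beta}{\gamma} > 1 .
```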

So we want the quantity Beta/Gamma to be less than 1 if we want the epidemic to slow down. Note that this ratio has a very important meaning in epidemiology: it is the basic reproductive number, or Ro, and it indicates the average number of people infected by one infectious individual in a totally susceptible population.

How can we lower Ro so that it falls below 1? If we cannot do anything about the recovery time, which is largely physiological, we can act on Beta. Let’s have a deeper look at how Beta is defined, namely as the product Beta = b × c.

We have b, which answers the question: given that a susceptible individual comes into contact with an infected one, what is the probability that they get infected too? On the other hand, we have c, which indicates the average number of contacts between infected and susceptible individuals.

If you think of the very first and highest-impact measure taken by governments in the current situation, which is social distancing, you can now see the rationale behind it: social distancing simply reduces c, which leads to a reduction in Beta too. If Beta decreases, Ro does the same, hopefully falling below 1.
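As a rough illustration of this mechanism, the sketch below computes Ro = (b × c) / Gamma for two contact rates. The values of b and c are made up purely for the example (they are not estimates for COVID-19), and the function name `reproduction_number` is hypothetical; the 3.6-day recovery period is the one used later in this article.

```python
def reproduction_number(b, c, recovery_days):
    """Ro = Beta / Gamma, with Beta = b * c and Gamma = 1 / recovery_days."""
    beta = b * c                  # transmission rate
    gamma = 1.0 / recovery_days   # recovery rate
    return beta / gamma


b = 0.05              # hypothetical probability of transmission per contact
recovery_days = 3.6   # recovery period used later in the article

for c in (15, 5):     # hypothetical daily contacts, before and after distancing
    ro = reproduction_number(b, c, recovery_days)
    print(f"c={c}: Ro={ro:.2f}")
```

With these invented numbers, cutting the daily contacts from 15 to 5 moves Ro from roughly 2.7 to roughly 0.9, that is, from a growing epidemic to a shrinking one.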

Now, let’s apply what was said above to the current situation of COVID-19 (imagining a scenario in which the disease causes no deaths).

Over the past weeks, daily data have provided some information about the parameters of our model. First of all, current estimates of Ro are around 2.5, but they differ across countries. Hence, in our simulation we will consider a range of 2.5-4, so that we can simulate different scenarios. Furthermore, the recovery period is estimated to be 3.6 days, which corresponds to Gamma = 1/3.6 ≈ 0.28 per day.

Having Ro and Gamma, we can easily compute Beta as Ro*Gamma.
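For the two ends of the range considered here, the arithmetic can be checked with a few lines of Python. This is only a sketch, and the helper name `beta_from_ro` is my own.

```python
def beta_from_ro(ro, recovery_days):
    """Transmission rate implied by Ro and the recovery period (Beta = Ro * Gamma)."""
    gamma = 1.0 / recovery_days
    return ro * gamma


recovery_days = 3.6
print(f"Gamma = {1.0 / recovery_days:.3f}")   # Gamma = 0.278
for ro in (2.5, 4.0):
    print(f"Ro = {ro}: Beta = {beta_from_ro(ro, recovery_days):.3f}")
# Ro = 2.5: Beta = 0.694
# Ro = 4.0: Beta = 1.111
```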

I ran my first simulation using N=7.6 billion (an estimate of the world population) and starting with only one infected individual, so that I=1.

It is quite scary to see how, after only 60 days, the number of infected jumped to almost 2 billion.

Now let’s examine a different scenario, namely Ro=4 (which, unfortunately, is not the upper bound of the current estimates):

Here the number of infected, after less than 30 days, jumped to almost 3 billion! And remember that we only had 1 infected individual at time 0.
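For readers who want to experiment, here is a minimal sketch of how simulations like the two above could be reproduced. It assumes the standard SIR equations and a fixed-step fourth-order Runge-Kutta integrator; the function name `simulate_sir`, the step size and the time horizon are illustrative choices, not necessarily the setup behind the plots.

```python
def simulate_sir(r0, recovery_days=3.6, n=7.6e9, i0=1.0, days=150.0, dt=0.05):
    """Integrate the SIR model with fixed-step RK4.

    Returns a list of (t, S, I, R) tuples, one per time step.
    """
    gamma = 1.0 / recovery_days
    beta = r0 * gamma

    def rhs(state):
        s, i, _ = state
        new_infections = beta * s * i / n
        recoveries = gamma * i
        return (-new_infections, new_infections - recoveries, recoveries)

    def shifted(state, slope, h):
        # State advanced by h along the given slope (used for the RK4 stages).
        return tuple(x + h * dx for x, dx in zip(state, slope))

    state = (n - i0, i0, 0.0)
    trajectory = [(0.0, *state)]
    for step in range(1, int(round(days / dt)) + 1):
        k1 = rhs(state)
        k2 = rhs(shifted(state, k1, dt / 2))
        k3 = rhs(shifted(state, k2, dt / 2))
        k4 = rhs(shifted(state, k3, dt))
        state = tuple(
            x + dt / 6 * (a + 2 * b + 2 * c + d)
            for x, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
        trajectory.append((step * dt, *state))
    return trajectory


for r0 in (2.5, 4.0):
    traj = simulate_sir(r0)
    t_peak, _, i_peak, _ = max(traj, key=lambda row: row[2])
    print(f"Ro={r0}: peak of {i_peak / 1e9:.2f} billion infected around day {t_peak:.0f}")
```

I used a hand-written integrator only to keep the sketch free of dependencies; a library ODE solver would do the same job with less code.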

From the pictures above, you can see that, if not addressed, the virus starts to die out on its own only after the number of people infected at the same time has peaked at a huge share of the world population (almost 2 to 3 billion in these scenarios).

Note that the simulation above by no means depicts the actual spread of COVID-19: many features (like the mortality rate, quarantine measures etc.) are missing. The only aim of the simulation was to show how a virus with the same reproductive number as COVID-19 (and the same recovery rate) can spread exponentially in the absence of interventions. That’s why it is pivotal to contain the phenomenon and keep practicing social distancing, so that we can lower c and, consequently, bring Ro below 1.


Published by valentinaalto

I'm a 22-year-old student based in Milan, passionate about everything related to Statistics, Data Science and Machine Learning. I'm eager to learn new concepts and techniques as well as share them with whoever is interested in the topic.
