Applied statistics: Figures of a pandemic.

Slide 0

Hello everyone and welcome to a new episode in which we’ll talk about mathematics. This time it is a very special episode because we will deal exclusively with figures that became very important during the Covid-19 pandemic.

Slide 1

We will again address exponential functions, which we already looked at in detail in the last episode. This time we’ll cover them in the specific context of the pandemic. You have also already heard of incidence and the R value. We’ll also look at these terms in detail and based on examples.

Slide 2

Let’s consider what the exponential functions have to do with the Covid-19 pandemic.

Let’s assume that at the beginning of a pandemic, there are 10 sick people in a city. Now let’s assume that this number doubles every week. You see the numbers. After seven weeks, there would be 1,280 sick people, and after twice that time, that is, after fourteen weeks, there would already be 163,840 sick people, which is more than 100 times as many.

There would already be more than one million after week 17.
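The doubling model on this slide can be checked with a few lines of Python. This is only a minimal sketch: the starting value of 10 and the weekly doubling come from the text, while the function name `sick_after` is made up for this illustration.

```python
# Sketch of the doubling model: 10 sick people at the start,
# and the number doubles every week (both values taken from the text).
def sick_after(weeks, initial=10, factor=2):
    """Number of sick people after the given number of weeks."""
    return initial * factor ** weeks

# Print a few selected weeks of the model
for week in (0, 7, 14, 16, 17):
    print(f"week {week:2d}: {sick_after(week):>9,} sick people")

# Find the first week in which the model exceeds one million sick people
week = 0
while sick_after(week) <= 1_000_000:
    week += 1
first_week_over_million = week
print("first week above one million:", first_week_over_million)
```

Changing `factor` to a value such as 1.5 shows how strongly the result depends on the weekly growth factor.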

Slide 3

But no, reality is generally not as unpleasant as mathematical theory.

However, there are notable connections between theory and practice. We want to look at this based on an example.

Slide 4

As is so often the case, in practice we must go into detail. We were forced to acknowledge that the number of newly infected persons climbed worldwide for weeks. However, real life of course doesn’t behave quite as smoothly as the model that we just saw.

Let’s go to Spain in summer 2021. Here, there were this many new cases of Covid on the following dates (rounded, and with no guarantee that the numbers are correct):

06/23/2021    3,400 new infections

06/30/2021    4,400 new infections

07/06/2021    10,600 new infections

07/13/2021    15,400 new infections

07/20/2021    27,400 new infections

These are alarming numbers that conceal much personal suffering.

Slide 5

Now let’s look at the mathematics behind the figures. This is clearly not linear growth, in which a more or less fixed number would be added every week. In concrete terms, there were 1,000 more new infections in week 2 than in week 1, about 6,000 more from week 2 to week 3, about 5,000 more from week 3 to week 4, and 12,000 more from week 4 to week 5.

At the right, you see that these numbers follow a varying pattern because the resulting factor isn’t always the same, meaning that you don’t always multiply by the same number. For instance, the new infections rose by a factor of 1.3 from the first to second week, by a factor of 2.4 from the second to the third week, by a factor of 1.5 from the third to the fourth week, and by a factor of 1.8 from the fourth to the fifth week.
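The week-to-week differences and factors for Spain can be recomputed from the five rounded figures given earlier. The following is a small sketch; the variable names are chosen here for illustration only.

```python
# Weekly new infections in Spain, summer 2021 (rounded figures from the text).
cases = [3_400, 4_400, 10_600, 15_400, 27_400]

# Absolute change between consecutive data points (what linear growth would keep constant)
differences = [later - earlier for earlier, later in zip(cases, cases[1:])]

# Growth factor between consecutive data points (what exponential growth would keep constant)
factors = [round(later / earlier, 1) for earlier, later in zip(cases, cases[1:])]

print("differences:", differences)
print("factors:    ", factors)
```

Neither list is constant, which matches the observation that the real data follow neither a clean linear nor a clean exponential pattern.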

Slide 6

A small consolation for Spain: Afterwards the number dropped again, and once more I calculated the corresponding factors. They are all less than 1 – obviously, because a decrease isn’t possible with a factor greater than 1.

We refer to this as exponential decay – and during the first three weeks it even took place like a textbook example with the constant factor of 0.9.

Slide 7

Absolute numbers are sometimes difficult to interpret because, in this case, they depend on the population of a country. For this reason, we often resort to relative numbers. This leads us to the incidence.

We saw this before in an earlier episode. You divide the number of new infections over the last seven days by the population and multiply this number by 100,000. That last operation is nothing more than shifting the decimal point to the right five places – one place for each zero in the number 100,000.

Germany has a population of about 83 million. If you divide the number of new infections, 92,877, by 83 million, you get about 0.00112 and accordingly an incidence of 0.00112 × 100,000 = 112 for the week in March shown here.

A reminder that the figures shown here are somewhat old. However, since the situation changes constantly anyway, these numbers still show very clearly how the calculation works.
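The same calculation can be written as a short Python sketch. The figures for Germany are the ones quoted in the text; the function name `incidence` and its parameter names are assumptions made for this example.

```python
def incidence(new_infections_7_days, population, per=100_000):
    """Seven-day incidence: new infections per `per` inhabitants."""
    return new_infections_7_days / population * per

# Figures from the text: 92,877 new infections in one week,
# population of Germany about 83 million.
germany = incidence(92_877, 83_000_000)
print(round(germany))
```

Because the result is relative to the population, it can be compared between countries or regions of very different size.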

Slide 8

The incidence thus shows how many people per 100,000 population tested positive for the virus in the course of seven days.

Why do we use the value over seven days? Quite simply, because the figures always fluctuate, for example, due to weekends or holidays. As a result, this value that takes a full week into account is more reliable.

One more thing regarding incidence: Using the same calculation steps (and other figures, of course), we could also determine how many people are hospitalized with an infection relative to the population. In this case as well, the relative numbers that take into account the population are interesting. And it is also important to balance out fluctuations on particular days, thus to observe the situation over a certain time period.

Slide 9

Let’s look at the R value, another number related to Covid.

It is referred to as the reproduction number. It indicates how many other people catch the virus from one infected person. So, R = 1.25 means that 100 sick people will infect another 125 people. The value R = 0.75 means that 100 sick people will infect another 75 people.
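As a rough illustration, the two R values from the text can be applied to 100 sick people in Python. The first step of each chain reproduces the numbers above. The further steps assume that R stays constant from one generation of infections to the next, which is a simplification made only for this sketch; the function name is likewise hypothetical.

```python
def next_generation(infected, r_value):
    """People infected by the current group, given the reproduction number."""
    return infected * r_value

for r_value in (1.25, 0.75):
    infected = 100
    chain = [infected]
    # Assumption for illustration only: R does not change over three generations.
    for _ in range(3):
        infected = next_generation(infected, r_value)
        chain.append(infected)
    print(f"R = {r_value}: {chain}")
```

With R above 1 the chain grows from step to step, and with R below 1 it shrinks, which is the link back to exponential growth and decay.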

Caution: The reproduction number is an estimated value. It cannot be determined exactly. Accordingly, in the examples that we just looked at, the stated figure will only roughly coincide with reality. And above all, the estimated R values of 1.05 and 0.95 are not necessarily different in reality.

The value can only be determined up to an error, which can itself be estimated. The resulting range is called the confidence interval. We’ll talk about that in detail in a later episode, but let us attempt a first approximation here.

Slide 10

What is a confidence interval?

First, it is a key term in statistics, but not exactly easy to understand.

It reflects the fact that in this mathematical field, there is normally no simple right or wrong, but in most cases a person can only approximate the facts.

Imagine that you poll 1,000 randomly selected preschoolers on what their favorite color of gummy bear is. In this survey, 450, that is 45%, choose orange gummy bears. It is then reasonable to expect that another representative random sample would result in a similar value – perhaps 43%, perhaps 44%, perhaps also 46% or 48% or 49%.

In other words: There are many indications that a second measured value would lie in a rather narrow interval around 45%. But, of course that isn’t certain; it is again only probable.

In practice, people usually provide an interval so that the correct value has a 95% probability of being within that interval. However, this also means that it has a 5% probability of lying outside the interval. Once again: at this point we simply have to accept the uncertainty.

Unfortunately, we cannot simply calculate the confidence interval. In particular, it depends on the size of the random sample. With regard to Covid-19, we would therefore need information on how many new people caught Covid from each infected person. Very clearly, we are dealing with numbers that are difficult to gather.
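For a simple survey such as the gummy bear poll, where the sample size is known, a rough 95% interval can at least be sketched. The code below uses the normal approximation for a proportion, a common textbook method that is an assumption here and not something the episode itself introduces; the function name and the smaller comparison survey of 100 children are also made up for illustration.

```python
import math

def approx_confidence_interval(successes, sample_size, z=1.96):
    """Approximate 95% interval for a share, using the normal approximation."""
    share = successes / sample_size
    standard_error = math.sqrt(share * (1 - share) / sample_size)
    margin = z * standard_error
    return share - margin, share + margin

# Gummy bear survey from the text: 450 of 1,000 children chose orange.
low, high = approx_confidence_interval(450, 1_000)
print(f"n = 1000: {low:.1%} to {high:.1%}")

# Hypothetical smaller survey with the same share, to show the effect of sample size.
small_low, small_high = approx_confidence_interval(45, 100)
print(f"n =  100: {small_low:.1%} to {small_high:.1%}")
```

For the 1,000 children the interval comes out at roughly 42% to 48%. The smaller sample gives a clearly wider interval, which illustrates why the size of the random sample matters so much.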

Slide 11

We can state it roughly like this: The confidence interval is a range around a measured value within which the true value very likely lies.

Slide 12

Let’s review briefly. We have covered three terms.

Number 1: Exponential growth. Clearly extremely dangerous when it applies to the number of infected people. It is good if it is recognized.

Number 2: Incidence. Is an important indicator of the course of a pandemic. But a word of caution: such numbers can never actually be precisely measured. Testing errors occur in both directions (false positives and false negatives). And some infected people were never tested.

Number 3: Reproduction number. This is an approximate value that could deviate downward or upward. This uncertainty must be taken into consideration when the value is interpreted.

Yeah, as already briefly mentioned, there are other indicators of the course of a pandemic, such as the number of sick people in intensive care units or the number of people who suffer long-term effects. But they also have something in common: The necessary data are not easy to collect and are associated with a degree of uncertainty. No sound evaluation can get by here without statistics.

Slide 13

That’s all for today. It’s nice that you were here. Thank you for your interest.
