Using statistics in everyday life: A story with happy fish sticks.

Slide 0

I hope you're having a wonderful day. Welcome. Today we'll talk about statistics and how to interpret results correctly. This is certainly a serious matter; just the same, we're going to begin with a not-so-serious example.

Slide 1

Sit back and enjoy a story that is as imaginary as it is frivolous.

Once again, it's the principle that matters, and you can understand the principle just as well with stories that aren't quite so serious. But never fear: at the end of the episode we'll show a (halfway) serious example. Seahorses play a leading role in the story. In the photo you see an especially beautiful specimen, in my opinion.

Slide 2

Some time ago, the Süddeutsche Zeitung newspaper reported on a new meat scandal in its Streiflicht [sidelight] column, which doesn't always tell serious stories. The question was whether the fish sticks loved by many children really contained meat from seahorses. Seahorses are much loved too, although preferably alive and swimming.

We conducted interviews about this with experts – you see them pictured here. And indeed, these experts assume that about 1% of the fish sticks are contaminated, thus one in one hundred fish sticks contains meat from seahorses.

Slide 3

The good news: there is an at-home test that detects whether seahorse meat was mixed into the fish sticks. If it finds any, it triggers a seahorse alarm.

While the test detects the mixed-in seahorse meat in most cases, it doesn’t always. But still, it yields the correct result in 90% of the cases. Is that sufficient to be on the safe side?

Slide 4

Assume we conduct the test with one fish stick and receive a positive test result, thus the dreaded seahorse alarm. Of course, this test result can also be wrong, meaning it can be a false positive.

What is the probability that a fish stick really contains seahorse meat when it tests positive?

Slide 5

Take a guess and write down the result.

Assume we conduct the test and receive a positive test result. Is the fish stick then definitely contaminated? Does this apply in about 90% of cases? Or only in about 50% of cases? Or is it even fewer than 10% of cases in which a positive test result is correct and the fish stick actually contains seahorse meat?

Slide 6

Let’s approach the matter systematically – as we have done so often in the past. If 1% of the fish sticks are contaminated, that’s one in 100 or 10 in 1,000. In this example, we’ll calculate for 1,000 fish sticks. Then you will find seahorse meat in 10 fish sticks and 990 fish sticks will not contain it.

Slide 7

Now, the test detects the correct result in 90% of the cases, which is 90 out of 100 or 900 out of 1,000 or 9 out of 10. Of the ten fish sticks that contain seahorse meat, nine are correctly detected. But one slips through. It is incorrectly tested as negative.

What is the situation for the other 990 fish sticks? Similar, of course. 90% will test negative, which is 891 fish sticks; 10% – and that is 99 – will incorrectly test positive.

If you add up 99 and 9, then 108 fish sticks are tested positive – which includes correct positives and false positives.

Slide 8

But only a few fish sticks are correct positives, precisely nine. So, we divide 9 by 108 and see that their share is only about 8%. Amazing, right?
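The counting argument above can be sketched in a few lines of Python. The function name and structure are my own illustration; the numbers are the ones from the slides.

```python
def true_positive_share(total, contaminated, accuracy):
    """Share of positive test results that are genuine,
    assuming the test is correct with the same probability
    for contaminated and clean fish sticks."""
    clean = total - contaminated
    true_pos = contaminated * accuracy  # alarms that are justified
    false_pos = clean * (1 - accuracy)  # clean sticks that still alarm
    return true_pos / (true_pos + false_pos)

# 1,000 fish sticks, 10 contaminated, test correct 90% of the time:
share = true_positive_share(1000, 10, 0.9)
print(round(share * 100, 1))  # about 8.3 percent, i.e. 9 out of 108
```

The same function works for any prevalence and any test accuracy, which is exactly why the later COVID-19 example can be calculated the same way.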

Slide 9

What did you guess? If you weren't already familiar with the problem, you probably guessed much higher. As is so often the case with questions regarding statistics, our perception plays a trick on us, and we intuitively underestimate the rather high number of false positive results.

Slide 10

Where is the problem? Surely, specialists handle statistics properly, right?

No, it doesn't seem to be quite so simple. And then the evaluation definitely becomes a serious matter. In one study, for example, 60 percent of the surveyed doctors answered a similar question about diagnosing diseases incorrectly.

We’ll look at this once more in detail based on a very serious example.

Slide 11

This isn't anything new: working with data unfortunately doesn't automatically mean handling it sensibly and appropriately. However, the importance of handling data correctly in everyday life is undisputed.

Slide 12

COVID-19, again. Even if the topic might be annoying, it serves us well, especially for lessons in statistics.

The reliability of rapid tests has been discussed many times. To evaluate the consequences, it can't hurt to apply statistical methods systematically once again.

False positive test results are also possible in this case. This means that a person has received a positive test result but is not actually infected with SARS-CoV-2, the virus that causes COVID-19.

Many factors play a role in a test's reliability, one of which is the viral load of the affected person. Naturally, we cannot model all of this here, so let's go through the situation in simple terms, using very, very simplified assumptions.

Slide 13

The question is: What is the probability that a person who has tested positive for COVID-19 is actually infected with the pathogen of the disease? To evaluate this, we make a couple of basic assumptions.

We assume an incidence of 1,200 and see what happens when a test shows the correct result with a probability of 75%, 99%, or even 99.9%. By the way, good tests under good conditions really do achieve very high accuracy rates.

Slide 14

We approach the matter systematically. Let’s assume that we’re in City A that has a population of 100,000.

Currently, the seven-day incidence rate there is 1,200, meaning that of the 100,000 people, exactly 1,200 have become infected with the virus in the past seven days. We also assume, not very realistically of course, that all 100,000 people in City A actually get tested.

Oh yeah, take another guess. Maybe it will turn out better this time.

Slide 15

A seven-day incidence rate of 1,200 means that 1,200 of 100,000 or 1.2 of 100, thus 1.2% of the people, have become infected. We assume that the test detects the correct result in 75% of the cases. We then determine the relevant numbers exactly as we did with the fish sticks.

Of 100,000 people, 1,200 are infected and 98,800 are not. No matter whether a person is infected or not, the test yields the correct result in 75% of the cases, which is three-fourths. Consequently, of the 1,200 infected persons, 900 are correctly identified. However, 300 receive a false negative result. Likewise, of the 98,800 uninfected people, three-fourths are correctly detected. This means that 74,100 of these persons receive a correct negative result. However, one-fourth – which is 24,700 people – receive an incorrect, namely positive result. In total, 24,700 + 900 = 25,600 people receive a positive test result.
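This whole breakdown can be checked quickly in Python. The numbers are those from the slide; the variable names are my own.

```python
population, infected, accuracy = 100_000, 1_200, 0.75
healthy = population - infected        # 98,800 uninfected people

true_pos = infected * accuracy         # 900 correct positives
false_neg = infected - true_pos        # 300 false negatives
true_neg = healthy * accuracy          # 74,100 correct negatives
false_pos = healthy - true_neg         # 24,700 false positives

total_positive = true_pos + false_pos  # 25,600 positive results in total
print(int(total_positive), true_pos / total_positive)  # about 3.5% genuine
```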

Slide 16

Therefore, with such a test, only about 3.5% of those who receive a positive test result are actually infected with the virus.

Once more: our sense of such situations is not actually well developed. It makes sense to sketch the situation precisely and calculate accordingly.

Slide 17

Assume that the test works more reliably and shows a correct result in 99% of the cases. I don’t want to repeat all the numbers again right now; you can calculate them yourself. I came up with a total of 2,176 people with positive test results, of whom 1,188 are actually infected.
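If you want to spare yourself the arithmetic, here is a sketch of the check for the 99% case, with my own variable names.

```python
infected, healthy, accuracy = 1_200, 98_800, 0.99

true_pos = infected * accuracy         # 1,188 correctly positive
false_pos = healthy * (1 - accuracy)   # 988 falsely positive
total_positive = true_pos + false_pos  # 2,176 positive results in total
print(round(total_positive), round(true_pos / total_positive, 3))  # about 0.546
```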

Slide 18

Thus, we're now at about 54.6% – approximately half, still a sobering value. It means that nearly half of the people who test positive actually receive a false positive result.

Slide 19

Let’s do another experiment in a nearly perfect world. Assume that the test works even more reliably and shows a correct result in 99.9% of the cases.

Then, a total of 1,298 people test positive, of whom almost 1,200 actually carry the virus.
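The 99.9% case can be checked the same way. Note that these are expected counts, so they come out as fractional people before rounding; the variable names are again my own.

```python
infected, healthy = 1_200, 98_800

true_pos = infected * 0.999   # 1,198.8 expected correct positives
false_pos = healthy * 0.001   # 98.8 expected false positives
total_positive = true_pos + false_pos
# about 1,298 positives, of which roughly 92% are genuine:
print(round(total_positive), round(100 * true_pos / total_positive, 1))
```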

Slide 20

The accuracy rate increases to over 90%. However – and this is important – even in a nearly perfect world, it cannot be 100% if the testing reliability is below 100%.

Slide 21

Please don’t forget that we have worked with a very simple model and that reality is far more complex. Obviously, the calculation depends on many factors. For instance, how many people get tested is important and also whether a lot of symptom-free people undergo a test.

We just wanted to explain the principle ... and show that mathematical models are by no means rocket science. Mathematics is not mystical; rather, it follows rational considerations that are often quite easy to understand. Believe me: if you do the calculations step by step, you will very often succeed.

Slide 22

That’s it for today. Many thanks for being here. I look forward to the next time.
