30 It’s all about the goal: One-tailed and two-tailed significance tests.

Slide 0

Welcome to this last episode about statistics and probability. Today’s episode is about the significance of a statistical result. 

Slide 1

In particular, we will again look at hypothesis testing.

We want to arrive at a well-founded, statistically reliable statement about whether a hypothesis about the general population should be rejected or not.

We already saw in the previous episode that errors can occur when a hypothesis is rejected or not rejected. We would therefore like to know a little more about a potential error. The tools for this are one-tailed and two-tailed significance tests.

We’ll now look more closely at this.

Slide 2

As is almost always the case, a systematic approach should be helpful.

When we decide to reject or accept an initial hypothesis H0, four combinations are possible: the hypothesis is either true or false, and we either accept or reject it.

If the hypothesis is actually correct and we accept it after the test, then everything has gone well. The same thing applies if the hypothesis is not correct and we reject it after testing. In the two other cases, however, we are very obviously making an error.

Slide 3

Let’s take the following hypothesis as an example: A bright green face mask reliably protects a person from influenza. If that is actually the case and we erroneously reject the hypothesis, then hopefully people will protect themselves from influenza using other means. However, if in reality a bright green face mask provides no protection from influenza and people nevertheless assume it is effective, then the error is definitely more serious. With such masks, people will erroneously think they are on the safe side.

At any rate, we can conclude that, in general, very different errors are at stake. We therefore label them differently and speak of a type 1 error when we erroneously reject a true hypothesis and of a type 2 error when we erroneously accept a false hypothesis.

Slide 4

Let’s get to the significance tests; the type 1 error plays the key role here.

In a significance test, we limit the probability that a null hypothesis is erroneously rejected, thus the probability that a type 1 error is made. A significance level α is specified for this purpose.

We distinguish two possibilities that we will first look at rather technically. These are the lower-tail significance test and the upper-tail significance test. But don't be afraid if it looks somewhat abstract; clear, concrete examples will follow immediately:

A lower-tail significance test checks for a deviation in the direction of smaller values. In this case, we have a null hypothesis written as H0: p = p0 (or also H0: p ≥ p0), which is set against an alternative hypothesis written as H1: p < p0.

An upper-tail significance test checks for a deviation in the direction of greater values. In this case, we have a null hypothesis written as H0: p = p0 (or also H0: p ≤ p0), which is set against an alternative hypothesis written as H1: p > p0.

Slide 5

The basis for a one-tailed significance test is the TYPE 1 ERROR.

We specify a significance level α.

With null hypothesis H0, we test whether the real probability p for an event is at most as great as a defined p0.

The probability that the null hypothesis will be erroneously rejected should not be greater than the significance level α.
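To make the abstract description a little more tangible, here is a minimal sketch in Python of how the tail probabilities for a one-tailed test could be computed. It assumes that the number of observed events X is binomially distributed with the parameters n and p0. The function names (binom_pmf, lower_tail, upper_tail, reject_h0) and the numbers in the small demo are chosen here for illustration only and do not come from the slides.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for a binomially distributed X with parameters n and p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def lower_tail(k: int, n: int, p0: float) -> float:
    """P(X <= k | p = p0), the relevant probability when H1 is p < p0."""
    return sum(binom_pmf(i, n, p0) for i in range(k + 1))


def upper_tail(k: int, n: int, p0: float) -> float:
    """P(X >= k | p = p0), the relevant probability when H1 is p > p0."""
    return sum(binom_pmf(i, n, p0) for i in range(k, n + 1))


def reject_h0(tail_probability: float, alpha: float) -> bool:
    """Reject H0 only if the tail probability does not exceed alpha."""
    return tail_probability <= alpha


# Hypothetical numbers for illustration only (not from the slides):
# 50 students, 5 of them fail, and p0 = 0.2 as the usual failure rate.
tail = lower_tail(5, 50, 0.2)
print(f"P(X <= 5 | p = 0.2) = {tail:.4f}, reject H0: {reject_h0(tail, 0.05)}")
```

The point of the sketch is that the tail is always calculated under the assumption p = p0; only then is it compared with α.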

Does that sound much too abstract again? Don’t worry; we’ll look at examples.

Slide 6

The first example addresses the lower-tail significance test.

Generally, 20 percent of students will fail a mathematics test. This year, the curriculum has been adapted. We want to check whether a lower percentage of students has now failed the test (and we’re interested in only this change at the lower tail).

Slide 7

And now another example in which an upper-tail significance test would make sense.

A fruit merchant buys apricots at the wholesale market. He observes that 5 percent of the apricots have rotten spots and suspects that this proportion is higher than in the previous delivery (and he’s interested only in this change at the upper tail).

Slide 8

And of course, everything is a question of describing the problem specifically. You can also swap the lower and upper tails.

You can just as well assume that 80 percent of students have passed the test. The question is whether a few more have passed after the adaptation. In just the same way, the merchant can observe that 95 percent of the apricots were good and ask whether now fewer are good.

The only important aspect is to make this observation clear and unambiguous, and of course to do this before performing the significance test.

Slide 9

Let’s try it using specific numbers. Here is the exercise:

In grade 8a at the Marie Curie School, 6 of 32 students have failed a math test.  “Good grief, at least one-quarter have no idea about math,” the teacher groans.

Is that true? Actually, only a little over 18 percent failed the test. Can the statement be confirmed at a significance level of 5 percent?

First try to find a solution yourself, and keep in mind what we worked out in episode 29.

Slide 10

This is the initial hypothesis: “At least one-quarter” means H0: p ≥ p0 with p0 = 0.25. We set α = 0.05.

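And we can use these values to calculate, with random variable X = “number of students who fail” binomially distributed with the parameters 32 and 0.25. Indeed, P(X ≤ 3 | p = 0.25) ≈ 0.025 < 0.05 and P(X ≤ 4 | p = 0.25) ≈ 0.070 > 0.05.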

The highly emotional hypothesis should thus not be rejected. To reject the hypothesis at this significance level, at most three students could have failed the test. But maybe this class will do better on the next test ;-).
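If you would like to check the limit of three students yourself, here is a minimal sketch in Python. It assumes a lower-tail test with X binomially distributed with the parameters 32 and 0.25; the helper names binom_cdf and lower_critical_value are invented for this illustration.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for a binomially distributed X with parameters n and p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def lower_critical_value(n: int, p0: float, alpha: float) -> int:
    """Largest k with P(X <= k | p = p0) <= alpha, or -1 if there is none."""
    k = -1
    while k + 1 <= n and binom_cdf(k + 1, n, p0) <= alpha:
        k += 1
    return k


n, p0, alpha = 32, 0.25, 0.05
k_crit = lower_critical_value(n, p0, alpha)
observed_tail = binom_cdf(6, n, p0)  # 6 of 32 students failed

print(f"reject H0 only if at most {k_crit} students fail")
print(f"P(X <= 6 | p = 0.25) = {observed_tail:.3f}")
```

With 6 failing students, the observed tail probability is well above 5 percent, which is why the hypothesis is kept.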

Slide 11

Let’s stay with the topic and look at another math test. On the one hand, we understand problems of this sort right away, and on the other hand, they work really well for calculating.

In grade 8 at the Marie Curie School, 18 of 100 students have failed a math test. “Good grief, at least one-quarter have no idea about math,” the director groans.

Is that true? After all, it’s exactly 18 percent, so clearly less than 25 percent. Will it also work this time with a significance level of 5 percent?

Here as well, try to find a solution to the exercise.

Slide 12

This time as well, the hypothesis is p0 = 0.25. In the same way, we again set α = 0.05.

Indeed, in this case

P(X ≤ 18 | p = 0.25) ≈ 0.063 > 0.05.

The highly emotional hypothesis should thus not be rejected this time either. However, it looks tight, right?

For this reason, we will look at yet a third math test. Don’t worry; it’s the last one for today.

Slide 13

In various 8th grade classes, 180 of 1,000 students have failed a math test. “Good grief, at least one-quarter have no idea about math,” the secretary of education groans.

Is that true? Will it work again with a significance level of 5 percent?

Slide 14

Once again, the hypothesis is p0 = 0.25.  Likewise, we again set a significance level of α = 0.05.

Then, P(X ≤ 180 | p = 0.25) is far below α = 0.05.

The hypothesis should therefore be rejected in this case.

A type 1 error is possible in this case, but it’s rather improbable because

P(X ≤ 180 | p = 0.25) < 0.000001.

For this number, you won’t find any decimal places that are not zero until way after the decimal point.

Each time we had the same hypothesis, and each time the observed value was about 18 percent. Isn’t that interesting! Obviously, the size of the sample plays a major role.
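The three math tests can be put side by side in a short Python sketch. It assumes, as in the exercises, a lower-tail test with p0 = 0.25 and α = 0.05 and a binomially distributed number of failing students; the variable names are chosen freely for this illustration.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for a binomially distributed X with parameters n and p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


p0, alpha = 0.25, 0.05
# (number of students, number who failed) from the three exercises
samples = [(32, 6), (100, 18), (1000, 180)]

tail_by_n = {}
for n, failed in samples:
    tail = binom_cdf(failed, n, p0)
    tail_by_n[n] = tail
    decision = "reject H0" if tail <= alpha else "do not reject H0"
    print(f"n = {n:4d}: P(X <= {failed}) = {tail:.2e} -> {decision}")
```

The share of failing students is nearly the same in all three rows; only the sample size changes, and with it the decision.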

Slide 15

Let’s also look at the two-tailed significance test.

If a hypothesis test is about determining whether the probability of an event changes compared to the assumed value and the direction doesn't matter, then this is a two-tailed significance test.

Specifically: For a two-tailed significance test, we have a null hypothesis H0: p = p0, which is set against an alternative hypothesis H1: p ≠ p0.

Slide 16

Here is an example. We will flip a coin again and check whether a coin is fair, thus a coin suitable for a Laplace experiment.

We flip the coin 100 times, and we expect p = 0.5.

However, if “heads” turns up fewer than 45 times or more than 55 times, then, so we think, the coin cannot actually be fair, because then p ≠ 0.5 and that is obviously not okay.

However, we initially assume that the coin is fair. That is the initial hypothesis H0. We calculate the probability that the aforementioned event would occur. 

Accordingly, random variable X = “number of heads” is binomially distributed with the parameters 100 and 0.5.

Slide 17

Let’s calculate.

P(X < 45 or X > 55 | p = 0.5) ≈ 0.271

Clearly, the hypothesis that we have a fair coin cannot be rejected. We would be wrong with a probability of a bit over 27 percent. That is far too high to discard the null hypothesis.

Slide 18

The coin flip would probably have to end with results that deviate considerably more if the hypothesis of fairness is to be discarded.

Let’s use new values; “heads” should appear fewer than 40 times or more than 60 times. It looks like this:

P(X < 40 or X > 60 | p = 0.5) ≈ 0.035

The result is now distinctly clearer. The probability that the hypothesis will be wrongly rejected is now only about 3.5 percent.

In other words, in this case the risk is low of stating that the coin is “unfair” even though it actually is fair.
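Both coin calculations can be reproduced with a few lines of Python. The sketch assumes a fair coin, so that each sequence of n flips is equally likely and the probability can be obtained by counting; the function name prob_outside is invented for this illustration.

```python
from math import comb


def prob_outside(n: int, low: int, high: int) -> float:
    """P(X < low or X > high) for the number of heads X in n fair coin flips."""
    inside = sum(comb(n, k) for k in range(low, high + 1))
    return 1 - inside / 2**n


p_narrow = prob_outside(100, 45, 55)  # fewer than 45 or more than 55 heads
p_wide = prob_outside(100, 40, 60)    # fewer than 40 or more than 60 heads

print(f"P(X < 45 or X > 55) = {p_narrow:.3f}")
print(f"P(X < 40 or X > 60) = {p_wide:.3f}")
```

The wider the range of results we still tolerate, the smaller the risk of calling a fair coin “unfair”.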

Slide 19

In this case, a lot also depends on how large the sample is.

We’ll look again at a result in which “heads” comes up in fewer than 45 percent or more than 55 percent of the cases. Let’s flip the coin 1,000 times.

P(X < 450 or X > 550 | p = 0.5) ≈ 0.0014

Obviously, if such a result occurs, the hypothesis should now be rejected; the probability of error is low.

Do you remember the law of large numbers? It shows up here in the practical calculations.
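Here is a minimal Python sketch of how the same relative deviation (fewer than 45 percent or more than 55 percent heads) behaves for different numbers of flips. It assumes a fair coin; the function name prob_outside_band and the use of integer percentages are choices made for this illustration.

```python
from math import comb


def prob_outside_band(n: int, low_pct: int = 45, high_pct: int = 55) -> float:
    """P(share of heads below low_pct % or above high_pct %) in n fair flips."""
    low = n * low_pct // 100    # e.g. 450 for n = 1000
    high = n * high_pct // 100  # e.g. 550 for n = 1000
    # Count the outcomes in both tails directly to keep small values precise.
    below = sum(comb(n, k) for k in range(0, low))
    above = sum(comb(n, k) for k in range(high + 1, n + 1))
    return (below + above) / 2**n


for flips in (100, 1000):
    print(f"n = {flips:4d}: probability = {prob_outside_band(flips):.4f}")
```

For 100 flips such a deviation is quite ordinary for a fair coin; for 1,000 flips it is already very unlikely.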

Slide 20

Obviously, you could also approach this in the other direction and specify a value for the maximum permitted probability of error – and that is the significance level.

So let’s select α = 0.01 and n = 100, thus a maximum probability of error of 1 percent.

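Then P(X < 37 | p = 0.5) ≈ 0.0033 and

P(X > 63 | p = 0.5) ≈ 0.0033, so

P(X < 37 or X > 63 | p = 0.5) ≈ 0.0066 < 0.01.

Widening the rejection region any further would exceed this α; so for results in the range [37, 63], we would not reject the hypothesis of a fair coin.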
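The search for this range can be sketched in Python. The sketch assumes that α is split evenly between the two tails (α/2 on each side), which is a common convention but is not stated explicitly in the episode; the function names are invented for this illustration.

```python
from math import comb


def fair_coin_cdf(k: int, n: int) -> float:
    """P(X <= k) for the number of heads X in n fair coin flips."""
    return sum(comb(n, i) for i in range(k + 1)) / 2**n


def acceptance_region(n: int, alpha: float) -> tuple[int, int]:
    """Range [lo, hi] of results for which a fair coin is not rejected.

    Assumption: each tail of the rejection region may have at most
    probability alpha / 2.
    """
    k = -1
    while fair_coin_cdf(k + 1, n) <= alpha / 2:
        k += 1
    # Now k is the largest value with P(X <= k) <= alpha / 2.
    # By symmetry, the upper rejection region starts at n - k.
    return k + 1, n - k - 1


lo, hi = acceptance_region(100, 0.01)
print(f"do not reject H0 for results in [{lo}, {hi}]")
```

With α = 0.05 instead of 0.01, the same function yields a narrower range, because we then accept a higher risk of a type 1 error.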

Slide 21

Let’s look at the possible errors again. As we have already seen, we can easily determine them using a systematic approach. By the way, we also talk about an α error if we mean a type 1 error and about a β error if we mean a type 2 error.

Here you see the corresponding table again.

Slide 22

Let’s apply a systematic approach to the example of the fair coin.

The initial hypothesis is: The coin is fair and especially genuine. For an α error, this hypothesis will be wrongly rejected. Then we’re throwing good money into the trash can. For a β error, this hypothesis will be wrongly accepted. Then we’re putting counterfeit money into circulation. Judge for yourself which is more forgivable.
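To get a feeling for the size of the two errors, here is a minimal Python sketch. It uses the range [37, 63] from the example with α = 0.01 and n = 100. The β error can only be calculated for a specific alternative, so the sketch assumes a hypothetical “counterfeit” coin with p = 0.6; this value is an assumption for illustration and does not appear in the episode.

```python
from math import comb


def prob_in_range(lo: int, hi: int, n: int, p: float) -> float:
    """P(lo <= X <= hi) for a binomially distributed X with parameters n and p."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(lo, hi + 1))


n, lo, hi = 100, 37, 63  # H0 "the coin is fair" is kept for results in [37, 63]

# Alpha error: the coin is fair (p = 0.5), but the result falls outside the range.
alpha_error = 1 - prob_in_range(lo, hi, n, 0.5)

# Beta error: the coin is biased, but the result still falls inside the range.
biased_p = 0.6  # hypothetical value, assumed for this illustration only
beta_error = prob_in_range(lo, hi, n, biased_p)

print(f"alpha error: {alpha_error:.4f}")
print(f"beta error for p = {biased_p}: {beta_error:.2f}")
```

The sketch shows the trade-off: a very small α error is paid for with a large β error as long as the sample remains small.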

Slide 23

Usually, a hypothesis (which encompasses the opposite of what you want to prove) is not rejected until the probability of error is less than or equal to 5 percent (“significant”) or often less than or equal to 1 percent (“highly significant”).

But to be very clear:  These are agreed values that you could select in other ways.

We speak in general of the significance level.

Slide 24

You know that today is the last episode in which we are talking about mathematics. This means we can’t leave the gummy bears out.

On the left side, you see my balance of gummy bears yesterday at this time. In statistical terms, everything was okay; the individual colors did not deviate significantly from the one-sixth that we would expect.

Then someone who likes green and dark red gummy bears the best visited me. You see the balance after the visit on the right. Did my visitor cause a statistically significant deviation?

Slide 25

Calculate for yourself. Seventy gummy bears remain in total.

We assume a probability of p = 1/6 for each of the six colors.

At a significance level of α = 0.05 = 5 percent, only the number of white gummy bears is actually significantly too high. And at a level of 1 percent, none of the colors is represented significantly too often or too rarely anymore.

That’s just the way it is: If the sample is too small, then huge deviations are needed so that they prove to be statistically significant.
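The counts per color are shown only on the slide, so they are not repeated here. What can be sketched in Python is the range of counts that would not be significant for a single color. The sketch assumes n = 70 and p = 1/6 as in the exercise, treats each color separately as binomially distributed, and splits α evenly between both tails; the even split and the function names are assumptions made for this illustration.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for a binomially distributed X with parameters n and p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def not_significant_range(n: int, p: float, alpha: float) -> tuple[int, int]:
    """Counts [lo, hi] that do not deviate significantly from the expected n * p.

    Assumption: each tail may have at most probability alpha / 2.
    """
    lo = 0
    while binom_cdf(lo, n, p) <= alpha / 2:
        lo += 1  # counts up to lo - 1 are significantly too low
    hi = n
    while 1 - binom_cdf(hi - 1, n, p) <= alpha / 2:
        hi -= 1  # counts from hi + 1 upward are significantly too high
    return lo, hi


n, p = 70, 1 / 6
for alpha in (0.05, 0.01):
    lo, hi = not_significant_range(n, p, alpha)
    print(f"alpha = {alpha}: counts from {lo} to {hi} are not significant")
```

You can compare the counts on the slide with these ranges; the range for 1 percent is at least as wide as the one for 5 percent.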

Slide 26

That was it, and not only for today; that was the training on statistics and probability. Working with you was a lot of fun for me. I really hope that you have been able to gather many ideas for your teaching and will implement them. Good-bye, take care, and don’t forget: Almost everything in life is uncertain; you should simply try to better understand these uncertainties. How good it is that mathematics is a reliable aid in this.
