Section 5.3 Hypothesis Tests for a Proportion
Testing Claims About a Proportion.
The second type of hypothesis test that we will study is the test for a population proportion. The following situations will provide the examples for this section.
A pharmaceutical company has developed a new drug for the common cold. They claim that this drug shortens the duration of the cold in 80% of individuals. A physician believes that this proportion is inflated. He finds 115 individuals who have just developed a cold, administers the drug, and observes that 89 of them recover more quickly than is typical.
A government official claims that 50% of cars on the road have under-inflated tires, and are therefore getting less-than-optimal gas mileage. To test this claim, a random sample of 600 cars is selected and 279 of them are found to have under-inflated tires.
A scout leader claims that being a Boy Scout promotes character development and respect for the law. In fact, he claims that fewer than 10% of boys who spend at least one year in a Boy Scout troop will get into trouble with the law before their 20th birthday. To test this claim, you take a random sample of 80 names from the rosters of Boy Scout troops fifteen years in the past and find that 11 of these former Boy Scouts have had trouble with the law.
Objectives
After finishing this section you should be able to
- describe the following terms:
  - Hypotheses for a Single Population Proportion
  - Test Statistic for a Single Population Proportion
- accomplish the following tasks:
  - Formulate null and alternative hypotheses for tests of a single proportion.
  - Compute the test statistic for a single proportion.
  - Use this test statistic to conduct a traditional hypothesis test.
  - Use this test statistic to conduct a p-value hypothesis test.
  - Understand and identify type I and type II errors.
Subsection 5.3.1 Formulating Hypotheses
We start once more by identifying the null and alternative hypotheses. When testing a claim about a single population proportion, the three basic types of null/alternative hypothesis combinations are as follows.
Principle 5.3.1. Hypotheses for a Single Population Proportion.
To test a claim about a single population proportion, we use one of the following sets of hypotheses, where \(p_0\) is a given value.
- Left-Tailed.
\begin{align*} H_0\amp:\ p \geq p_0\\ H_A\amp:\ p \lt p_0 \end{align*}
- Two-Tailed.
\begin{align*} H_0\amp:\ p = p_0\\ H_A\amp:\ p \not= p_0 \end{align*}
- Right-Tailed.
\begin{align*} H_0\amp:\ p \leq p_0\\ H_A\amp:\ p \gt p_0 \end{align*}
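The rule behind these three cases can be sketched in a few lines of Python. This is only an illustration: the function name `formulate_hypotheses` and the way a claim is encoded as a relation string are our own assumptions, not part of the text. The idea is that a claim containing equality becomes the null hypothesis, while any other claim becomes the alternative.

```python
def formulate_hypotheses(claim_relation: str, p0: float) -> dict:
    """Build H0 and HA for a claim of the form 'p <claim_relation> p0'.

    Hypothetical helper for illustration; claim_relation is one of
    '<', '<=', '=', '!=', '>=', '>'.
    """
    # For each possible claim: (type of test, relation in H0, relation in HA).
    rules = {
        "<": ("left-tailed", ">=", "<"),
        ">=": ("left-tailed", ">=", "<"),
        "=": ("two-tailed", "=", "!="),
        "!=": ("two-tailed", "=", "!="),
        ">": ("right-tailed", "<=", ">"),
        "<=": ("right-tailed", "<=", ">"),
    }
    if claim_relation not in rules:
        raise ValueError(f"unsupported relation: {claim_relation!r}")
    tail, null_relation, alt_relation = rules[claim_relation]
    # A claim that includes equality is the null hypothesis;
    # any other claim is the alternative hypothesis.
    claim_role = "H0" if claim_relation in ("=", "<=", ">=") else "HA"
    return {
        "tail": tail,
        "H0": f"p {null_relation} {p0}",
        "HA": f"p {alt_relation} {p0}",
        "claim_role": claim_role,
    }


# The physician claims the true proportion is below 0.80.
print(formulate_hypotheses("<", 0.80))
# The official claims the proportion is exactly 0.50.
print(formulate_hypotheses("=", 0.50))
```

Encoding the claim as a relation keeps the decision mechanical, but translating a verbal claim into that relation still has to be done by the reader.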
Let's look at each of the three examples from the introduction and see if we can determine which of these sets of hypotheses should be used.
Example 5.3.2. Stating Hypotheses for a Left-Tailed Test.
A pharmaceutical company has developed a new drug for the common cold. They claim that this drug shortens the duration of the cold in 80% of individuals. A physician believes that this proportion is inflated. He finds 115 individuals who have just developed a cold, administers the drug, and observes that 89 of them recover more quickly than is typical. Find the null and alternative hypotheses for this test.
The physician's claim is that the proportion of 80% is too big, that is, \(p \lt 0.80\text{.}\) This does not involve equality, and should therefore be the alternative hypothesis. This gives us a left-tailed test with the following hypotheses.
\begin{align*} H_0\amp:\ p \geq 0.80\\ H_A\amp:\ p \lt 0.80 \end{align*}
Example 5.3.3. Stating Hypotheses for a Two-Tailed Test.
A government official claims that 50% of cars on the road have under-inflated tires, and are therefore getting less-than-optimal gas mileage. To test this claim, a random sample of 600 cars is selected and 279 of them are found to have under-inflated tires. Find the null and alternative hypotheses for this test.
The official's claim is that the proportion of cars on the road with under-inflated tires is exactly 50%. This involves equality, and is therefore the null hypothesis. The following are the hypotheses for this two-tailed test.
\begin{align*} H_0\amp:\ p = 0.50\\ H_A\amp:\ p \not= 0.50 \end{align*}
Example 5.3.4. Stating Hypotheses for Another Left-Tailed Test.
A scout leader claims that being a Boy Scout promotes character development and respect for the law. In fact, he claims that fewer than 10% of boys who spend at least one year in a Boy Scout troop will get into trouble with the law before their 20th birthday. To test this claim, you take a random sample of 80 names from the rosters of Boy Scout troops fifteen years in the past and find that 11 of these former Boy Scouts have had trouble with the law. Find the null and alternative hypotheses for this test.
Finally, the scout leader claims that the proportion is less than 0.10. This is an alternative hypothesis since it does not involve equality. Therefore, we get a left-tailed test with hypotheses:
\begin{align*} H_0\amp:\ p \geq 0.10\\ H_A\amp:\ p \lt 0.10 \end{align*}
Checkpoint 5.3.7.
A self-proclaimed psychic claims that he can predict the flipping of a coin. To test his claim you flip a coin 120 times and have him attempt to predict the outcome. He successfully predicts 71 of the flips. Based on this sample, you perform a hypothesis test.
Question: what should your null hypothesis be?
\(p = 0.5\)
Checkpoint 5.3.8.
A politician claims that a majority of voters support his stance on a certain issue. To verify this claim, he has his staff contact 500 voters and finds that 258 of them support his position.
Question: what is the null hypothesis in this hypothesis test?
\(p \leq 0.50\)
Subsection 5.3.2 Computing the Test Statistic
When testing a claim about a population proportion, the test statistic measures how unusual the observed sample is if the null hypothesis is true. The test statistic is really just a z-score for the sample proportion, computed under the assumption that \(p\) is as indicated in the null hypothesis. Below we remind you of this formula in the context of a hypothesis test.
Theorem 5.3.9. Test Statistic for a Single Sample Proportion.
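The test statistic for a sample proportion \(\hat p\) used to test the assumption of the null hypothesis that \(p = p_0\) is:
\begin{equation*} z_\text{test} = \frac{\hat p - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}} \end{equation*}
where \(n\) is the sample size.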
When computing a test statistic, the null hypothesis must give us one value for \(p\text{.}\) In the case of a two-tailed test, the null hypothesis that \(p = p_0\) does just that. In a left- or right-tailed test, we use the “worst-case” value of \(p_0\) from the null hypothesis. That is, even if we have:
\(H_0:\ p \geq p_0\text{,}\) or
\(H_0:\ p \leq p_0\)
we will use \(p = p_0\) in computing our test statistic. Examples of this can be found as we continue working on the problems from the beginning of this section.
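As a minimal sketch of this computation, the following Python function evaluates the z-score of the sample proportion using the boundary value \(p_0\) from the null hypothesis. The function name `z_test_statistic` and the dictionary of scenarios are our own illustrative choices; the sample counts come from the three situations introduced at the start of this section.

```python
import math


def z_test_statistic(successes: int, n: int, p0: float) -> float:
    """z-score of the sample proportion, assuming p = p0 as in the null hypothesis."""
    p_hat = successes / n                           # sample proportion
    standard_error = math.sqrt(p0 * (1 - p0) / n)   # uses p0, not p_hat
    return (p_hat - p0) / standard_error


# (successes, sample size, p0) for the three scenarios in this section.
scenarios = {
    "cold drug": (89, 115, 0.80),
    "under-inflated tires": (279, 600, 0.50),
    "scouts": (11, 80, 0.10),
}
for name, (x, n, p0) in scenarios.items():
    print(f"{name}: z = {z_test_statistic(x, n, p0):.2f}")
```

Note that the standard error is built from \(p_0\) rather than \(\hat p\text{,}\) because the test statistic is computed as if the null hypothesis were true.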
Example 5.3.10. Computing the Test Statistic for a Left-Tailed Test.
A pharmaceutical company has developed a new drug for the common cold. They claim that this drug shortens the duration of the cold in 80% of individuals. A physician believes that this proportion is inflated. He finds 115 individuals who have just developed a cold, administers the drug, and observes that 89 of them recover more quickly than is typical. Find the test statistic for this sample.
Recall that the null and alternative hypotheses were:
\begin{align*} H_0\amp:\ p \geq 0.80\\ H_A\amp:\ p \lt 0.80 \end{align*}
Using the assumption that \(p = 0.80\) from the null hypothesis, we compute the test statistic for our sample as follows.
\begin{equation*} z_\text{test} = \frac{\hat p - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}} = \frac{\frac{89}{115} - 0.80}{\sqrt{\dfrac{0.80(0.20)}{115}}} \approx \frac{-0.0261}{0.0373} \approx -0.70 \end{equation*}
Example 5.3.11. Computing the Test Statistic for a Two-Tailed Test.
A government official claims that 50% of cars on the road have under-inflated tires, and are therefore getting less-than-optimal gas mileage. To test this claim, a random sample of 600 cars is selected and 279 of them are found to have under-inflated tires. Find the test statistic for this sample.
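In a previous example, we found the null and alternative hypotheses to be:
\begin{align*} H_0\amp:\ p = 0.50\\ H_A\amp:\ p \not= 0.50 \end{align*}
Under the assumption that \(p = 0.50\text{,}\) we compute the test statistic for this sample as follows.
\begin{equation*} z_\text{test} = \frac{\hat p - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}} = \frac{\frac{279}{600} - 0.50}{\sqrt{\dfrac{0.50(0.50)}{600}}} \approx \frac{-0.0350}{0.02041} \approx -1.71 \end{equation*}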
Example 5.3.12. Computing the Test Statistic for Another Left-Tailed Test.
A scout leader claims that being a Boy Scout promotes character development and respect for the law. In fact, he claims that fewer than 10% of boys who spend at least one year in a Boy Scout troop will get into trouble with the law before their 20th birthday. To test this claim, you take a random sample of 80 names from the rosters of Boy Scout troops fifteen years in the past and find that 11 of these former Boy Scouts have had trouble with the law. Find the test statistic for this sample.
Previously, we found the following hypotheses for this situation.
\begin{align*} H_0\amp:\ p \geq 0.10\\ H_A\amp:\ p \lt 0.10 \end{align*}
Under the null hypothesis assumption that \(p = 0.10\text{,}\) the test statistic is as shown below.
\begin{equation*} z_\text{test} = \frac{\hat p - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}} = \frac{\frac{11}{80} - 0.10}{\sqrt{\dfrac{0.10(0.90)}{80}}} \approx \frac{0.0375}{0.03354} \approx 1.12 \end{equation*}
Checkpoint 5.3.15.
A self-proclaimed psychic claims that he can predict the flipping of a coin. To test his claim you flip a coin 120 times and have him attempt to predict the outcome. He successfully predicts 71 of the flips. Based on this sample, you perform a hypothesis test.
Question: what is the test statistic for this hypothesis test?
2.01
Checkpoint 5.3.16.
A politician claims that a majority of voters support his stance on a certain issue. To verify this claim, he has his staff contact 500 voters and finds that 258 of them support his position.
Question: what is the test statistic for this hypothesis test?
0.72
Subsection 5.3.3 The Traditional Test
To conduct a traditional hypothesis test and draw conclusions, we must complete the following steps.
1. State the null and alternative hypotheses (done).
2. Compute the test statistic (done).
3. Find the rejection region and its critical value(s).
4. Compare the test statistic with the critical value(s) to reach your conclusion.
We have already completed steps 1 and 2. Thus, we have only to finish steps 3 and 4 to conduct a traditional hypothesis test in each of our examples.
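The four steps above can be sketched in Python as follows. This is an illustrative sketch rather than a prescribed method: the function name `traditional_test` and the `tail` labels are our own assumptions, and the critical values are taken from the standard normal distribution via `statistics.NormalDist` in the standard library instead of a printed table.

```python
import math
from statistics import NormalDist


def traditional_test(successes: int, n: int, p0: float, alpha: float, tail: str):
    """Return (test statistic, critical values, reject H0?).

    tail is 'left', 'right', or 'two' (labels chosen for this sketch).
    """
    # Step 2: test statistic, computed under the assumption p = p0.
    z = (successes / n - p0) / math.sqrt(p0 * (1 - p0) / n)

    # Step 3: critical value(s) bounding the rejection region.
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        critical = (standard_normal.inv_cdf(alpha),)
        reject = z < critical[0]
    elif tail == "right":
        critical = (standard_normal.inv_cdf(1 - alpha),)
        reject = z > critical[0]
    elif tail == "two":
        upper = standard_normal.inv_cdf(1 - alpha / 2)  # alpha split across both tails
        critical = (-upper, upper)
        reject = abs(z) > upper
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")

    # Step 4: the decision is whether z falls in the rejection region.
    return z, critical, reject


z, critical, reject = traditional_test(89, 115, 0.80, alpha=0.05, tail="left")
print(f"z = {z:.2f}, critical value = {critical[0]:.3f}, reject H0: {reject}")
```

Step 1, choosing the hypotheses and hence the tail, is still done by hand here and passed in through the `tail` argument.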
Example 5.3.17. Conducting a Left-Tailed Traditional Hypothesis Test.
A pharmaceutical company has developed a new drug for the common cold. They claim that this drug shortens the duration of the cold in 80% of individuals. A physician believes that this proportion is inflated. He finds 115 individuals who have just developed a cold, administers the drug, and observes that 89 of them recover more quickly than is typical. Conduct a traditional hypothesis test at the \(\alpha = 0.05\) significance level.
Recall that the null and alternative hypotheses were:
\begin{align*} H_0\amp:\ p \geq 0.80\\ H_A\amp:\ p \lt 0.80 \end{align*}
We computed the test statistic as follows:
\begin{equation*} z_\text{test} = \frac{\frac{89}{115} - 0.80}{\sqrt{\dfrac{0.80(0.20)}{115}}} \approx -0.70 \end{equation*}
Now we must identify the rejection region and the critical value. Since the alternative hypothesis involves \(\lt\text{,}\) this is a left-tailed test with the entire significance level of \(\alpha = 0.05\) in that left tail. This gives a critical value of \(z_\alpha = -1.645\text{,}\) so the rejection region consists of all values \(z \lt -1.645\text{.}\)
Because our test statistic of \(-0.70\) is not less than the critical value \(-1.645\text{,}\) it is not in the rejection region. We therefore fail to reject the null hypothesis. There is no statistically significant evidence that the pharmaceutical company is inflating the claims about this drug's effectiveness.
Example 5.3.19. Conducting a Two-Tailed Traditional Hypothesis Test.
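A government official claims that 50% of cars on the road have under-inflated tires, and are therefore getting less-than-optimal gas mileage. To test this claim, a random sample of 600 cars is selected and 279 of them are found to have under-inflated tires. Use a traditional hypothesis test at the \(\alpha = 0.10\) significance level to test the government official's claim.
The hypotheses for this test were:
\begin{align*} H_0\amp:\ p = 0.50\\ H_A\amp:\ p \not= 0.50 \end{align*}
And we found that the test statistic was:
\begin{equation*} z_\text{test} = \frac{\frac{279}{600} - 0.50}{\sqrt{\dfrac{0.50(0.50)}{600}}} \approx -1.71 \end{equation*}
Because this is a two-tailed test, the significance level \(\alpha = 0.10\) is split evenly between the two tails, with 0.05 in each. The rejection region is separated from the body of the standard normal distribution by the critical values \(-1.645\) and \(+1.645\text{;}\) it consists of all values \(z \lt -1.645\) or \(z \gt 1.645\text{.}\)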
Since our test statistic of \(z_\text{test} = -1.71\) is in one of these rejection regions, we must reject the null hypothesis. There is evidence tending towards significance that the true proportion of cars on the road with under-inflated tires is different from 0.50.
Example 5.3.21. Conducting Another Left-Tailed Traditional Hypothesis Test.
A scout leader claims that being a Boy Scout promotes character development and respect for the law. In fact, he claims that fewer than 10% of boys who spend at least one year in a Boy Scout troop will get into trouble with the law before their 20th birthday. To test this claim, you take a random sample of 80 names from the rosters of Boy Scout troops fifteen years in the past and find that 11 of these former Boy Scouts have had trouble with the law. Conduct a traditional hypothesis test to test this claim at the \(\alpha = 0.10\) significance level.
Recall that the hypotheses are:
\begin{align*} H_0\amp:\ p \geq 0.10\\ H_A\amp:\ p \lt 0.10 \end{align*}
The test statistic under the null hypothesis above is:
\begin{equation*} z_\text{test} = \frac{\frac{11}{80} - 0.10}{\sqrt{\dfrac{0.10(0.90)}{80}}} \approx 1.12 \end{equation*}
Because the alternative hypothesis involves \(\lt\text{,}\) this is a left-tailed test with the entire \(\alpha = 0.10\) in the left tail. The corresponding critical value is \(-1.28\text{,}\) so the rejection region consists of all values \(z \lt -1.28\text{.}\)
Our test statistic of 1.12 is clearly not in the rejection region. Therefore we fail to reject the null hypothesis. There is not even evidence tending towards significance that fewer than 10% of Boy Scouts have trouble with the law before their 20th birthday.
Checkpoint 5.3.25.
A self-proclaimed psychic claims that he can predict the flipping of a coin. To test his claim you flip a coin 120 times and have him attempt to predict the outcome. He successfully predicts 71 of the flips. Based on this sample, you perform a hypothesis test.
Question: what decision do you make at the \(\alpha = 0.05\) significance level? Use a traditional test.
Reject the Null Hypothesis
Checkpoint 5.3.26.
A politician claims that a majority of voters support his stance on a certain issue. To verify this claim, he has his staff contact 500 voters and finds that 258 of them support his position.
Question: what decision do you make at the \(\alpha = 0.10\) significance level? Use a traditional test.
Fail to Reject the Null Hypothesis
Subsection 5.3.4 The p-Value Test
To conduct a p-value test, recall that we must change the last two steps of the hypothesis testing process. This altered process is as follows.
1. State the null and alternative hypotheses (done).
2. Compute the test statistic (done).
3. Find the p-value for this test statistic.
4. Compare the p-value with the significance level to reach your conclusion.
Using Example 5.3.17, Example 5.3.19, and Example 5.3.21, we will repeat the hypothesis tests using the p-value approach.
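Steps 3 and 4 of the p-value approach can be sketched as a small Python function. The name `p_value` and the `tail` labels are our own assumptions for illustration, and the normal probabilities come from `statistics.NormalDist` in the standard library rather than a table. The z-scores passed in below are the rounded test statistics for the three running scenarios.

```python
from statistics import NormalDist


def p_value(z: float, tail: str) -> float:
    """p-value of a z test statistic; tail is 'left', 'right', or 'two'."""
    cdf = NormalDist().cdf  # standard normal cumulative probability P(Z < z)
    if tail == "left":
        return cdf(z)                # area to the left of z
    if tail == "right":
        return 1 - cdf(z)            # area to the right of z
    if tail == "two":
        return 2 * cdf(-abs(z))      # twice the area in the tail beyond |z|
    raise ValueError("tail must be 'left', 'right', or 'two'")


# (rounded test statistic, tail, significance level) for each scenario.
tests = {
    "cold drug": (-0.70, "left", 0.05),
    "under-inflated tires": (-1.71, "two", 0.10),
    "scouts": (1.12, "left", 0.01),
}
for name, (z, tail, alpha) in tests.items():
    p = p_value(z, tail)
    decision = "reject H0" if p < alpha else "fail to reject H0"
    print(f"{name}: p-value = {p:.4f}, alpha = {alpha}, {decision}")
```

Software values may differ slightly from table-based values when an unrounded test statistic is used instead of one rounded to two decimal places.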
Example 5.3.27. Conducting a Left-Tailed p-Value Hypothesis Test.
A pharmaceutical company has developed a new drug for the common cold. They claim that this drug shortens the duration of the cold in 80% of individuals. A physician believes that this proportion is inflated. He finds 115 individuals who have just developed a cold, administers the drug, and observes that 89 of them recover more quickly than is typical. Conduct a p-value hypothesis test at the \(\alpha = 0.05\) significance level.
Recall that the null and alternative hypotheses were:
\begin{align*} H_0\amp:\ p \geq 0.80\\ H_A\amp:\ p \lt 0.80 \end{align*}
We computed the test statistic as follows:
\begin{equation*} z_\text{test} = \frac{\frac{89}{115} - 0.80}{\sqrt{\dfrac{0.80(0.20)}{115}}} \approx -0.70 \end{equation*}
The p-value for our test statistic in this left-tailed test (because the alternative uses \(\lt\)) is the probability of getting a test statistic further into that left tail than \(-0.70\text{.}\)
From the standard normal distribution, this is
\begin{equation*} \text{p-value} = P(Z \lt -0.70) = 0.2420 \end{equation*}
Clearly our p-value is greater than the significance level of \(\alpha = 0.05\) (and in fact would be greater than any commonly used significance level). We therefore fail to reject the null hypothesis. There is no evidence (statistically significant or otherwise) that the pharmaceutical company is inflating the claims about this drug's effectiveness.
Example 5.3.29. Conducting a Two-Tailed p-Value Hypothesis Test.
A government official claims that 50% of cars on the road have under-inflated tires, and are therefore getting less-than-optimal gas mileage. To test this claim, a random sample of 600 cars is selected and 279 of them are found to have under-inflated tires. Use a p-value hypothesis test at the \(\alpha = 0.10\) significance level to test the government official's claim.
The hypotheses for this test were:
\begin{align*} H_0\amp:\ p = 0.50\\ H_A\amp:\ p \not= 0.50 \end{align*}
And we found that the test statistic was:
\begin{equation*} z_\text{test} = \frac{\frac{279}{600} - 0.50}{\sqrt{\dfrac{0.50(0.50)}{600}}} \approx -1.71 \end{equation*}
Since this is a two-tailed test, the p-value of our test statistic is twice the probability of being further into the left tail than \(-1.71\) (the left tail because the test statistic is negative).
Using the standard normal distribution, this gives
\begin{equation*} \text{p-value} = 2 \cdot P(Z \lt -1.71) = 2(0.0436) = 0.0872 \end{equation*}
Because this p-value is less than the significance level \(\alpha = 0.10\text{,}\) we reject the null hypothesis. There is evidence tending towards significance that the true proportion of cars on the road with under-inflated tires is different from 0.50.
Note that in the problem above, while we wound up rejecting the null hypothesis at the 0.10 significance level, we would not have rejected it at the 0.05 or 0.01 significance level. In fact, the p-value of 0.0872 tells us that, if the null hypothesis were true, a sample at least this far from 0.50 would still occur about 8.72% of the time. Typically that is too common an occurrence to count as strong evidence against the null hypothesis.
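To make the meaning of a p-value concrete, the sketch below computes directly, without the normal approximation, how often a sample of 600 cars would land at least as far from 300 as the observed 279 if the true proportion really were 0.50. The function name `exact_two_sided_probability` is our own, and the counting shortcut it uses works only for \(p_0 = 0.5\text{,}\) where every sequence of outcomes is equally likely. The result should land close to, but not exactly on, the p-value obtained from the z-score.

```python
from math import comb


def exact_two_sided_probability(successes: int, n: int) -> float:
    """P(count is at least as far from n/2 as observed), assuming p = 0.5.

    Illustrative helper: valid only for p0 = 0.5.
    """
    distance = abs(successes - n / 2)
    # Count the outcomes whose number of successes is at least as extreme
    # as the observed count, in either direction.
    extreme = sum(comb(n, k) for k in range(n + 1) if abs(k - n / 2) >= distance)
    # With p = 0.5, each of the 2**n outcome sequences is equally likely.
    return extreme / 2 ** n


probability = exact_two_sided_probability(279, 600)
print(f"Chance of a sample this extreme when p = 0.50: {probability:.4f}")
```

This probability describes how surprising the sample is when the null hypothesis is true; it is not the probability that the null hypothesis itself is true or false.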
Example 5.3.31. Conducting Another Left-Tailed p-Value Hypothesis Test.
A scout leader claims that being a Boy Scout promotes character development and respect for the law. In fact, he claims that fewer than 10% of boys who spend at least one year in a Boy Scout troop will get into trouble with the law before their 20th birthday. To test this claim, you take a random sample of 80 names from the rosters of Boy Scout troops fifteen years in the past and find that 11 of these former Boy Scouts have had trouble with the law. Conduct a p-value hypothesis test of this claim at the \(\alpha = 0.01\) significance level.
Recall that the hypotheses are:
\begin{align*} H_0\amp:\ p \geq 0.10\\ H_A\amp:\ p \lt 0.10 \end{align*}
The test statistic under the null hypothesis above is:
\begin{equation*} z_\text{test} = \frac{\frac{11}{80} - 0.10}{\sqrt{\dfrac{0.10(0.90)}{80}}} \approx 1.12 \end{equation*}
Now this is a left-tailed test because the alternative hypothesis involves \(\lt\text{.}\) Therefore, the p-value of our test statistic is the probability of falling to the left of 1.12. Note, however, that 1.12 lies to the right of the mean of 0, so this probability covers more than half of the distribution. This means that not only is there no evidence to support the claim, but the evidence seems to point to the opposite being true, namely that more than 10% get into trouble with the law.
Finishing this test, the p-value is \(P(Z \lt 1.12) = 0.8686\text{.}\) This is clearly much larger than the significance level of \(\alpha = 0.01\text{.}\) We therefore fail to reject the null hypothesis. There is not even evidence tending towards significance that fewer than 10% of Boy Scouts have trouble with the law before their 20th birthday.
Checkpoint 5.3.35.
A self-proclaimed psychic claims that he can predict the flipping of a coin. To test his claim you flip a coin 120 times and have him attempt to predict the outcome. He successfully predicts 71 of the flips. Based on this sample, you perform a hypothesis test, testing the following claims:
Question: what is the p-value of the test statistic for this hypothesis test?
0.0222
Checkpoint 5.3.36.
A politician claims that a majority of voters support his stance on a certain issue. To verify this claim, he has his staff contact 500 voters and finds that 258 of them support his position.
Question: what is the p-value of the test statistic for this problem?
0.2358
Subsection 5.3.5 Type I and Type II Errors
Recall that part of the hypothesis testing process is the possibility of making errors. These errors come in two types, as outlined below.
- Type I Error. This is the error of rejecting a null hypothesis even though it is true.
- Type II Error. This is the error of failing to reject the null hypothesis even though it is in fact false.
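The two definitions above amount to a small decision table, sketched here in Python. The function name `classify_outcome` is our own illustrative choice. In practice we never know for certain whether the null hypothesis is true, so this is a way of reasoning about "what if" scenarios rather than something we could compute from a sample.

```python
def classify_outcome(rejected_null: bool, null_is_true: bool) -> str:
    """Name the result of a test, given the decision and the (hypothetical) truth."""
    if rejected_null and null_is_true:
        return "Type I error"       # rejected a true null hypothesis
    if not rejected_null and not null_is_true:
        return "Type II error"      # failed to reject a false null hypothesis
    return "correct decision"


# We failed to reject H0 for the cold drug; suppose H0 is actually false.
print(classify_outcome(rejected_null=False, null_is_true=False))
# We rejected H0 for the tire claim; suppose H0 is actually true.
print(classify_outcome(rejected_null=True, null_is_true=True))
```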
Let's look back at two of our examples of hypothesis tests for proportions and see what these errors might look like in those cases.
Example 5.3.37. Detecting a Type II Error.
A pharmaceutical company has developed a new drug for the common cold. They claim that this drug shortens the duration of the cold in 80% of individuals. A physician believes that this proportion is inflated. He finds 115 individuals who have just developed a cold, administers the drug, and observes that 89 of them recover more quickly than is typical. If you conduct a hypothesis test at the \(\alpha = 0.05\) significance level using this sample, and in actuality the drug is only effective in 75% of patients, what error will be made?
In both the traditional and p-value test we failed to reject the null hypothesis. If, however, the drug only works in 75% of individuals, then the null hypothesis is false. Failing to reject a false null hypothesis is a type II error.
Example 5.3.38. Detecting a Type I Error.
A government official claims that 50% of cars on the road have under-inflated tires, and are therefore getting less-than-optimal gas mileage. To test this claim, a random sample of 600 cars is selected and 279 of them are found to have under-inflated tires. If a hypothesis test is conducted at the \(\alpha = 0.10\) significance level to test the government official's claim, and the true proportion of cars with under-inflated tires actually is 0.50, what type of error has been made?
For this example, we rejected the null hypothesis. If in fact it is true that \(p = 0.50\text{,}\) then we made a type I error by rejecting a true null hypothesis.
Checkpoint 5.3.41.
A self-proclaimed psychic claims that he can predict the flipping of a coin. To test his claim you flip a coin 120 times and have him attempt to predict the outcome. He successfully predicts 71 of the flips. Based on this sample, you perform a hypothesis test and conclude that the man does have skill in predicting coin flips.
Question: if in fact the probability that the “psychic” correctly predicts a coin flip is 0.5, what type of error have you made?
Type I Error
Checkpoint 5.3.42.
A politician claims that a majority of voters support his stance on a certain issue. To verify this claim, he has his staff contact 500 voters and finds that 258 of them support his position. Based on this sample, you find no evidence that the majority of voters support his position. However, in actuality, 51% support the politician's position.
Question: what type of error did you make?
Type II Error