Section 4.2 Confidence Intervals for a Mean

Estimating a Mean.

We have now seen the basics of constructing a confidence interval. To review, these steps are:

find a point estimate,
find the margin of error, and
add the margin of error to and subtract it from the point estimate.

In this section we will learn how to complete these steps when the parameter we are estimating is the mean, \(\mu\text{,}\) of a single population which is either normal or from which we've taken a large enough sample that we can apply the Central Limit Theorem. As we work through this process, we will give formulas for the point estimate, standard error, and finally the margin of error that are used to construct such a confidence interval. We will also remind you of the assumptions which must be made in order to use a normal distribution and these formulas.

The margin of error computation involves a critical value. Since we will be using a normal distribution in most of our computations, we could look up the value of \(z_{\alpha/2}\) in the table every time we need it. However, knowing the critical values for the more common confidence levels will save us some time. Appendix A.1 contains a quick reference table of these common critical values.
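For readers who prefer software to a table, the following is a minimal sketch of how these critical values could be computed in Python. It assumes Python 3.8 or later, whose standard library provides `statistics.NormalDist`; the helper name `z_critical` is our own and is not part of this text.

```python
from statistics import NormalDist


def z_critical(confidence, two_sided=True):
    """Standard normal critical value for a confidence level in (0, 1).

    two_sided=True gives z_{alpha/2}; two_sided=False gives z_alpha.
    """
    alpha = 1.0 - confidence
    tail = alpha / 2 if two_sided else alpha
    # inv_cdf returns the z-value with the given area to its left.
    return NormalDist().inv_cdf(1.0 - tail)


for level in (0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%}: z = {z_critical(level):.3f}")
```

Software reports unrounded values (about 1.645, 1.960, 2.326, and 2.576 for these four levels), so results can differ slightly from those found with rounded table values such as 2.575 or 2.33.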

Objectives

After finishing this section you should be able to

describe the following terms:
- Confidence Interval for a Mean
- Margin of Error for a Mean
- Point Estimator for a Mean
- Sample Size when Estimating a Mean
accomplish the following tasks:
- Identify the best point estimate for a population mean
- Find the margin of error for a confidence interval for a mean
- Construct a confidence interval for a population mean
- Understand and list the assumptions that must be made to construct this confidence interval
- Compute the minimum sample size necessary for a given margin of error

Subsection 4.2.1 Point Estimate

Since the goal of this section is to understand how to estimate a population mean, it makes sense to start by asking what the best point estimate is for a mean. That is, if we draw a random sample from a population, and we want to compute one number from that sample that best approximates the mean of the population, what number should we use?

Example 4.2.1. Computing a Point Estimate.

In order to estimate the mean weight of a widget, a manufacturing facility randomly selects 10 widgets and finds the following weights.

\begin{equation*} \lbrace 10, 13, 12, 9, 10, 14, 10, 12, 11, 9 \rbrace \end{equation*}

What one number should they use to best approximate the mean weight of a widget?

Solution

This is not a trick question. The answer is very straightforward if you think back to Section 1.3. The best point estimate for a population mean is the mean of the sample. In this case, they should use:

\begin{equation*} \overline{x} = \frac{10+13+12+9+10+14+10+12+11+9}{10} = 11\text{.} \end{equation*}
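The same computation can be checked with a few lines of Python. This is only a sketch using the standard library's `statistics.mean`; the variable names `weights` and `x_bar` are our own.

```python
from statistics import mean

# Widget weights from Example 4.2.1.
weights = [10, 13, 12, 9, 10, 14, 10, 12, 11, 9]

# The point estimate for the population mean is the sample mean.
x_bar = mean(weights)
print(x_bar)
```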

The results of this example are summarized below.

Theorem 4.2.2. Point Estimator for a Mean.

The best point estimator for a mean is the sample mean, \(\overline{x}\text{.}\)

While this may seem obvious, there is some background knowledge that is important. In Section 3.5 we saw that as long as the population is normal, or the sample size is at least 30, the distribution of sample means, \(\overline{X}\text{,}\) will be approximately normal and have as its mean \(\mu_{\overline{x}}\) the population mean \(\mu\text{.}\) At this point, we don't need to know that the sample means have a normal distribution, although that will be important shortly. What is more important is that

\begin{equation*} \mu_{\overline{x}} = \mu\text{.} \end{equation*}

In other words, if we take the average of the sample means \(\overline{x}\) for all possible samples of a set size, that will equal the population mean. So \(\overline{x}\) is an unbiased estimator for \(\mu\text{.}\)

Note that this does not by any means guarantee that \(\overline{x} = \mu\text{.}\) Our sample mean will probably not equal the actual population mean. But it should be consistently close. And if we were to take many different samples, the average of all of their means should be extremely close to the population mean which we are trying to estimate.
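To see this behavior numerically, here is a small simulation sketch. The population is hypothetical (10,000 values generated from a normal distribution with mean 50 and standard deviation 10), and the sample size of 25 and the 2,000 repetitions are arbitrary choices of ours. It draws many samples rather than all possible samples, so it illustrates the idea without proving it.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population of 10,000 values (not data from the text).
population = [random.gauss(50, 10) for _ in range(10_000)]
mu = mean(population)

# Draw 2,000 random samples of size 25 and record each sample mean.
sample_means = [mean(random.sample(population, 25)) for _ in range(2_000)]

print("population mean:        ", round(mu, 3))
print("one sample mean:        ", round(sample_means[0], 3))
print("average of sample means:", round(mean(sample_means), 3))
```

A single sample mean typically misses \(\mu\) by a noticeable amount, while the average of the sample means lands much closer to it.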

Figure 4.2.3. Point Estimates for a Mean I

Figure 4.2.4. Point Estimates for a Mean II

Checkpoint 4.2.5.

To estimate the mean chewing speed of a rabbit, which you believe to be normally distributed, you take a sample of 5 rabbits and find that their chewing speeds are: 3, 4.2, 2.5, 3.3, and 2.7 bites per second.

Question: what one value should you use as a point estimate for \(\mu\text{,}\) the mean chewing speed of rabbits? Round your answer to two decimal places.

Answer

3.14

Checkpoint 4.2.6.

In order to estimate the average number of fumbles in an NFL football game, you watch three different games on Sunday and find that there are 3, 5, and 4 fumbles in those games.

Question: what one number should be used as a point estimate for the mean number of fumbles in an NFL game?

Answer

4

Subsection 4.2.2 Margin of Error

Now that we know the best point estimator for a population mean is the sample mean, we have the first piece of our confidence interval. But we need to know how much error we should expect when using this approximation. That is, we need to know how to compute the margin of error. Remember that in general,

\begin{equation*} \text{ ME } = z_{\alpha/2} \times \text{ SE } \end{equation*}

where SE, the standard error, is the standard deviation of the sampling distribution. From Section 3.5, we know that if the distribution of \(\overline{x}\) is normal, then its standard deviation is \(\frac{\sigma}{\sqrt{n}}\text{.}\) This yields the following.

Theorem 4.2.7. Margin of Error for a Mean.

When approximating a single population mean \(\mu\) with a sample mean \(\overline x\text{,}\) the margin of error is:

\begin{equation*} z_{\alpha/2} \times \left(\frac{\sigma}{\sqrt{n}}\right)\text{.} \end{equation*}

There are several things to note about this margin of error computation.

The distribution of the sample means must be normal. That means either the original population must have a normal distribution or we must have a sample size of at least 30 so that we can apply the central limit theorem.
We must know the value of \(\sigma\text{,}\) the population standard deviation. If we are trying to estimate the population mean, it seems unreasonable that we would already know \(\sigma\text{.}\) However, as long as our sample size is at least 30, we can use the sample standard deviation \(s\) as an approximation for \(\sigma\text{.}\)

With these notes in mind, let's look at several examples.

Example 4.2.8. Finding the Margin of Error for a 95% Confidence Interval.

In order to estimate the average number of Skittles in a snack-sized bag, you collect a sample of 50 bags and find that the average of your sample is \(\overline{x} = 14.3\) candies with a standard deviation of \(s = 1.65\) candies. What do you estimate the average number of candies is in the population of all snack-sized Skittles bags, and what is the 95% margin of error for this estimate?

Solution

The best point estimate for \(\mu\) is \(\overline{x}\text{.}\) We therefore estimate that there are 14.3 candies in each bag. How good is this estimate? Recalling that the critical value from a normal distribution at the 95% confidence level is \(z_{\alpha/2} = \pm 1.96\text{,}\) we get the following margin of error.

\begin{equation*} z_{\alpha/2} \times \left(\frac{\sigma}{\sqrt{n}}\right) = \pm 1.96\left(\frac{1.65}{\sqrt{50}}\right) \approx \pm 0.457 \end{equation*}

Therefore the margin of error in our estimate is \(\pm 0.457\) candies.
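The margin of error formula is easy to wrap in a small function. The sketch below assumes Python 3.8 or later for `statistics.NormalDist`; the function name `margin_of_error` and its parameter names are hypothetical, and the sample standard deviation \(s\) stands in for \(\sigma\) as in the example above.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sd, n, confidence):
    """Two-sided margin of error z_{alpha/2} * sd / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z_{alpha/2}
    return z * sd / sqrt(n)


# Skittles example: s = 1.65, n = 50, 95% confidence.
me = margin_of_error(1.65, 50, 0.95)
print(round(me, 3))  # 0.457
```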

Example 4.2.9. Finding the Margin of Error for a 99% Confidence Interval.

A biologist wishes to estimate the average length of a certain type of fish in a secluded lake. She knows that the length of this species of fish is normally distributed with a standard deviation of \(\sigma = 2.7\) inches. After collecting a sample of 8 randomly selected fish, she finds that the sample mean is \(\overline{x} = 14.1\) inches. She therefore claims that the population of fish in this lake has an average length of 14.1 inches. If we wish to be 99% confident, what is the margin of error in her estimation?

Solution

Using the formula for the margin of error with \(\alpha = 0.01\text{,}\) so that \(z_{\alpha/2} = 2.575\text{,}\) we get:

\begin{equation*} z_{\alpha/2}\times \left(\frac{\sigma}{\sqrt{n}}\right) = \pm 2.575\left(\frac{2.7}{\sqrt{8}}\right) \approx \pm 2.458\text{.} \end{equation*}

Therefore, the estimated length of 14.1 inches has a margin of error of \(\pm 2.458\) inches.
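Because tables round the critical value, a result computed by software may differ in the last digit. The following sketch compares the table value 2.575 used above with the unrounded value from Python's `statistics.NormalDist` (Python 3.8 or later assumed); the variable names are our own.

```python
from math import sqrt
from statistics import NormalDist

# Fish example: sigma = 2.7 inches, n = 8, 99% confidence.
sigma, n = 2.7, 8
standard_error = sigma / sqrt(n)

z_table = 2.575                               # value read from a table
z_exact = NormalDist().inv_cdf(1 - 0.01 / 2)  # unrounded z_{alpha/2}

me_table = z_table * standard_error
me_exact = z_exact * standard_error
print(round(me_table, 3), round(me_exact, 3))
```

The two margins of error agree to within about a thousandth of an inch, so either critical value is fine for this problem.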

Figure 4.2.10. Margin of Error I

Figure 4.2.11. Margin of Error II

Checkpoint 4.2.12.

In order to estimate the mean length of walrus tusks, a sample of 35 tusks is measured and their mean is found to be \(\overline{x} = 29.5\) inches with a standard deviation of \(s = 3.6\) inches. The sample mean 29.5 is used as a point estimate for the population mean \(\mu\text{.}\)

Question: what is the margin of error at the 95% confidence level for this point estimate? Round your answer to three decimal places.

Answer

\(\pm 1.193\)

Checkpoint 4.2.13.

Scores on a certain standardized test are normally distributed with a standard deviation of \(\sigma = 121.4\) points. To estimate the mean score on the test, a sample of 100 scores is collected. The average of these 100 scores is found to be 1351.5.

Question: if we use \(\overline{x} = 1351.5\) as a point estimate for \(\mu\text{,}\) what is the margin of error at the 99% confidence level? Round your answer to three decimal places.

Answer

31.261

Subsection 4.2.3 Confidence Interval

By simply adding the margin of error to the point estimate and subtracting it from the point estimate, we can construct our confidence interval. For a mean, this is done as follows.

Theorem 4.2.14. Confidence Interval for a Mean.

The \((1-\alpha)100\%\) confidence interval for a single population mean \(\mu\) is given by:

\begin{equation*} \overline{x} \pm z_{\alpha/2}\times \left(\frac{\sigma}{\sqrt{n}}\right)\text{.} \end{equation*}

Consider the following examples.

Example 4.2.15. Constructing Confidence Intervals for a Mean.

To estimate the average weight of a Taco Bell bean burrito, you collect a sample of 40 bean burritos. You find that the sample mean is \(\overline{x} = 7.4\) oz with a standard deviation of \(s = 1.42\) oz. Construct the following.

A 95% confidence interval for the mean weight of a Taco Bell bean burrito
A 99% confidence interval for the mean weight of a Taco Bell bean burrito

Solution

The process will be almost identical for these two intervals. Only the critical values used will differ.

For the 95% interval, we use \(z_{\alpha/2} = \pm 1.96\) and get

\begin{equation*} \overline{x} \pm z_{\alpha/2}\times \left(\frac{\sigma}{\sqrt{n}}\right) = 7.4 \pm 1.96\left(\frac{1.42}{\sqrt{40}}\right) = 7.4 \pm 0.440\text{.} \end{equation*}

Therefore, the 95% confidence interval for \(\mu\text{,}\) the mean weight of a Taco Bell bean burrito, is:

\begin{equation*} 6.960 \lt \mu \lt 7.840\text{.} \end{equation*}

Now for a 99% confidence interval, the only thing we change is the value of \(z_{\alpha/2}\text{.}\)

\begin{equation*} \overline{x} \pm z_{\alpha/2}\times \left(\frac{\sigma}{\sqrt{n}}\right) = 7.4 \pm 2.58\left(\frac{1.42}{\sqrt{40}}\right) = 7.4 \pm 0.579\text{.} \end{equation*}

Therefore, the 99% confidence interval for \(\mu\text{,}\) the mean weight of a Taco Bell bean burrito, is:

\begin{equation*} 6.821 \lt \mu \lt 7.979\text{.} \end{equation*}
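As a cross-check, here is a minimal Python sketch that builds both intervals from the same summary statistics. The function name `mean_confidence_interval` is hypothetical, and the code uses unrounded critical values from `statistics.NormalDist` (Python 3.8 or later assumed), so the 99% bounds can differ from the ones above by about 0.001.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(x_bar, sd, n, confidence):
    """Return (lower, upper) bounds of x_bar +/- z_{alpha/2} * sd / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    me = z * sd / sqrt(n)
    return x_bar - me, x_bar + me


# Burrito example: x_bar = 7.4 oz, s = 1.42 oz, n = 40.
low95, high95 = mean_confidence_interval(7.4, 1.42, 40, 0.95)
low99, high99 = mean_confidence_interval(7.4, 1.42, 40, 0.99)

print(f"95%: {low95:.3f} < mu < {high95:.3f}")
print(f"99%: {low99:.3f} < mu < {high99:.3f}")
```

Both intervals are centered at the same point estimate; only the width changes with the confidence level.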

In the above example, note what happened when we increased our confidence level. If we only want to be 95% confident that our confidence interval will contain the true mean weight of a burrito, we get a smaller interval than if we want to be 99% confident. Whenever we increase the confidence level, this will also increase the width of our confidence intervals. At the extremes, we can be 0% confident, in which case our interval is just the point estimate \(\overline{x}\text{.}\) We can also be 100% confident, in which case our confidence interval extends infinitely in both directions. Neither of these is practical, so we settle for “statistically significant” confidence levels such as 95% and 99%.

We can also construct an upper or lower confidence bound. These one-sided confidence intervals work in much the same way as the two-sided confidence interval above, except that we use \(z_\alpha\) instead of \(z_{\alpha/2}\) for our critical value.

Example 4.2.16. Constructing a One-Sided Confidence Interval.

You wish to place an upper bound on the average waiting time in line at McDonald's. To do this, you take a random sample of 70 different waiting times for customers at McDonald's. You find that the average waiting time in the sample is \(\overline{x} = 3.4\) minutes with a standard deviation of \(s = 1.97\) minutes. With 99% confidence, what is the largest that the average waiting time could be?

Solution

In this setting, we really don't care how short the waiting time is. People are, after all, happier if they get through the line more quickly. Instead, we want to place an upper bound on the waiting time. We want to be able to say that the average waiting time is “no more than” a certain amount.

To do this, we use the formula for a 99% upper confidence bound.

\begin{equation*} \overline{x} + z_\alpha\times \left(\frac{\sigma}{\sqrt{n}}\right) = 3.4 + 2.33\left(\frac{1.97}{\sqrt{70}}\right) = 3.948 \end{equation*}

Therefore, the upper confidence bound for the average waiting time at the 99% confidence level is 3.948 minutes.
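A one-sided bound places all of \(\alpha\) in a single tail, which changes only the critical value. The sketch below shows one way this could be coded; the function name `upper_confidence_bound` is our own, and the critical value is the unrounded \(z_\alpha \approx 2.326\) rather than the table value 2.33.

```python
from math import sqrt
from statistics import NormalDist


def upper_confidence_bound(x_bar, sd, n, confidence):
    """Upper bound x_bar + z_alpha * sd / sqrt(n) for a one-sided interval."""
    # All of alpha sits in the upper tail, so the area to the left is `confidence`.
    z = NormalDist().inv_cdf(confidence)
    return x_bar + z * sd / sqrt(n)


# Waiting-time example: x_bar = 3.4, s = 1.97, n = 70, 99% confidence.
bound = upper_confidence_bound(3.4, 1.97, 70, 0.99)
print(round(bound, 3))  # 3.948
```

A lower confidence bound would be computed the same way with the margin subtracted instead of added.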

Figure 4.2.17. Confidence Intervals

Figure 4.2.18. Confidence Intervals

Figure 4.2.19. Confidence Intervals

Checkpoint 4.2.20.

A band wishes to estimate the average number of fans at their concerts using a 98% confidence interval. They count fans at a sample of 35 different concerts and find that the sample mean is \(\overline{x} = 83.4\) fans with a standard deviation of \(s = 17.85\) fans.

Question: What is the confidence interval for the true mean number of fans at a concert? Round your confidence bounds to two decimal places.

Answer

\(76.37 \lt \mu \lt 90.43\)

Checkpoint 4.2.21.

The weight of a fully loaded hay truck follows a normal distribution with standard deviation \(\sigma = 29.4\) pounds. To estimate the mean weight, 47 different hay trucks are weighed and the sample mean for these trucks is found to be \(\overline{x} = 4973.2\) pounds. You wish to construct a 95% confidence interval.

Question: what is this interval, with the bounds rounded to two decimal places?

Answer

\(4964.79 \lt \mu \lt 4981.61\)

Subsection 4.2.4 Sample Size

In many instances we may want to design our sample to get a particular width for our confidence interval. That is, we decide ahead of time what we want the margin of error to be, and then we figure out how big our sample needs to be to get us that margin of error. To do this, we need to work with the formula for the margin of error. Suppose we want our margin of error to be no more than some upper bound \(E\text{.}\) That is,

\begin{equation*} z_{\alpha/2}\times \left(\frac{\sigma}{\sqrt{n}}\right) \leq E\text{.} \end{equation*}

To solve this inequality for \(n\text{,}\) the sample size, we start by squaring both sides. This gives:

\begin{equation*} \frac{(z_{\alpha/2})^2\sigma^2}{n} \leq E^2\text{.} \end{equation*}

Then multiplying both sides by \(n\) and dividing by \(E^2\) gives us the final inequality:

\begin{equation*} n \geq \frac{(z_{\alpha/2})^2\sigma^2}{E^2}\text{.} \end{equation*}

Again, we don't usually know the value of \(\sigma\) before we take a sample. To get around this, we can either use a value of \(\sigma\) from some previous study, or we can estimate the standard deviation.

Theorem 4.2.22. Sample Size when Estimating a Mean.

To get a maximum margin of error of \(E\text{,}\) at the \((1-\alpha)100\%\) confidence level, we must take a sample of size \(n\) where:

\begin{equation*} n \geq \frac{(z_{\alpha/2})^2\sigma^2}{E^2}\text{.} \end{equation*}

To see this in action, consider the following example.

Example 4.2.23. Finding the Sample Size for a Given Margin of Error.

You wish to construct a 95% confidence interval for the mean height of a population of rose bushes. You want your margin of error to be no more than \(\pm 6\) inches. If you know that the maximum height obtained by a rose bush is 73 inches, and the minimum is 8 inches, how many rose bushes should you include in your sample?

Solution

Since we are told the range of rose bush heights, we will use the range rule of thumb, which takes one quarter of the range as the approximation

\begin{equation*} \sigma \approx \frac{73 - 8}{4} = \frac{65}{4} = 16.25 \end{equation*}

for the standard deviation. Plugging this, together with the critical value \(z_{\alpha/2} = 1.96\text{,}\) into the inequality, we get:

\begin{equation*} n \geq \frac{(1.96)^2(16.25)^2}{(6)^2} \approx 28.18\text{.} \end{equation*}

Since we cannot sample a fraction of a rose bush, we round up to get \(n = 29\text{.}\) Note that we should always round up, even though the decimal is not \(0.5\) or greater. If we were to round down, we would get a margin of error that is larger than the \(\pm 6\) that was specified.
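The sample size inequality, including the rounding up, can be sketched in Python as follows. The function name `min_sample_size` is hypothetical, the standard deviation is the range-based estimate from this example, and `statistics.NormalDist` requires Python 3.8 or later.

```python
from math import ceil, sqrt
from statistics import NormalDist


def min_sample_size(sd, max_error, confidence):
    """Smallest n with z_{alpha/2} * sd / sqrt(n) <= max_error."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Always round up: rounding down would exceed the allowed margin of error.
    return ceil((z * sd / max_error) ** 2)


# Rose bush example: sd estimated as one quarter of the range, E = 6 inches.
sd_estimate = (73 - 8) / 4
n = min_sample_size(sd_estimate, 6, 0.95)
print(sd_estimate, n)  # 16.25 29

# Check: the margin of error is within 6 at n = 29 but not at n = 28.
z = NormalDist().inv_cdf(0.975)
print(round(z * sd_estimate / sqrt(n), 3), round(z * sd_estimate / sqrt(n - 1), 3))
```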

In the above example, it would probably be a good idea to use \(n = 30\) as the sample size, since then we can apply the central limit theorem and treat this as a normal distribution. We can always use a larger sample than this formula indicates. That will only lead to better results.

Figure 4.2.24. Sample Size I

Figure 4.2.25. Sample Size II

Checkpoint 4.2.26.

You wish to estimate the mean length of a certain breed of caterpillar using a 98% confidence interval. A previous study has shown that these caterpillars have a normally distributed length with standard deviation \(\sigma = 14\) mm.

Question: what is the smallest sample size you can use to get a margin of error of less than 1 mm?

Answer

1065

Checkpoint 4.2.27.

You wish to estimate the average number of cars in your local Walmart parking lot at noon on a weekday using a 95% confidence interval. You decide to assume that the number of cars has a normal distribution, and estimate \(\sigma\) using the maximum and minimum number of cars that you have observed, which were 194 and 82 respectively.

Question: how many days must you observe in order to get a margin of error of less than 5 cars?

Answer

121