Section 3.5 Sampling Distributions and the Central Limit Theorem

What Are Sampling Distributions?

Starting with the next section, we will look at how to make inferences (i.e., estimates or decisions) about a population based on what we observe in a sample. It is therefore important to understand how sampling works and, in particular, how sample means and sample proportions behave. Recall that the mean of a sample containing values \(x_1, x_2, \ldots, x_n\) is computed as follows.

\begin{equation*} \overline{x} = \frac{x_1+x_2+\cdots+ x_n}{n} \end{equation*}

This formula should look familiar from Subsection 1.3.4. We restate it here, however, to consider a question. If we repeatedly select samples of \(n\) values and compute the average of each one, will we see more or less variation among these averages than we would among the individual values in the population? We shall see that there is in fact less variation among the means of different samples than among individual values. This is because it is much harder to get a group of \(n\) individuals who all differ from the population average in the same direction than it is to get a single individual who differs from the population average.
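The claim about variation can be explored with a small simulation. The sketch below is only an illustration under stated assumptions: the population (10,000 integers between 0 and 100), the sample size of 25, and the number of repetitions are all made up for the demonstration and do not come from this section.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population (an assumption): 10,000 integers from 0 to 100.
population = [random.randint(0, 100) for _ in range(10_000)]

n = 25           # sample size, chosen only for illustration
repeats = 2_000  # how many samples we draw

# Draw a sample without replacement, record its mean, and repeat.
sample_means = []
for _ in range(repeats):
    sample = random.sample(population, n)
    sample_means.append(sum(sample) / n)

# Compare the spread of individual values with the spread of sample means.
spread_individuals = statistics.pstdev(population)
spread_means = statistics.pstdev(sample_means)

print(f"std. dev. of individual values: {spread_individuals:.2f}")
print(f"std. dev. of sample means:      {spread_means:.2f}")
```

With these settings the second number should come out noticeably smaller than the first, which is the behavior described above.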

Besides taking the average of some variable in a sample, we may also wish to examine the proportion of individuals in a sample who have some characteristic. This is similar to the binomial processes we studied in Section 3.2. The only real difference is that instead of looking at \(x\text{,}\) the number of successes, we look at \(\frac{x}{n}\text{,}\) the proportion of successes.

Definition 3.5.1.

If a sample of size \(n\) is drawn from a population and \(x\) of those \(n\) individuals have some characteristic, then the sample proportion with that characteristic is:

\begin{equation*} \hat p = \frac{x}{n}\text{.} \end{equation*}
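As a quick illustration of the definition, here is a minimal sketch that computes a sample proportion. The function name `sample_proportion` and the numbers used (12 individuals with the characteristic in a sample of 40) are hypothetical and chosen only for the example.

```python
def sample_proportion(x: int, n: int) -> float:
    """Return p-hat = x / n for x successes in a sample of size n."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    if not 0 <= x <= n:
        raise ValueError("x must be between 0 and n")
    return x / n


# Hypothetical data: 12 of the 40 sampled individuals have the characteristic.
p_hat = sample_proportion(12, 40)
print(p_hat)  # 0.3
```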

In this section, we will develop our understanding of how these sample statistics, \(\overline x\) and \(\hat p\text{,}\) behave. In particular, we will introduce an extremely important theorem called the central limit theorem.

Objectives

After finishing this section you should be able to

describe the following terms:
- central limit theorem
- criteria for applying the central limit theorem
- sample mean
- sample proportion
- sampling distribution for a mean
- sampling distribution for a proportion
accomplish the following tasks:
- Understand the concept of a distribution of sample means
- Understand the central limit theorem
- Use the sampling distribution of the mean to determine the likelihood of a given sample mean or range of sample means
- Use the sampling distribution for a proportion

Subsection 3.5.1 Sample Means

To better understand how means of samples behave, let's take a look at a sampling example.

Example 3.5.2. Finding the Sampling Distribution of the Mean.

You wish to draw samples of size three, without replacement, from a finite population of values \(\lbrace 0, 3, 6, 9, 12 \rbrace\text{.}\)

What are the possible samples of size three and what are their sample means?
If \(\overline{X}\) is a random variable whose value is the mean of the sample, give the probability distribution for \(\overline{X}\text{.}\)

Solution

Our first task is to list all possible samples of size three and compute their means. We know that there are \(C(5,3) = 10\) of them, and they are:

Sample	Sample Mean (\(\overline x\))
\(\lbrace 0, 3, 6 \rbrace\)	\(3\)
\(\lbrace 0, 3, 9 \rbrace\)	\(4\)
\(\lbrace 0, 3, 12 \rbrace\)	\(5\)
\(\lbrace 0, 6, 9 \rbrace\)	\(5\)
\(\lbrace 0, 6, 12 \rbrace\)	\(6\)
\(\lbrace 0, 9, 12 \rbrace\)	\(7\)
\(\lbrace 3, 6, 9 \rbrace\)	\(6\)
\(\lbrace 3, 6, 12 \rbrace\)	\(7\)
\(\lbrace 3, 9, 12 \rbrace\)	\(8\)
\(\lbrace 6, 9, 12 \rbrace\)	\(9\)

Table 3.5.3. Samples and their Means
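The listing in Table 3.5.3 can be reproduced programmatically. The sketch below uses `itertools.combinations` from the Python standard library to enumerate the unordered samples of size three; the variable names are our own, and exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction
from itertools import combinations

population = [0, 3, 6, 9, 12]
sample_size = 3

# Every unordered sample of size three, drawn without replacement.
samples = list(combinations(population, sample_size))

# Map each sample to its mean, kept as an exact fraction.
means = {s: Fraction(sum(s), sample_size) for s in samples}

print(f"number of samples: {len(samples)}")
for s, m in means.items():
    print(s, m)
```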

Our final task is to construct a probability distribution for \(\overline X\text{,}\) the sample mean. This is done by counting the number of samples that produce each mean and dividing by the total of \(10\) possible samples. While we are at it, we'll also find the expected value of \(\overline X\text{.}\)

\(\overline x\)	\(P\left(\overline X = \overline x\right)\)	\(\overline x \times P\left(\overline X = \overline x\right)\)
\(3\)	\(\frac{1}{10}\)	\(\frac{3}{10}\)
\(4\)	\(\frac{1}{10}\)	\(\frac{4}{10}\)
\(5\)	\(\frac{2}{10}\)	\(\frac{10}{10}\)
\(6\)	\(\frac{2}{10}\)	\(\frac{12}{10}\)
\(7\)	\(\frac{2}{10}\)	\(\frac{14}{10}\)
\(8\)	\(\frac{1}{10}\)	\(\frac{8}{10}\)
\(9\)	\(\frac{1}{10}\)	\(\frac{9}{10}\)
\(E(\overline{X}) =\)		\(6\)

Table 3.5.4. Distribution of Sample Means
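The counting behind Table 3.5.4 can also be sketched in code. The example below assumes, as in the solution, that each of the ten samples is equally likely; it tallies how often each mean occurs, converts the tallies to probabilities, and sums \(\overline x \times P(\overline X = \overline x)\) to get the expected value. The variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

population = [0, 3, 6, 9, 12]

# Mean of every unordered sample of size three.
sample_means = [Fraction(sum(s), 3) for s in combinations(population, 3)]

# Count how many samples produce each mean, then divide by the total.
counts = Counter(sample_means)
total = len(sample_means)
distribution = {m: Fraction(c, total) for m, c in sorted(counts.items())}

# Expected value: sum of (value times probability).
expected_value = sum(m * p for m, p in distribution.items())

for m, p in distribution.items():
    print(f"P(X-bar = {m}) = {p}")
print(f"E(X-bar) = {expected_value}")
```

Note that `Fraction` reduces automatically, so a probability of \(\frac{2}{10}\) is printed as `1/5`.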

Note that several interesting things have happened. First, when we take the means of all of our possible samples, we see repetition. That is, more than one sample produces each of the values \(5\text{,}\) \(6\text{,}\) and \(7\) as a mean. Consider the graphs below, which compare the individual values from the population, whose distribution is uniform, with the sample means, whose distribution is starting to look more mound shaped.

Figure 3.5.5. Distributions Based on \(\lbrace 0, 3, 6, 9, 12\rbrace\text{:}\) (a) Individual Values; (b) Sample Means

Also notice that the distribution of sample means has less variation than the distribution of individual values from the population. Another interesting thing that has happened is that the expected value of \(\overline X\) is \(6\text{,}\) which is exactly the population mean. That is, when we take the mean of the sample means, the average of the averages, we get the same thing as the mean of the entire population.
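Both observations can be checked exactly for this small population. The sketch below computes the mean and the population variance of the individual values and of the ten sample means; the helper `pvariance` is our own illustrative function, not something defined in this section.

```python
from fractions import Fraction
from itertools import combinations


def pvariance(values):
    """Population variance: average squared deviation from the mean."""
    values = list(values)
    mu = Fraction(sum(values), len(values))
    return sum((v - mu) ** 2 for v in values) / len(values)


population = [0, 3, 6, 9, 12]
sample_means = [Fraction(sum(s), 3) for s in combinations(population, 3)]

population_mean = Fraction(sum(population), len(population))
mean_of_means = sum(sample_means) / len(sample_means)

pop_var = pvariance(population)
means_var = pvariance(sample_means)

print(f"population mean = {population_mean}, mean of sample means = {mean_of_means}")
print(f"variance of individual values = {pop_var}")
print(f"variance of sample means      = {means_var}")
```

For this population the two means agree at \(6\text{,}\) while the variance drops from \(18\) for the individual values to \(3\) for the sample means.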

These two observations together will play an important role on the next page, where we introduce one of the most important theorems in statistics.

Figure 3.5.6. Distribution of Sample Means I

Figure 3.5.7. Distribution of Sample Means II

Checkpoint 3.5.8.

Samples of size three are drawn from the finite population \(\lbrace 1, 3, 5, 7, 9 \rbrace\) without replacement.

Question: What is the mean of the sampling distribution of the sample mean?
