Section 3.4 Normal Approximation to the Binomial Distribution
Approximating the Binomial Distribution.
In this section we will see how the normal distribution can be used to approximate probabilities from the binomial distribution. It may seem strange that we would want to approximate binomial probabilities. After all, we can compute their exact value using the binomial probability formula.
To see why an approximation may be useful, consider the following example.
Example 3.4.1. Recognizing a Complex Binomial Probability Computation.
A recent study has determined that 32.2% of Americans are obese. A research group wishing to study this phenomenon samples \(12,000\) individuals in a large metropolitan area. Describe how to find the probability that no more than \(3750\) of these individuals are obese, but do not perform the actual computation.
In order to find this probability using the binomial probability formula above, we need to find the sum:
\begin{equation*} P(X \leq 3750) = \sum_{k=0}^{3750} C(12000,k)(0.322)^{k}(0.678)^{12000-k}\text{.} \end{equation*}
This involves 3751 instances of the binomial probability formula, and would take a lot of time.
If it were possible, we would certainly be interested in being able to approximate the sum above if it can save us from performing 3751 separate computations. Even using a computer, this process would be time consuming. In this section we will learn when we can use the normal distribution to approximate the binomial distribution, as well as how to carry out that approximation.
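To get a feel for what such a computation involves, here is a minimal sketch in Python that sums all 3751 terms exactly; it is not part of the original example, and it assumes the scipy library is available.

```python
# A minimal sketch (assumes scipy is installed): sum the 3751 binomial terms
# from Example 3.4.1 exactly rather than by hand.
from scipy.stats import binom

n, p = 12000, 0.322
exact = binom.cdf(3750, n, p)   # P(X <= 3750) = P(X=0) + P(X=1) + ... + P(X=3750)
print(round(exact, 4))          # roughly 0.013; compare with the approximation found later
```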
Objectives
After finishing this section you should be able to
-
describe the following terms:
continuity correction
criteria for approximation
normal approximation to the binomial distribution
-
accomplish the following tasks:
Determine if it is appropriate to use the normal approximation
Correctly apply the continuity correction
Use the normal distribution to approximate binomial probabilities
Subsection 3.4.1 Visualizing the Binomial Distribution
Before we even start talking about how we can approximate binomial probabilities using the normal distribution, let's think a little about why we can. Below are three probability histograms for a binomial random variable \(X\) resulting from \(n = 10\) trials. The one on the left shows the distribution of \(X\) when \(p = 0.1\text{,}\) the one in the middle when \(p = 0.5\text{,}\) and the one on the right when \(p = 0.9\text{.}\)
Which of these distributions would we call mound-shaped? The one in the middle appears to be the most mound-shaped of the three. The other two are skewed either to the right or to the left. Note that the one in the middle corresponds to a probability of success of \(p = 0.5\text{.}\) The binomial distribution looks the most like the normal distribution when \(p = 0.5\text{.}\) However, as \(n\) increases, the value of \(p\) becomes less important. Consider the distributions below with the same values of \(p\text{,}\) but with \(n = 80\text{.}\)
Notice that with the larger value of \(n\text{,}\) all three of these probability histograms look pretty mound shaped. Also notice that as \(n\) increases, the number of bars increases as well, and the distribution of probabilities starts to look less stair-stepped, and more like a smooth curve. Try playing with this yourself by performing the following steps.
Open the interactive binomial distribution page.
Change the value of \(p\) (in the bottom right-hand corner) to several different percents to see what happens (for example, try 25, 50, and 75).
Change the value of \(n\) (in the bottom left-hand corner) to several different numbers to see what happens (for example, try \(n=10\text{,}\) \(30\text{,}\) \(50\text{,}\) and so on).
Try different combinations of \(n\) and \(p\) and notice how mound-shaped or skewed the distribution looks.
Finally, click the “Show Normal Curve” button to see how the normal curve “fits” on top of the binomial probability histogram.
Hopefully you have noticed that the larger \(n\) is and the closer \(p\) is to \(0.5\text{,}\) the smaller the “gap” is between the normal curve and the bars. That is, less of each bar sticks up above the normal curve, and less of the curve is left unfilled above the bars. The smaller this “gap” is, the better our approximation will be.
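If you prefer to experiment offline, the following Python sketch mimics the interactive page; it assumes numpy, scipy, and matplotlib are installed, and the variable names are only illustrative.

```python
# Draw a binomial probability histogram and overlay the matching normal curve.
# A rough stand-in for the interactive page; assumes numpy, scipy, matplotlib.
import numpy as np
from scipy.stats import binom, norm
import matplotlib.pyplot as plt

n, p = 80, 0.5                    # try n = 10, 30, 50, ... and p = 0.25, 0.5, 0.75
k = np.arange(n + 1)
plt.bar(k, binom.pmf(k, n, p), width=1.0, alpha=0.5, label="binomial probabilities")

mu, sigma = n * p, np.sqrt(n * p * (1 - p))
x = np.linspace(0, n, 400)
plt.plot(x, norm.pdf(x, mu, sigma), color="red", label="normal curve")
plt.legend()
plt.show()
```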
Checkpoint 3.4.5.
Let \(X\) be a binomial random variable with \(n\) trials and a probability of success \(p\text{.}\)
Question: which values for \(n\) and \(p\) will produce the most mound-shaped probability histogram?
\(n=500, p=0.5\)
\(n=10, p=0.5\)
\(n=100, p=0.2\)
\(n=50, p=0.85\)
(a)
Checkpoint 3.4.6.
The shape of a binomial distribution histogram depends not only on the value of \(p\text{,}\) but also on the size of \(n\text{.}\) In fact, as the size of \(n\) increases, the distribution looks less “stair-stepped” and more like a smooth probability density curve.
Question: is the above statement true or false?
true
Checkpoint 3.4.7.
Let \(X\) be a binomial random variable with \(n\) trials, probability of success \(p\text{,}\) and a probability of failure \(q = 1-p\text{.}\)
Question: which of the following will make the probability histogram for \(X\) more mound–shaped?
Making \(n\) larger
Making \(n\) smaller
Making \(p\) closer to \(1\)
Making \(p\) closer to \(0\)
Making \(q\) closer to \(0.5\)
Making \(q\) closer to \(1\)
(a) and (e)
Subsection 3.4.2 When can We Approximate?
If the binomial distribution can take on so many different shapes depending on the parameters \(n\) and \(p\text{,}\) when is it enough like the mound-shaped normal distribution to allow us to use the normal distribution to approximate probabilities?
Principle 3.4.8. Criteria for Approximation.
A normal distribution can be used to approximate binomial probabilities as long as both \(n\times p\) and \(n\times q\) are greater than \(5\text{.}\)
Of course, the larger \(n\times p\) and \(n\times q\) get, the better the approximation will be. However, the criteria above tell us when an approximation will be “good enough” for us to use. Notice that these criteria have nothing to do with the specific probability that we want to compute. It doesn't matter if we are looking for the probability that \(X\) is greater than a number, less than a number, or between two numbers. What matters is how large \(n\times p\) and \(n\times q\) are. This is because, as we saw in the previous subsection, these two parameters control the shape of the binomial probability histogram, and that shape is what either matches a normal distribution well or does not.
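To see how mechanical this check is, here is a small Python helper; the function name is ours and is only an illustration of the criteria, not standard notation.

```python
# Check the criteria for approximation: both n*p and n*q must be greater than 5.
# The helper name is illustrative only.
def can_use_normal_approximation(n, p):
    q = 1 - p
    return n * p > 5 and n * q > 5

print(can_use_normal_approximation(50, 0.93))   # False; see Example 3.4.9 below
print(can_use_normal_approximation(20, 0.60))   # True; see Checkpoint 3.4.15 below
```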
Example 3.4.9. Determining if We Can Approximate.
A binomial random variable \(X\) is the result of \(50\) trials in which the probability of a success is \(0.93\text{.}\) We wish to approximate the probability that \(X\) is at least \(30\text{.}\) Can we do this using a normal distribution?
To answer this we check both \(n\times p\) and \(n\times q\text{.}\)
\(n\times p = 50(0.93) = 46.5\) which is greater than \(5\text{,}\) so we are okay here.
However, \(n\times q = 50(1-0.93) = 50(0.07) = 3.5\text{,}\) which is not greater than \(5\text{.}\) Therefore, we can not use a normal distribution to approximate this binomial probability.
Example 3.4.10. Determining a Minimum Number of Trials to Approximate.
A factory has determined that its manufacturing process produces bad widgets 0.5% of the time. Suppose that they wish to take a sample of \(n\) widgets to run quality control tests. How many widgets must they sample before they can use a normal approximation to get probabilities?
We must ensure that \(n\times p\) and \(n\times q\) are both greater than \(5\text{.}\)
-
To get \(n\times p > 5\text{,}\) we solve
\begin{equation*} n(0.005) > 5 \Rightarrow n > 1000\text{.} \end{equation*} -
To get \(n\times q > 5\text{,}\) we solve
\begin{equation*} n(0.995) > 5 \Rightarrow n > 5.02\text{.} \end{equation*}
Taking the larger of these two, the factory must sample at least 1001 widgets to be able to use the normal approximation to compute probabilities.
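The same algebra can be checked numerically. The short sketch below, which assumes the example's defect rate of \(p = 0.005\text{,}\) finds the smallest whole number of trials satisfying both inequalities.

```python
# Smallest integer n with n*p > 5 and n*q > 5, for the widget example's p = 0.005.
import math

p = 0.005
q = 1 - p
n_min = math.floor(max(5 / p, 5 / q)) + 1   # smallest integer strictly above both bounds
print(n_min)   # 1001
```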
Checkpoint 3.4.13.
A binomial random variable \(X\) comes from a process with \(n=20\) trials in which the probability of a success is \(p=0.15\text{.}\)
Question: can we use a normal distribution to approximate probabilities for \(X\text{?}\)
No
Checkpoint 3.4.14.
A binomial random variable \(X\) comes from a process with \(n=20\) trials in which the probability of a success is \(p=0.8\text{.}\)
Question: can we use a normal distribution to approximate probabilities for \(X\text{?}\)
No
Checkpoint 3.4.15.
A binomial random variable \(X\) comes from a process with \(20\) trials in which the probability of a success is \(p=0.60\text{.}\)
Question: can we use a normal distribution to approximate probabilities for \(X\text{?}\)
Yes
Subsection 3.4.3 Continuity Correction
There is one final issue to address before we are ready to start approximating binomial probabilities with the normal distribution. This issue has to do with the fact that we are using a continuous probability density curve to approximate a discrete random variable. To see why this may be a problem, consider the following picture. Suppose that the binomial random variable \(X\) (shown by the bar) is being approximated using a normal random variable \(Y\) (shown by the curve).
In the normal distribution, the probability \(P(Y=10)\) is exactly zero because it is a single line. In the binomial distribution, however, the entire green bar represents \(P(X=10)\text{.}\) If we wish to use the normal distribution to find \(P(X=10)\) in the binomial distribution, we need to translate this bar into a range of \(Y\) values. That range will extend from the left edge of the bar, at \(10 - 0.5\text{,}\) to the right edge of the bar, at \(10 + 0.5\text{.}\) Therefore,
\begin{equation*} P(X = 10) \approx P(9.5 \lt Y \lt 10.5)\text{.} \end{equation*}
When we change a whole number into a range like this we are correcting for the fact that we use a continuous random variable in place of a discrete random variable.
Definition 3.4.17.
When we add or subtract \(0.5\) to a whole number as we approximate a binomial probability using a normal probability distribution, we are using a continuity correction.
In the following examples, we will apply the continuity correction to translate a probability statement for a discrete random variable \(X\) into an approximately equivalent statement for a continuous random variable \(Y\text{.}\)
Example 3.4.18. Applying a Continuity Correction.
A binomial random variable \(X\) is to be approximated by a normal random variable \(Y\text{.}\) Convert each of the probability statements about a value or range of values for \(X\) into a statement about an approximately equivalent range of values for \(Y\text{.}\)
\(P(X \gt 26)\)
\(P(X \leq 60)\)
\(P(19 \leq X \lt 24)\)
To help us make the translation, we will draw example pictures for each one of these ranges. We will indicate the associated area under the normal curve using diagonal blue lines.
-
\(P(X \gt 26)\).
We need to take the bar for 26 in the binomial distribution and shade everything to the right of that bar, but not the bar itself. So, using the right edge of the bar, which is at 26.5, we translate this into \(P(Y > 26.5)\text{.}\)
Figure 3.4.19. -
\(P(X \leq 60)\).
In this example, we want to shade everything less than and including the bar for 60. Since we want to include that bar and everything below, we use the range \(P(Y \lt 60.5)\text{.}\)
Figure 3.4.20. -
\(P(19 \leq X \lt 24)\).
For the lower limit, we want to include the 19 bar, so we subtract 0.5 and start with 18.5. For the upper limit we do not want to include the bar for 24, so we only go up to 24 - 0.5, or 23.5. This makes the range \(P(18.5 \lt Y \lt 23.5)\text{.}\)
Figure 3.4.21.
One caution on using the continuity correction: it should only be applied when we are approximating a binomial distribution with a normal distribution. If we start with a normal distribution, then the variable is already continuous, and no correction is needed.
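The translation rules above can also be summarized in a few lines of code. The sketch below is only an illustration of the bookkeeping; the function and its arguments are our own naming and are not used elsewhere in this text.

```python
# Apply the continuity correction: shift each integer bound by 0.5 so that
# whole bars are either fully included or fully excluded. Illustrative only.
def corrected_bounds(lower=None, upper=None, include_lower=True, include_upper=True):
    low = high = None
    if lower is not None:
        low = lower - 0.5 if include_lower else lower + 0.5
    if upper is not None:
        high = upper + 0.5 if include_upper else upper - 0.5
    return low, high

print(corrected_bounds(lower=26, include_lower=False))            # P(X > 26)    -> Y > 26.5
print(corrected_bounds(upper=60))                                 # P(X <= 60)   -> Y < 60.5
print(corrected_bounds(lower=19, upper=24, include_upper=False))  # 19 <= X < 24 -> 18.5 < Y < 23.5
```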
Checkpoint 3.4.24.
A binomial random variable \(X\) is to be approximated by a normal random variable \(Y\) in order to find \(P(12 \lt X \leq 17)\text{.}\)
Question: which of the following is the correct range for \(Y\text{?}\)
\(P(11.5 \lt Y \lt 16.5)\)
\(P(11.5 \lt Y \lt 17.5)\)
\(P(12 \lt Y \lt 17)\)
\(P(12.5 \lt Y \lt 16.5)\)
\(P(12.5 \lt Y \lt 17.5)\)
(e)
Checkpoint 3.4.25.
A normal random variable \(Y\) is to be used to approximate \(P(X > 45)\) for a binomial random variable \(X\text{.}\)
Question: what is the approximately equivalent probability statement for \(Y\text{?}\)
\(P(Y>45.5)\)
Checkpoint 3.4.26.
When using a normal distribution to find \(P(a \lt X \lt b)\text{,}\) we should always use a continuity correction.
Question: is this statement true or false?
False
Subsection 3.4.4 Normal Approximations
We now have all of the tools in place to use a normal distribution to approximate probabilities for a binomial distribution. Recall that a binomial random variable has mean and standard deviation as shown below.
\begin{align*} \mu \amp = n\times p\\ \sigma \amp = \sqrt{n\times p\times q} \end{align*}
Theorem 3.4.27. Normal Approximation to the Binomial Distribution.
If \(X\) is a binomial random variable for a binomial process involving \(n\) trials in which the probability of a success in each trial is \(p\text{,}\) then probabilities for \(X\) can be approximated by a normal distribution with mean and standard deviation
\begin{align*} \mu \amp = n\times p\\ \sigma \amp = \sqrt{n\times p\times q} \end{align*}
provided that \(n\times p\) and \(n\times q\) are both greater than \(5\text{.}\)
So when we use a normal distribution to approximate a binomial probability, the mean of that normal distribution will be \(\mu = n\times p\) and the standard deviation will be \(\sigma = \sqrt{n\times p\times q}\text{.}\) Putting this together with the criteria for approximation and the continuity correction, we can solve examples such as the following.
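Putting the criteria, the continuity correction, and the z-score conversion together, a computation like the one in the next example might be sketched as follows. This is our own illustration, not the text's notation, and it assumes scipy is available.

```python
# Approximate P(X >= k) for a binomial X with n trials and success probability p.
# Assumes scipy; raises an error if the criteria for approximation fail.
from math import sqrt
from scipy.stats import norm

def approx_at_least(k, n, p):
    q = 1 - p
    if not (n * p > 5 and n * q > 5):
        raise ValueError("normal approximation is not appropriate here")
    mu, sigma = n * p, sqrt(n * p * q)
    # continuity correction: "at least k" for X becomes Y > k - 0.5
    return 1 - norm.cdf((k - 0.5 - mu) / sigma)

print(round(approx_at_least(168, 400, 0.35), 4))   # roughly 0.002; see Example 3.4.28 below
```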
Example 3.4.28. Approximating a Binomial Probability Involving “At Least”.
A binomial process involves \(400\) trials in which the probability of a success is \(p = 0.35\text{.}\) What is the probability that there are at least \(168\) successes in this process?
We first must be sure that we can use a normal approximation. To assess this, we check \(n\times p\) and \(n\times q\text{.}\)
\begin{equation*} n\times p = 400(0.35) = 140 \quad \text{ and } \quad n\times q = 400(0.65) = 260\text{.} \end{equation*}
Since both of these are greater than \(5\text{,}\) we can continue with a normal approximation.
Next, we need to compute the mean and standard deviation to use for the normal distribution. We have actually already found the mean above, but we repeat this computation together with that of the standard deviation below.
\begin{align*} \mu \amp = n\times p = 400(0.35) = 140\\ \sigma \amp = \sqrt{n\times p\times q} = \sqrt{400(0.35)(0.65)} \approx 9.5394\text{.} \end{align*}
Our final task is to use the normal distribution given by this mean and standard deviation to find \(P(X \geq 168)\text{.}\) We draw a picture to help us visualize the appropriate continuity correction, zooming in on the right tail of the distribution since 168 is above the mean of 140.
The picture and the z-score formula help us make the following computation.
\begin{align*} P(X \geq 168) \amp = P(Y \gt 167.5)\\ \amp = P\left(\frac{Y-\mu}{\sigma} \gt \frac{167.5 - 140}{9.5394}\right)\\ \amp = P(Z \gt 2.88)\\ \amp = 1 - P(Z \lt 2.88)\\ \amp = 1 - 0.9980\\ \amp = 0.0020\text{.} \end{align*}
Observe that in this question, we are actually using three different distributions. There is the binomial distribution for which we are actually wanting to find probabilities, represented by the variable \(X\text{.}\) Then there is the normal distribution we use to approximate it, represented by the variable \(Y\text{.}\) Finally, there is the standard normal distribution to which we convert in order to use the standard normal distribution table, represented by the variable \(Z\text{.}\) Our next example picks up where Example 3.4.1 left off.
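As a cross-check, again assuming scipy is available, the exact binomial probability can be computed directly and compared with the approximation above.

```python
# Exact value of P(X >= 168) for n = 400, p = 0.35, for comparison with the
# normal approximation of about 0.0020 found above. Assumes scipy.
from scipy.stats import binom

exact = 1 - binom.cdf(167, 400, 0.35)
print(round(exact, 4))   # should be close to the approximation above
```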
Example 3.4.30. Approximating a Binomial Probability Involving “No More Than”.
A recent study has determined that 32.2% of Americans are obese. A research group wishing to study this phenomenon samples \(12,000\) individuals in a large metropolitan area. What is the probability that no more than \(3750\) of these individuals are obese? Use a normal approximation.
Checking our criteria for approximating yields
\begin{equation*} n\times p = 12000(0.322) = 3864 \quad \text{ and } \quad n\times q = 12000(0.678) = 8136\text{.} \end{equation*}
Since both of these are greater than \(5\text{,}\) we may approximate this probability using a normal distribution.
That normal distribution will have a mean and standard deviation of
\begin{align*} \mu \amp = n\times p = 12000(0.322) = 3864\\ \sigma \amp = \sqrt{n\times p\times q} = \sqrt{12000(0.322)(0.678)} \approx 51.1839\text{.} \end{align*}
We note that “no more than 3750” means we want \(X\) to be less than or equal to 3750. So we draw the picture shown below to help us correctly apply the continuity correction, this time zooming in on the left tail since 3750 is less than the mean of 3864.
This gives us the following probability computation.
\begin{align*} P(X \leq 3750) \amp = P(Y \lt 3750.5)\\ \amp = P\left(\frac{Y-\mu}{\sigma} \lt \frac{3750.5 - 3864}{51.1839}\right)\\ \amp = P(Z \lt -2.22)\\ \amp = 0.0132\text{.} \end{align*}
Therefore, the probability of no more than \(3750\) obese individuals is approximately \(0.0132\text{.}\)
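This is the computation promised in Example 3.4.1, and it is also easy to verify by machine. The sketch below, which assumes scipy, compares the continuity-corrected normal approximation with the exact sum of 3751 binomial terms.

```python
# Compare the normal approximation of P(X <= 3750) with the exact binomial value
# for n = 12000, p = 0.322. Assumes scipy.
from math import sqrt
from scipy.stats import binom, norm

n, p = 12000, 0.322
mu, sigma = n * p, sqrt(n * p * (1 - p))
approx = norm.cdf((3750.5 - mu) / sigma)   # continuity-corrected normal approximation
exact = binom.cdf(3750, n, p)              # exact sum of 3751 binomial terms
print(round(approx, 4), round(exact, 4))   # both should be close to 0.0132
```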
Checkpoint 3.4.35.
A certain large hotel knows that about 7% of guests who make reservations on any given night will, for one reason or another, not show up to claim their room. Because of this, the hotel, which has 250 rooms, books a total of 260 reservations. Suppose that each of these 260 reservations can be treated as an independent Bernoulli trial.
Question: what is the probability that the hotel will be over-booked? Use a normal approximation to this binomial probability and give all four decimals of the probability from the standard normal distribution table.
0.0174
Checkpoint 3.4.36.
A binomial distribution has 500 trials and a probability of success \(p = 0.74\text{.}\) You wish to find \(P(X \lt 350)\text{.}\)
Question: what is the probability approximated by a normal distribution?
0.0183
Checkpoint 3.4.37.
A farmer knows that about 19% of his cherry crop will need to be used for juice, jams, or other products because the cherries will have split and be bruised. A large bin contains approximately 12,000 cherries. Suppose that inspecting each cherry can be thought of as an independent Bernoulli trial.
Question: what is the probability that a large bin contains more than 2400 bad cherries? Use a normal approximation to get this probability.
0.0026
Subsection 3.4.5 How Good are These Approximations?
We have made a point of identifying the probabilities we get from a normal distribution as approximations for the probabilities from a binomial distribution. Any time we are approximating, the question naturally arises, how good is that approximation? In the next several examples we will look at both the normal approximation and, with the help of a computer, the binomial probability to see how good these approximations really are.
Example 3.4.38. Approximating with Few Trials.
A baseball player gets a hit 63% of the time that he is at bat. Suppose that in a certain double header this player is at bat 14 times, and that these at-bats can be treated as a binomial process. Find the probability that he gets at least 10 hits using:
the binomial probability formula, and
a normal approximation.
From the problem statement, \(n = 14\text{,}\) \(p = 0.63\text{,}\) and \(q = 1-p=0.37\text{.}\) We want to compute \(P(X \geq 10)\text{.}\)
-
Using the binomial formula 3.2.34 yields:
\begin{align*} P(X \geq 10) \amp= P(X=10) + P(X=11) + P(X=12)\\ \amp\quad + P(X=13) + P(X=14)\\ \amp= C(14,10)(.63)^{10}(.37)^4 + C(14,11)(.63)^{11}(.37)^3\\ \amp\quad + C(14,12)(.63)^{12}(.37)^2+ C(14,13)(.63)^{13}(.37)^1\\ \amp\quad + C(14,14)(.63)^{14}(.37)^0\\ \amp\approx 0.1848 + 0.1144 + 0.0487 + 0.0128 + 0.0016\\ \amp= 0.3622\text{.} \end{align*} -
Now using a normal approximation, we first check \(n\times p\) and \(n\times q\text{.}\)
\begin{equation*} n\times p = 14(0.63) = 8.82 \quad \text{ and } \quad n\times q = 14(0.37) = 5.18\text{.} \end{equation*}Notice that \(n\times q\) in particular is only barely greater than 5. This means we can just barely use a normal approximation. Next, the mean and standard deviation are
\begin{align*} \mu \amp = n\times p = 8.82\\ \sigma \amp = \sqrt{n\times p\times q} = \sqrt{14(0.63)(0.37)} \approx 1.8065\text{.} \end{align*}We wish to know \(P(X \geq 10)\text{.}\) This region, along with the continuity correction, is illustrated below.
Figure 3.4.39. Normal Approximation Completing the computation, we get the following probability.
\begin{align*} P(X \geq 10) \amp = P(Y > 9.5)\\ \amp = P\left(\frac{Y-\mu}{\sigma} \gt \frac{9.5 - 8.82}{1.8065}\right)\\ \amp = P(Z \gt 0.38)\\ \amp = 1 - P(Z \lt 0.38)\\ \amp = 1 - 0.6480\\ \amp = 0.3520\text{.} \end{align*}The approximation of \(0.3520\) is close to the actual probability of \(0.3622\text{,}\) but we are about one one-hundredth off.
Notice that in this case, \(n\times p\) and \(n\times q\) were very close to 5. Therefore, the approximation was acceptable, but not great. In the next example, let's look at what happens when \(n\times p\) and \(n\times q\) are much larger than 5.
Example 3.4.40. Approximating with Many Trials.
Suppose that 13% of people are left-handed. In a school of 200 students, what is the probability that fewer than 20 students are left-handed? Find this probability using both:
the binomial probability formula, and
a normal approximation.
According to the problem, \(n = 200\text{,}\) \(p = 0.13\text{,}\) \(q = 1-0.13 = 0.87\text{,}\) and we want \(P(X \lt 20)\text{.}\)
-
Using the binomial probability formula, this means we need
\begin{align*} P(X=0) \amp + P(X=1) + P(X=2) + P(X=3) + P(X=4)\\ \amp + P(X=5) + P(X=6) + P(X=7) + P(X=8) + P(X=9)\\ \amp + P(X=10) + P(X=11) + P(X=12) + P(X=13) + P(X=14)\\ \amp + P(X=15) + P(X=16) + P(X=17) + P(X=18) + P(X=19)\text{.} \end{align*}In the interest of sanity, we used a computer (spreadsheet programs can perform these computations easily) to get \(P(X \lt 20) \approx 0.0817\text{.}\)
-
We first check that a normal approximation is appropriate.
\begin{equation*} n\times p = 200(0.13) = 26 \quad \text{ and } \quad n\times q = 200(0.87) = 174\text{.} \end{equation*}Both of these are much bigger than 5, so we expect a good approximation. Next the mean and standard deviation for the normal approximation must be computed.
\begin{align*} \mu \amp = n\times p = 200(0.13) = 26\\ \sigma \amp = \sqrt{n\times p\times q} = \sqrt{200(0.13)(0.87)} \approx 4.7560\text{.} \end{align*}Applying the continuity correction, we see that we need the shaded region shown below, which does not include \(X=20\text{.}\)
Figure 3.4.41. Normal Approximation This produces the following computation.
\begin{align*} P(X \lt 20) \amp = P(Y \lt 19.5)\\ \amp = P\left(\frac{Y-\mu}{\sigma} \lt \frac{19.5 - 26}{4.7560}\right)\\ \amp = P(Z \lt -1.37)\\ \amp = 0.0853\text{.} \end{align*}This approximation is accurate to within about four one-thousandths, much better than the one one-hundredth we saw in the previous example.
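The same comparison can be automated. The sketch below (scipy assumed, helper naming ours) reproduces both examples from this subsection side by side.

```python
# Exact binomial probability versus continuity-corrected normal approximation.
# Assumes scipy; the helper handles "at least k" and "fewer than k" tails.
from math import sqrt
from scipy.stats import binom, norm

def exact_and_approx(n, p, k, tail):
    mu, sigma = n * p, sqrt(n * p * (1 - p))
    if tail == "at least":            # P(X >= k) -> P(Y > k - 0.5)
        return 1 - binom.cdf(k - 1, n, p), 1 - norm.cdf((k - 0.5 - mu) / sigma)
    else:                             # "fewer than": P(X < k) -> P(Y < k - 0.5)
        return binom.cdf(k - 1, n, p), norm.cdf((k - 0.5 - mu) / sigma)

print(exact_and_approx(14, 0.63, 10, "at least"))      # Example 3.4.38: exact is about 0.3622
print(exact_and_approx(200, 0.13, 20, "fewer than"))   # Example 3.4.40: exact is about 0.0817
```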
Checkpoint 3.4.44.
The normal approximation to the binomial distribution is always accurate to at least four decimal places.
Question: is the above statement true or false?
False
Checkpoint 3.4.45.
A normal approximation to a binomial probability will be especially good when \(n\times p\) and \(n\times q\) are very close to 5.
Question: is the above statement true or false?
False
Checkpoint 3.4.46.
If it is just as easy to use the binomial probability formula to compute a binomial probability as it would be to use a normal approximation, then we should use the binomial probability formula.
Question: is the above statement true or false?
True