Problems

2016 Paper 2 Q13

D: 1600.0 B: 1516.0

probability binomial distribution Poisson distribution normal approximation Stirling's approximation continuity correction asymptotic approximation standard normal distribution

The random variable $X$ has a binomial distribution with parameters $n$ and $p$, where $n=16$ and $p=\frac12$. Show, using an approximation in terms of the standard normal density function $\displaystyle \tfrac{1}{\sqrt{2\pi}} \, \e ^{-\frac12 x^2} $, that \[ \P(X=8) \approx \frac 1{2\sqrt{2\pi}} \,. \]
By considering a binomial distribution with parameters $2n$ and $\frac12$, show that \[ (2n)! \approx \frac {2^{2n} (n!)^2}{\sqrt{n\pi}} \,. \]
By considering a Poisson distribution with parameter $n$, show that \[ n! \approx \sqrt{2\pi n\, } \, \e^{-n} \, n^n \,. \]
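The three approximations in this question can be sanity-checked numerically. The following is a minimal sketch using only Python's standard library; the helper names `binom_pmf`, `central_ratio` and `stirling_ratio` are illustrative and not part of the question. It checks the claimed values, it does not derive them.

```python
import math

def binom_pmf(n, k, p):
    """P(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

# P(X = 8) for X ~ B(16, 1/2) against the claimed value 1 / (2 sqrt(2 pi)).
exact_p8 = binom_pmf(16, 8, 0.5)
approx_p8 = 1 / (2 * math.sqrt(2 * math.pi))

def central_ratio(n):
    """(2n)! divided by the claimed approximation 2^(2n) (n!)^2 / sqrt(n pi)."""
    return math.comb(2 * n, n) * math.sqrt(n * math.pi) / 4**n

def stirling_ratio(n):
    """n! divided by the claimed approximation sqrt(2 pi n) e^(-n) n^n."""
    return math.factorial(n) / (math.sqrt(2 * math.pi * n) * math.exp(-n) * n**n)

print(f"P(X=8): exact {exact_p8:.5f}, approx {approx_p8:.5f}")
for n in (1, 10, 100):
    print(n, round(central_ratio(n), 5), round(stirling_ratio(n), 5))
```

Both ratios should move towards $1$ as $n$ grows, which is what the $\approx$ signs in the question are asserting.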


2002 Paper 2 Q12

D: 1600.0 B: 1500.6

probability approximating binomial to normal binomial distribution normal approximation continuity correction nested binomial cumulative distribution function critical evaluation

On $K$ consecutive days each of $L$ identical coins is thrown $M$ times. For each coin, the probability of throwing a head in any one throw is $p$ (where $0 < p < 1$). Show that the probability that on exactly $k$ of these days more than $l$ of the coins will each produce fewer than $m$ heads can be approximated by \[ {K \choose k}q^k(1-q)^{K-k}, \] where \[ q=\Phi\left( \frac{2h-2l-1}{2\sqrt{h} }\right), \ \ \ \ \ \ h=L\Phi\left( \frac{2m-1-2Mp}{2\sqrt{ Mp(1-p)}}\right) \] and $\Phi(\cdot)$ is the cumulative distribution function of a standard normal variate. Would you expect this approximation to be accurate in the case $K=7$, $k=2$, $L=500$, $l=4$, $M=100$, $m=48$ and $p=0.6\;$?

Solution: Let $H_i$ be the number of heads the $i$th coin throws on a given day. Then $H_i \sim B(M,p)$, and the probability that a given coin produces fewer than $m$ heads is $p_h = \P(H_i < m)$. Let $C$ be the number of coins producing fewer than $m$ heads on a given day; then $C \sim B(L, p_h)$. The probability that more than $l$ of the coins produce fewer than $m$ heads is therefore $\P(C > l)$. Finally, the probability that on exactly $k$ days more than $l$ of the coins will produce fewer than $m$ heads is: \[ \binom{K}{k} \cdot \P(C > l)^k \cdot (1-\P(C > l))^{K-k} \]

Let's start by assuming that all our binomials can be approximated by a normal distribution. $B(M,p) \approx N(Mp, Mp(1-p))$ and so, with a continuity correction: \begin{align*} p_h &= \P(H_i < m) \\ &\approx \P \l \sqrt{Mp(1-p)}Z+Mp < m-\frac12 \r \\ &= \P \l Z < \frac{2m-2Mp-1}{2\sqrt{Mp(1-p)}} \r \\ &= \Phi\l\frac{2m-2Mp-1}{2\sqrt{Mp(1-p)}} \r \end{align*}

$B(L, p_h) \approx B \l L, \P \l Z < \frac{2m-2Mp-1}{2\sqrt{Mp(1-p)}} \r\r = B(L, \frac{h}{L}) \approx N(h, \frac{h(L-h)}{L})$. Therefore \begin{align*} \P(C > l) &= 1-\P(C \leq l) \\ &\approx 1- \P \l \sqrt{\frac{h(L-h)}{L}} Z + h \leq l+\frac12 \r \\ &= 1 - \P \l Z \leq \frac{2l-2h+1}{2\sqrt{\frac{h(L-h)}{L}}}\r \\ &= 1- \Phi\l \frac{2l-2h+1}{2\sqrt{\frac{h(L-h)}{L}}} \r \\ &= \Phi\l \frac{2h-2l-1}{2\sqrt{\frac{h(L-h)}{L}}} \r \end{align*} If we can approximate $\sqrt{1-\frac{h}{L}}$ by $1$ then we obtain the approximation in the question.

Alternatively, $B(L, \frac{h}{L}) \approx Po(h)$ and $Po(h) \approx N(h,h)$ so we obtain: \begin{align*} \P(C > l) &= 1-\P(C \leq l) \\ &\approx 1 - \P \l \sqrt{h} Z +h \leq l + \frac12 \r \\ &= 1 - \P \l Z \leq \frac{2l-2h+1}{2\sqrt{h}} \r \\ &= \Phi \l \frac{2h - 2l -1}{2\sqrt{h}}\r \end{align*} as required. [I think this is what the examiners expected.]

Now consider the case $K=7$, $k=2$, $L=500$, $l=4$, $M=100$, $m=48$ and $p=0.6$. The first normal approximation depends on $Mp$ and $M(1-p)$ being large.
They are $60$ and $40$ respectively, so this is likely a good approximation. The first approximation finds that \begin{align*} h &= 500 \cdot \Phi \l \frac{2 \cdot 48 - 2 \cdot 60 - 1}{2\sqrt{24}} \r \\ &= 500 \cdot \Phi \l \frac{-25}{2 \sqrt{24}} \r \\ &\approx 500 \cdot \Phi (-2.55) \\ &\approx 500 \cdot 0.0054 \\ &\approx 2.7 \end{align*}

The normal approximation to the second binomial, $B(L, \frac{h}{L})$, will be good if its mean $h \approx 2.7$ is large, but this is quite small. Therefore, we shouldn't expect this to be a good approximation. Note also that $m = 48$ is far from the mean of $B(M,p)$ (in a normalised sense), so the first approximation is being used in the tail, where we might expect the percentage error to be large.

[Alternatively, using what I expect is the desired approach.] The approximation $B(L, \frac{h}{L}) \approx Po(h)$ is acceptable since $L > 50$ and $h < 5$. The approximation $Po(h) \approx N(h,h)$ is not acceptable since $h$ is small (in particular $h < 15$).

Finally, we can compute all these values exactly using a modern calculator. \begin{array}{l|cc} & \text{correct} & \text{approx} \\ \hline p_h & 0.005760\ldots & 0.005362\ldots \\ \P(C > l) & 0.164522\ldots & 0.133319\ldots \\ \text{ans} & 0.231389\ldots & 0.182516\ldots \end{array}

We can also see how the errors propagate by repeating each step's calculation on the assumption that the previous steps are correct, and by including the Poisson step.
\begin{array}{lccc} & \text{correct} & \text{approx} & \text{using approx } p_h \\ \hline p_h & 0.005760\ldots & 0.005362\ldots & - \\ \P(C > l)\quad [Po(h)] & 0.164522\ldots & 0.165044\ldots & 0.134293\ldots \\ \P(C > l)\quad [N(h,h)] & 0.164522\ldots & 0.169953\ldots & 0.133319\ldots \\ \P(C > l)\quad [N(h,h(1-\frac{h}{L}))] & 0.164522\ldots & 0.169255\ldots & 0.132677\ldots \\ \text{ans} & 0.231389\ldots & 0.231389\ldots \end{array}

By doing this, we discover that the largest errors actually come not from the second approximation but from the small absolute (but large relative) error in the first approximation. This is, in fact, a coincidence; we can observe it by investigating the specific values being used. The first approximation looks as follows:

You might not be able to tell, but there are actually two plots on this chart. Let's zoom in on the area we are worried about:

We can see there are small differences, which could be large in percentage terms (as we found when we computed them directly).
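The size of these tail differences can be checked directly. The sketch below is an illustration using only the standard library (the helpers `phi`, `binom_cdf` and `tail_errors` are names chosen here, not from the solution); it compares the exact $\P(H_i < m)$ with the continuity-corrected normal value at $m = 48$ and at points nearer the mean.

```python
import math

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def binom_cdf(n, k, p):
    """P(X <= k) for X ~ B(n, p), summed term by term."""
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

M, p = 100, 0.6
mu = M * p
sigma = math.sqrt(M * p * (1 - p))

def tail_errors(m):
    """Return (exact, approx, relative error) for P(H < m)."""
    exact = binom_cdf(M, m - 1, p)          # H < m means H <= m - 1
    approx = phi((m - 0.5 - mu) / sigma)    # continuity-corrected normal
    return exact, approx, (approx - exact) / exact

for m in (48, 54, 60):
    exact, approx, rel = tail_errors(m)
    print(f"m={m}: exact={exact:.6f} approx={approx:.6f} rel.err={rel:+.2%}")
```

The absolute error at $m = 48$ is tiny, but relative to a probability of about $0.006$ it is a noticeable percentage, whereas near the mean the relative error is much smaller.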

First, we can immediately see that if we just look at the distribution of $B(L, p_h)$ and $B(L, p_{h_\text{approx}})$ we get quite different results, even before we do any approximations.

If we plot the probability distribution of $B(L, p_h)$ vs $N(Lp_h, Lp_h(1-p_h))$ we find that it is not a great approximation.

However, the CDF happens to be a very good approximation *just* for the value we care about. Very lucky, but not possible for someone sitting STEP to know at the time!
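The "correct" and "approx" columns in the first table above can be reproduced in a few lines. This is a minimal sketch using only the standard library; `phi` and `binom_cdf` are helper names introduced here for illustration, and the "approx" branch follows the formula given in the question (normal, then $N(h,h)$).

```python
import math

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def binom_cdf(n, k, p):
    """P(X <= k) for X ~ B(n, p), summed term by term."""
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

K, k, L, l, M, m, p = 7, 2, 500, 4, 100, 48, 0.6

# Exact chain: coin -> day -> week.
p_exact = binom_cdf(M, m - 1, p)              # P(H_i < m)
q_exact = 1 - binom_cdf(L, l, p_exact)        # P(C > l)
ans_exact = math.comb(K, k) * q_exact**k * (1 - q_exact)**(K - k)

# Approximation from the question.
h = L * phi((2 * m - 1 - 2 * M * p) / (2 * math.sqrt(M * p * (1 - p))))
q_approx = phi((2 * h - 2 * l - 1) / (2 * math.sqrt(h)))
ans_approx = math.comb(K, k) * q_approx**k * (1 - q_approx)**(K - k)

print(f"p_h      exact={p_exact:.6f}  approx={h / L:.6f}")
print(f"P(C > l) exact={q_exact:.6f}  approx={q_approx:.6f}")
print(f"answer   exact={ans_exact:.6f}  approx={ans_approx:.6f}")
```

Replacing `h` by `L * p_exact` in the second stage isolates the error of the second approximation alone, which is the comparison made in the second table.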


1989 Paper 3 Q16

D: 1700.0 B: 1484.0

probability conditional probability approximating binomial to normal distribution normal approximation hypothesis testing population model joint probability table continuity correction

It is believed that the population of Ruritania can be described as follows:

$25\%$ are fair-haired and the rest are dark-haired;
$20\%$ are green-eyed and the rest hazel-eyed;
the population can also be divided into narrow-headed and broad-headed;
no narrow-headed person has green eyes and fair hair;
those who are green-eyed are as likely to be narrow-headed as broad-headed;
those who are green-eyed and broad-headed are as likely to be fair-haired as dark-haired;
half of the population is broad-headed and dark-haired;
a hazel-eyed person is as likely to be fair-haired and broad-headed as dark-haired and narrow-headed.

Find the proportion believed to be narrow-headed. I am acquainted with only six Ruritanians, all of whom are broad-headed. Comment on this observation as evidence for or against the given model. A random sample of 200 Ruritanians is taken and is found to contain 50 narrow-heads. On the basis of the given model, calculate (to a reasonable approximation) the probability of getting 50 or fewer narrow-heads. Comment on the result.
