Problems

3 problems found

2016 Paper 2 Q13
D: 1600.0 B: 1516.0

  1. The random variable \(X\) has a binomial distribution with parameters \(n\) and \(p\), where \(n=16\) and \(p=\frac12\). Show, using an approximation in terms of the standard normal density function $\displaystyle \tfrac{1}{\sqrt{2\pi}} \, \e ^{-\frac12 x^2} $, that \[ \P(X=8) \approx \frac 1{2\sqrt{2\pi}} \,. \]
  2. By considering a binomial distribution with parameters \(2n\) and \(\frac12\), show that \[ (2n)! \approx \frac {2^{2n} (n!)^2}{\sqrt{n\pi}} \,. \]
  3. By considering a Poisson distribution with parameter \(n\), show that \[ n! \approx \sqrt{2\pi n\, } \, \e^{-n} \, n^n \,. \]


Solution:

  1. Since \(X \sim B(16, \tfrac12)\) has mean \(8\) and variance \(4\), \(X \approx N(8, 2^2)\). Applying a continuity correction, \begin{align*} && \mathbb{P}(X = 8) &\approx \mathbb{P} \left ( 8 - \frac12 \leq 2Z + 8 \leq 8 + \frac12 \right) \\ &&&= \mathbb{P} \left (-\frac14 \leq Z \leq \frac14 \right) \\ &&&= \int_{-\frac14}^{\frac14} \frac{1}{\sqrt{2 \pi}}e^{-\frac12 x^2} \d x \\ &&&\approx \frac{1}{\sqrt{2\pi}} \int_{-\frac14}^{\frac14} 1\d x\\ &&&= \frac{1}{2 \sqrt{2\pi}} \end{align*} The last approximation replaces \(e^{-\frac12 x^2}\) by its value at \(x = 0\), which is reasonable on such a short interval.
  2. Suppose \(X \sim B(2n, \frac12)\) then \(X \approx N(n, \frac{n}{2})\), and \begin{align*} && \mathbb{P}(X = n) &\approx \mathbb{P} \left ( n - \frac12 \leq \sqrt{\frac{n}{2}} Z + n \leq n + \frac12 \right) \\ &&&= \mathbb{P} \left ( - \frac1{\sqrt{2n}} \leq Z \leq \frac1{\sqrt{2n}}\right) \\ &&&= \int_{-\frac1{\sqrt{2n}}}^{\frac1{\sqrt{2n}}} \frac{1}{\sqrt{2 \pi}} e^{-\frac12 x^2} \d x \\ &&&\approx \frac{1}{\sqrt{n\pi}}\\ \Rightarrow && \binom{2n}{n}\frac1{2^n} \frac{1}{2^n} & \approx \frac{1}{\sqrt{n \pi}} \\ \Rightarrow && (2n)! &\approx \frac{2^{2n}(n!)^2}{\sqrt{n\pi}} \end{align*}
  3. \(X \sim Po(n)\), then \(X \approx N(n, (\sqrt{n})^2)\), therefore \begin{align*} && \mathbb{P}(X = n) &\approx \mathbb{P} \left (-\frac12 \leq \sqrt{n} Z \leq \frac12 \right) \\ &&&= \int_{-\frac{1}{2 \sqrt{n}}}^{\frac{1}{2 \sqrt{n}}} \frac{1}{\sqrt{2\pi}}e^{-\frac12 x^2} \d x \\ &&&\approx \frac{1}{\sqrt{2 \pi n}} \\ \Rightarrow && e^{-n} \frac{n^n}{n!} & \approx \frac{1}{\sqrt{2 \pi n}} \\ \Rightarrow && n! &\approx \sqrt{2 \pi n} e^{-n}n^n \end{align*}
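The three approximations above can be checked numerically. The following is a minimal sketch using only Python's standard library; the function names `central_binomial_ratio` and `stirling_ratio` are illustrative choices, not part of the original solution. Each ratio compares the exact quantity with the approximation derived above and should approach 1 as \(n\) grows.

```python
import math

# Part 1: exact P(X = 8) for X ~ B(16, 1/2) against 1 / (2 sqrt(2 pi)).
exact_part1 = math.comb(16, 8) / 2**16
approx_part1 = 1 / (2 * math.sqrt(2 * math.pi))


def central_binomial_ratio(n):
    """(2n)! divided by 2^(2n) (n!)^2 / sqrt(n pi), as in part 2."""
    return math.comb(2 * n, n) / 4**n * math.sqrt(n * math.pi)


def stirling_ratio(n):
    """n! divided by sqrt(2 pi n) e^(-n) n^n, as in part 3 (computed in logs)."""
    log_approx = 0.5 * math.log(2 * math.pi * n) - n + n * math.log(n)
    return math.exp(math.lgamma(n + 1) - log_approx)


if __name__ == "__main__":
    print(exact_part1, approx_part1)
    for n in (5, 50, 500):
        print(n, central_binomial_ratio(n), stirling_ratio(n))
```

The Stirling ratio is computed through `math.lgamma` so that large \(n\) does not overflow a float.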

2002 Paper 2 Q12
D: 1600.0 B: 1500.6

On \(K\) consecutive days each of \(L\) identical coins is thrown \(M\) times. For each coin, the probability of throwing a head in any one throw is \(p\) (where \(0 < p < 1\)). Show that the probability that on exactly \(k\) of these days more than \(l\) of the coins will each produce fewer than \(m\) heads can be approximated by \[ {K \choose k}q^k(1-q)^{K-k}, \] where \[ q=\Phi\left( \frac{2h-2l-1}{2\sqrt{h} }\right), \ \ \ \ \ \ h=L\Phi\left( \frac{2m-1-2Mp}{2\sqrt{ Mp(1-p)}}\right) \] and \(\Phi(\cdot)\) is the cumulative distribution function of a standard normal variate. Would you expect this approximation to be accurate in the case \(K=7\), \(k=2\), \(L=500\), \(l=4\), \(M=100\), \(m=48\) and \(p=0.6\;\)?


Solution: Let \(H_i\) be the number of heads the \(i\)th coin throws on a given day. Then \(H_i \sim B(M,p)\), and the probability that a given coin produces fewer than \(m\) heads is \(p_h = \P(H_i < m)\). Let \(C\) be the number of coins producing fewer than \(m\) heads on a given day; then \(C \sim B(L, p_h)\). The probability that more than \(l\) of the coins produce fewer than \(m\) heads is therefore \(\P(C > l)\). Finally, since the days are independent, the probability that on exactly \(k\) days more than \(l\) of the coins will produce fewer than \(m\) heads is: \[ \binom{K}{k} \cdot \P(C > l)^k \cdot (1-\P(C > l))^{K-k} \]

Let's start by assuming that all our binomials can be approximated by normal distributions. \(B(M,p) \approx N(Mp, Mp(1-p))\) and so, with a continuity correction: \begin{align*} p_h &= \P(H_i < m) \\ &\approx \P( \sqrt{Mp(1-p)}Z+Mp < m-\frac12) \\ &= \P \l Z < \frac{2m-2Mp-1}{2\sqrt{Mp(1-p)}} \r \\ &= \Phi\l\frac{2m-2Mp-1}{2\sqrt{Mp(1-p)}} \r \end{align*} So \(p_h \approx \frac{h}{L}\), and \(B(L, p_h) \approx B \l L, \P \l Z < \frac{2m-2Mp-1}{2\sqrt{Mp(1-p)}} \r\r = B(L, \frac{h}{L}) \approx N(h, \frac{h(L-h)}{L})\). Therefore \begin{align*} \P(C > l) &= 1-\P(C \leq l) \\ &\approx 1- \P \l \sqrt{\frac{h(L-h)}{L}} Z + h \leq l+\frac12 \r \\ &= 1 - \P \l Z \leq \frac{2l-2h+1}{2\sqrt{\frac{h(L-h)}{L}}}\r \\ &= 1- \Phi\l \frac{2l-2h+1}{2\sqrt{\frac{h(L-h)}{L}}} \r \\ &= \Phi\l \frac{2h-2l-1}{2\sqrt{\frac{h(L-h)}{L}}} \r \end{align*} If we can approximate \(\sqrt{1-\frac{h}{L}}\) by \(1\) then we obtain the approximation in the question.

Alternatively, \(B(L, \frac{h}{L}) \approx Po(h)\) and \(Po(h) \approx N(h,h)\) so we obtain: \begin{align*} \P(C > l) &= 1-\P(C \leq l) \\ &\approx 1 - \P(\sqrt{h} Z +h < l + \frac12) \\ &= 1 - \P \l Z < \frac{2l-2h+1}{2\sqrt{h}} \r \\ &= \Phi \l \frac{2h - 2l -1}{2\sqrt{h}}\r \end{align*} as required. [I think this is what the examiners expected].
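The approximation in the question can be written as a short function. This is a minimal sketch, assuming only Python's standard library; the names `phi_cdf` and `approx_probability` are hypothetical, and \(\Phi\) is evaluated through the identity \(\Phi(z) = \tfrac12\left(1 + \operatorname{erf}(z/\sqrt2)\right)\).

```python
import math


def phi_cdf(z):
    """Standard normal CDF, written via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def approx_probability(K, k, L, l, M, m, p):
    """Return (h, q, probability) using the formulas stated in the question."""
    # h: approximate expected number of coins giving fewer than m heads in a day
    h = L * phi_cdf((2 * m - 1 - 2 * M * p) / (2 * math.sqrt(M * p * (1 - p))))
    # q: approximate probability that more than l coins do so on a given day
    q = phi_cdf((2 * h - 2 * l - 1) / (2 * math.sqrt(h)))
    # Binomial probability of exactly k such days out of K
    prob = math.comb(K, k) * q**k * (1 - q) ** (K - k)
    return h, q, prob


if __name__ == "__main__":
    # The specific case asked about in the question
    print(approx_probability(K=7, k=2, L=500, l=4, M=100, m=48, p=0.6))
```

Returning \(h\) and \(q\) alongside the final probability makes it easier to see which intermediate quantity is small when judging whether the approximation is trustworthy.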
Considering the case \(K=7\), \(k=2\), \(L=500\), \(l=4\), \(M=100\), \(m=48\) and \(p=0.6\): the first normal approximation depends on \(Mp\) and \(M(1-p)\) being large. They are \(60\) and \(40\) respectively, so this is likely a good approximation in absolute terms. However, since \(m = 48\) is far from the mean (in a normalised sense), we might expect the percentage error to be large. The first approximation finds that \begin{align*} h &= 500 \cdot \Phi \l \frac{2 \cdot 48 - 2 \cdot 60 - 1}{2\sqrt{24}} \r \\ &= 500 \cdot \Phi \l \frac{-25}{2 \sqrt{24}} \r \\ &\approx 500 \cdot \Phi (-2.55) \\ &\approx 500 \cdot 0.0054 \\ &\approx 2.7 \end{align*} The normal approximation to the second binomial will be good if \(500 \cdot \frac{2.7}{500} = 2.7\) is large, but this is quite small. Therefore, we shouldn't expect this to be a good approximation.

[Alternatively, using what I expect is the desired approach] The approximation \(B(L, \frac{h}{L}) \approx Po(h)\) is acceptable since \(L>50\) and \(h < 5\). The approximation \(Po(h) \approx N(h,h)\) is not acceptable since \(h\) is small (in particular \(h < 15\)).

Finally, we can compute all these values exactly using a modern calculator. \begin{array}{l|cc} & \text{correct} & \text{approx} \\ \hline p_h & 0.005760\ldots & 0.005362\ldots \\ \P(C > l) & 0.164522\ldots & 0.133319\ldots \\ \text{ans} & 0.231389\ldots & 0.182516\ldots \end{array} We can also see how the errors propagate, by doing the calculations assuming the previous steps are correct, and also including the Poisson step. 
\begin{array}{lccc} & \text{correct} & \text{approx} & \text{using approx } p_h \\ \hline p_h & 0.005760\ldots & 0.005362\ldots & - \\ \P(C > l)\quad [Po(h)] & 0.164522\ldots & 0.165044\ldots & 0.134293\ldots \\ \P(C > l)\quad [N(h,h)] & 0.164522\ldots & 0.169953\ldots & 0.133319\ldots \\ \P(C > l)\quad [N(h,h(1-\frac{h}{L}))] & 0.164522\ldots & 0.169255\ldots & 0.132677\ldots \\ \text{ans} & 0.231389\ldots & 0.231389\ldots \end{array} By doing this, we discover that the largest error actually comes not from the second approximation but from the small absolute (but large relative) error in the first approximation. This is, in fact, a coincidence, as we can see by investigating the specific values being used. The first approximation looks as follows:

TikZ diagram
You might not be able to tell, but there are actually two plots on this chart. However, let's zoom in on the area we are worried about:
TikZ diagram
We can see there are small differences, which could be large in percentage terms (as we found when we computed them directly).
TikZ diagram
First, we can immediately see that if we just look at the distribution of \(B(L, p_h)\) and \(B(L, p_{h_\text{approx}})\) we get quite different results, even before we do any approximations.
TikZ diagram
If we plot the probability distribution of \(B(L, p_h)\) vs \(N(Lp_h, Lp_h(1-p_h))\) we find that it is not a great approximation.
TikZ diagram
However, the CDF happens to be a very good approximation *just* for the value we care about. Very lucky, but not possible for someone sitting STEP to know at the time!
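The "correct" column of the tables above can be reproduced by summing binomial probabilities directly. This is a minimal sketch using only the standard library; the helper name `binom_cdf` is an illustrative choice, and the variable names mirror the symbols in the question.

```python
import math


def binom_cdf(n, p, x):
    """P(X <= x) for X ~ B(n, p), summed directly from the pmf."""
    return sum(math.comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(x + 1))


# Values from the specific case in the question
K, k, L, l, M, m, p = 7, 2, 500, 4, 100, 48, 0.6

p_h = binom_cdf(M, p, m - 1)   # P(H_i < m): one coin gives fewer than m heads
q = 1 - binom_cdf(L, p_h, l)   # P(C > l): more than l coins do so in a day
answer = math.comb(K, k) * q**k * (1 - q) ** (K - k)

if __name__ == "__main__":
    print(p_h, q, answer)
```

Python's arbitrary-precision integers keep `math.comb(500, j)` exact for the small values of `j` needed here, so no special care is required before multiplying by the floating-point powers.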

1989 Paper 3 Q16
D: 1700.0 B: 1484.0

It is believed that the population of Ruritania can be described as follows:

  1. \(25\%\) are fair-haired and the rest are dark-haired;
  2. \(20\%\) are green-eyed and the rest hazel-eyed;
  3. the population can also be divided into narrow-headed and broad-headed;
  4. no narrow-headed person has green eyes and fair hair;
  5. those who are green-eyed are as likely to be narrow-headed as broad-headed;
  6. those who are green-eyed and broad-headed are as likely to be fair-haired as dark-haired;
  7. half of the population is broad-headed and dark-haired;
  8. a hazel-eyed person is as likely to be fair-haired and broad-headed as dark-haired and narrow-headed.
Find the proportion believed to be narrow-headed. I am acquainted with only six Ruritanians, all of whom are broad-headed. Comment on this observation as evidence for or against the given model. A random sample of 200 Ruritanians is taken and is found to contain 50 narrow-heads. On the basis of the given model, calculate (to a reasonable approximation) the probability of getting 50 or fewer narrow-heads. Comment on the result.


Solution:

TikZ diagram
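Label the regions of the diagram \(a\) to \(h\). As the equations below imply, \(a, b, d, e\) are the fair-haired regions, \(b, c, e, f\) the green-eyed regions and \(d, e, f, g\) the narrow-headed regions, with \(h\) the dark-haired, hazel-eyed, broad-headed remainder. The conditions tell us: \begin{align*} && a+b+d+e &= 0.25 \\ && b+c+e+f &= 0.2 \\ && e &= 0 \\ && b+c &= e + f \\ && b &= c \\ && c+h &= 0.5 \\ && a &= g \end{align*}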
TikZ diagram
Since \(e = 0\) and \(b = c\), we have \(f = b + c = 2b\), so \(b+c+e+f = 4b = 0.2 \Rightarrow b = 0.05\)
TikZ diagram
And \begin{align*} && 0.25 &= a + d + 0.05 \\ && 1 &= 2a + d + 0.65 \\ \Rightarrow && a &= 0.15 \\ && d &= 0.05 \end{align*}
TikZ diagram
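The region values can be verified with exact rational arithmetic. This is a minimal sketch; it assumes that the narrow-headed regions are \(d, e, f, g\), which is inferred from the equations because the diagram itself is not reproduced here.

```python
from fractions import Fraction

# Green-eyed regions: e = 0, b = c, f = b + c, and b + c + e + f = 0.2
e = Fraction(0)
b = Fraction(1, 5) / 4
c = b
f = b + c - e

# Broad-headed and dark-haired: c + h = 0.5
h = Fraction(1, 2) - c

# Fair-haired total gives a + d; the whole population (with g = a) gives 2a + d
a_plus_d = Fraction(1, 4) - b - e
two_a_plus_d = 1 - (b + c + e + f + h)
a = two_a_plus_d - a_plus_d
d = a_plus_d - a
g = a

# Assumption: narrow-headed regions are d, e, f and g
narrow = d + e + f + g

if __name__ == "__main__":
    print(a, d, narrow)
```

Using `Fraction` avoids floating-point rounding, so the check that all eight regions sum to exactly 1 is meaningful.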
So the proportion who are narrow-headed is \(30\%\).

If the six acquaintances were a random sample, the probability that all six are broad-headed would be \(0.7^6 \approx 0.12\), which is fairly unlikely; but friendship groups are likely to be biased, so it's not too surprising.

Assuming there is a sufficiently large number of Ruritanians, we might model the number of narrow-headed Ruritanians in a sample of \(200\) as \(X \sim B(200, 0.3)\). Computing \(\mathbb{P}(X \leq 50)\) by hand is tricky, so let's use a normal approximation: \(X \approx N(60, 42)\) and \begin{align*} \mathbb{P}(X \leq 50) &\approx \mathbb{P} \left (Z \leq \frac{50 - 60+0.5}{\sqrt{42}} \right) \\ &\approx \mathbb{P} \left (Z \leq -\frac{9.5}{6.5} \right) \\ &\approx \mathbb{P} \left (Z \leq -\frac{3}{2} \right) \\ &\approx 6.7\% \end{align*} (without the intermediate rounding, this approximation gives \(7.1\%\) and the binomial value gives \(7.0\%\)). This also seems somewhat surprising.
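The percentages quoted for the sample of 200 can be checked directly. This is a minimal sketch using only the standard library; `phi_cdf` is a hypothetical helper name, and the model \(X \sim B(200, 0.3)\) is the one assumed in the solution.

```python
import math


def phi_cdf(z):
    """Standard normal CDF, written via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


n, p = 200, 0.3
mean = n * p                # 60
variance = n * p * (1 - p)  # 42

# Normal approximation with continuity correction: P(X <= 50) ~ Phi((50.5 - 60) / sqrt(42))
normal_approx = phi_cdf((50 + 0.5 - mean) / math.sqrt(variance))

# Exact binomial tail, summed term by term
exact = sum(math.comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(51))

# Probability that six randomly chosen people are all broad-headed
six_broad = 0.7**6

if __name__ == "__main__":
    print(normal_approx, exact, six_broad)
```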