Problems

45 problems found

2009 Paper 2 Q13
D: 1600.0 B: 1500.0

Satellites are launched using two different types of rocket: the Andover and the Basingstoke. The Andover has four engines and the Basingstoke has six. Each engine has a probability \(p\) of failing during any given launch. After the launch, the rockets are retrieved and repaired by replacing some or all of the engines. The cost of replacing each engine is \(K\). For the Andover, if more than one engine fails, all four engines are replaced. Otherwise, only the failed engine (if there is one) is replaced. Show that the expected repair cost for a single launch using the Andover is \[ 4Kp(1+q+q^2-2q^3) \ \ \ \ \ \ \ \ \ \ \ \ \ (q=1-p) \tag{*} \] For the Basingstoke, if more than two engines fail, all six engines are replaced. Otherwise only the failed engines (if there are any) are replaced. Find, in a form similar to \((*)\), the expected repair cost for a single launch using the Basingstoke. Find the values of \(p\) for which the expected repair cost for the Andover is \(\frac23\) of the expected repair cost for the Basingstoke.

2009 Paper 3 Q12
D: 1700.0 B: 1516.0

  1. Albert tosses a fair coin \(k\) times, where \(k\) is a given positive integer. The number of heads he gets is \(X_1\). He then tosses the coin \(X_1\) times, getting \(X_2\) heads. He then tosses the coin \(X_2\) times, getting \(X_3\) heads. The random variables \(X_4\), \(X_5\), \(\ldots\) are defined similarly. Write down \(\E(X_1)\). By considering \(\E(X_2 \; \big\vert \; X_1 = x_1)\), or otherwise, show that \(\E(X_2) = \frac14 k\). Find \(\displaystyle \sum_{i=1}^\infty \E(X_i)\).
  2. Bertha has \(k\) fair coins. She tosses the first coin until she gets a tail. The number of heads she gets before the first tail is \(Y_1\). She then tosses the second coin until she gets a tail and the number of heads she gets with this coin before the first tail is \(Y_2\). The random variables \(Y_3, Y_4, \ldots\;\), \(Y_k\) are defined similarly, and \(Y= \sum\limits_{i=1}^k Y_i\,\). Obtain the probability generating function of \(Y\), and use it to find \(\E(Y)\), \(\var(Y)\) and \(\P(Y=r)\).


Solution:

  1. \(X_1 \sim B(k, \tfrac12)\) so \(\E[X_1] = \frac{k}{2}\). Note that \(X_2 | X_1 = x_1 \sim B(x_1, \tfrac12)\) so \(\E[X_2 | X_1 = x_1] = \frac{x_1}{2}\), i.e. \(\E[X_2 | X_1] = \frac12 X_1\). Therefore, by the tower law, \(\E[X_2] = \E[\E[X_2|X_1]] = \E[\frac12 X_1] = \frac14k\). The same argument gives \(\E[X_n] = \frac12 \E[X_{n-1}]\), so \(\E[X_n] = \frac1{2^n} k\) and \begin{align*} && \sum_{i=1}^\infty \E[X_i] &= \sum_{i=1}^{\infty} \frac1{2^i} k \\ &&&= \frac{\frac12 k}{1-\frac12} = k \end{align*}
  2. Note that \(Y_1 \sim Geo(\tfrac12)-1\), i.e. \(Y_1 = G - 1\) where \(G \sim Geo(\tfrac12)\) counts the tosses up to and including the first tail. This has generating function \(\E[t^{Y_1}] = \E[t^{G-1}] = \frac{\frac12 t}{1-(1-\frac12)t}\frac1{t} = \frac{\frac12}{1-\frac12t}\). Since the \(Y_i\) are independent, \begin{align*} && G(t) = \E \left [ t^Y \right] &= \E \left [ t^{\sum_{i=1}^kY_i} \right] \\ &&&= \prod_{i=1}^k \E[t^{Y_i}] \\ &&&= \frac{1}{(2-t)^k} \end{align*} Therefore \(\E[Y] = G'(1) = k(2-1)^{-(k+1)} = k\) and \(\E[Y^2] = (tG'(t))'|_{t=1} = k(k+1)(2-1)^{-(k+2)}+k(2-1)^{-(k+1)} = k^2+2k\), so \(\var[Y] = k^2+2k - k^2 =2 k\). Finally, expanding \(G(t) = 2^{-k}(1-\tfrac12 t)^{-k}\) and reading off the coefficient of \(t^r\) gives \(\mathbb{P}(Y=r) = \binom{k+r-1}{r} \frac{1}{2^{r+k}}\).
[Note: this second distribution is a negative binomial distribution]
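The coefficient extraction in part 2 can be checked mechanically. The following is a minimal sketch (the function names are illustrative, not part of the original solution): it builds the exact distribution of \(Y\) by convolving \(k\) copies of the distribution \(\mathbb{P}(Y_i = j) = 2^{-(j+1)}\) and compares it with the closed form \(\binom{k+r-1}{r} 2^{-(r+k)}\).

```python
from fractions import Fraction
from math import comb


def pmf_sum_of_shifted_geometrics(k, r_max):
    """Exact P(Y = r) for r = 0..r_max, where Y is a sum of k independent
    variables with P(Y_i = j) = (1/2)^(j+1), built by repeated convolution."""
    single = [Fraction(1, 2 ** (j + 1)) for j in range(r_max + 1)]
    dist = [Fraction(1)] + [Fraction(0)] * r_max  # distribution of the empty sum
    for _ in range(k):
        dist = [
            sum(dist[i] * single[r - i] for i in range(r + 1))
            for r in range(r_max + 1)
        ]
    return dist


def closed_form(k, r):
    """Coefficient of t^r in (2 - t)^(-k)."""
    return Fraction(comb(k + r - 1, r), 2 ** (r + k))


k, r_max = 4, 12  # arbitrary illustrative values
pmf = pmf_sum_of_shifted_geometrics(k, r_max)
matches = all(pmf[r] == closed_form(k, r) for r in range(r_max + 1))
print(matches)
```

Truncating at `r_max` is harmless here because the probability of \(Y = r\) only depends on terms with index at most \(r\).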

2006 Paper 2 Q12
D: 1600.0 B: 1516.0

A cricket team has only three bowlers, Arthur, Betty and Cuba, each of whom bowls 30 balls in any match. Past performance reveals that, on average, Arthur takes one wicket for every 36 balls bowled, Betty takes one wicket for every 25 balls bowled, and Cuba takes one wicket for every 41 balls bowled.

  1. In one match, the team took exactly one wicket, but the name of the bowler was not recorded. Using a binomial model, find the probability that Arthur was the bowler.
  2. Show that the average number of wickets taken by the team in a match is approximately 3. Give with brief justification a suitable model for the number of wickets taken by the team in a match and show that the probability of the team taking at least five wickets in a given match is approximately \(\frac15\). [You may use the approximation \(\e^3 = 20\).]


Solution:

  1. Write \(N = \frac{30 \cdot 35^{30} \cdot 24^{30} \cdot 40^{30}}{36^{30} \cdot 25^{30} \cdot 41^{30}}\) for the factor common to all three cases. \begin{align*} && \mathbb{P}(\text{Arthur took wicket and exactly one wicket}) &= \binom{30}{1} \frac{1}{36} \left ( \frac{35}{36} \right)^{29} \binom{30}{0} \left ( \frac{24}{25} \right)^{30} \binom{30}{0} \left ( \frac{40}{41} \right)^{30}\\ &&&= \frac{30 \cdot 35^{29} \cdot 24^{30} \cdot 40^{30}}{36^{30} \cdot 25^{30} \cdot {41}^{30}}\\ &&&= \frac{1}{35} N\\ && \mathbb{P}(\text{Betty took wicket and exactly one wicket}) &= \binom{30}{0}\left ( \frac{35}{36} \right)^{30} \binom{30}{1} \frac{1}{25} \left ( \frac{24}{25} \right)^{29} \binom{30}{0} \left ( \frac{40}{41} \right)^{30}\\ &&&= \frac{1}{24} N \\ && \mathbb{P}(\text{Cuba took wicket and exactly one wicket}) &= \binom{30}{0}\left ( \frac{35}{36} \right)^{30} \binom{30}{0}\left ( \frac{24}{25} \right)^{30} \binom{30}{1} \frac{1}{41} \left ( \frac{40}{41} \right)^{29}\\ &&&= \frac{1}{40} N \\ && \mathbb{P}(\text{Arthur took wicket} | \text{exactly one wicket}) &= \frac{ \mathbb{P}(\text{Arthur took wicket and exactly one wicket}) }{ \mathbb{P}(\text{exactly one wicket}) } \\ &&&= \frac{ \frac{1}{35} N}{\frac1{35} N + \frac{1}{24}N + \frac{1}{40} N} \\ &&&= \frac{3}{10} \end{align*} Alternatively, we could look at: \begin{align*} && \mathbb{P}(X_A = 1 | X_A + X_B + X_C =1) &= \frac{\mathbb{P}(X_A = 1, X_B = 0,X_C = 0)}{\mathbb{P}(X_A = 1, X_B = 0,X_C = 0)+\mathbb{P}(X_A = 0, X_B = 1,X_C = 0)+\mathbb{P}(X_A = 0, X_B = 0,X_C = 1)} \\ &&&= \frac{\frac{\mathbb{P}(X_A = 1)}{\mathbb{P}(X_A=0)}}{\frac{\mathbb{P}(X_A = 1)}{\mathbb{P}(X_A=0)}+\frac{\mathbb{P}(X_B = 1)}{\mathbb{P}(X_B=0)}+\frac{\mathbb{P}(X_C = 1)}{\mathbb{P}(X_C=0)}} \end{align*} and we can calculate these relative likelihoods in a similar way to above.
  2. \(\,\) \begin{align*} && \mathbb{E}(\text{number of wickets}) &= \mathbb{E} \left ( \sum_{i=1}^{90} \mathbb{1}_{i\text{th ball is a wicket}} \right) \\ &&&= \sum_{i=1}^{90} \mathbb{E} \left (\mathbb{1}_{i\text{th ball is a wicket}} \right) \\ &&&= 30 \cdot \frac{1}{36} + 30 \cdot \frac{1}{25} + 30 \cdot \frac{1}{41} \\ &&&\approx 1 + 1 + 1 = 3 \end{align*} We might model the number of wickets taken as \(Po(\lambda)\), where \(\lambda\) is the average number of wickets taken. We can think of this roughly as the Poisson approximation to the binomial where \(N\) is large and \(Np\) is small. Assuming we use \(Po(3)\) we have \begin{align*} && \mathbb{P}(\text{at least 5 wickets}) &= 1-\mathbb{P}(\text{4 or fewer wickets}) \\ &&&= 1- e^{-3} \left (1 + \frac{3}{1} + \frac{3^2}{2} + \frac{3^3}{6} + \frac{3^4}{24} \right) \\ &&&= 1 - \frac{1}{20} \left ( 1 + 3 + \frac{9}{2} + \frac{9}{2} + \frac{27}{8} \right) \\ &&&= 1 - \frac{1}{20} \left (13 + 3\tfrac38 \right) \\ &&&\approx 1 - \frac{16}{20} = \frac15 \end{align*}
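Both parts can be checked numerically. The sketch below (variable and function names are illustrative) computes the posterior probability for Arthur exactly with rational arithmetic, then evaluates the mean number of wickets and the \(Po(3)\) tail without the \(\e^3 = 20\) shortcut.

```python
from fractions import Fraction
from math import exp, factorial

BALLS = 30
rates = {
    "Arthur": Fraction(1, 36),
    "Betty": Fraction(1, 25),
    "Cuba": Fraction(1, 41),
}


def exactly_one_by(name):
    """P(the named bowler takes one wicket and the other two take none)."""
    prob = Fraction(1)
    for bowler, p in rates.items():
        if bowler == name:
            prob *= BALLS * p * (1 - p) ** (BALLS - 1)
        else:
            prob *= (1 - p) ** BALLS
    return prob


joint = {name: exactly_one_by(name) for name in rates}
posterior_arthur = joint["Arthur"] / sum(joint.values())

# Part 2: mean number of wickets and the Po(3) tail, using the true value of e^-3
mean_wickets = float(sum(BALLS * p for p in rates.values()))
tail_po3 = 1 - exp(-3) * sum(3 ** j / factorial(j) for j in range(5))

print(posterior_arthur, mean_wickets, tail_po3)
```

The tail comes out a little above the hand value obtained with \(\e^3 = 20\), but still close to \(\frac15\).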

2006 Paper 3 Q12
D: 1700.0 B: 1500.0

Fifty times a year, 1024 tourists disembark from a cruise liner at a port. From there they must travel to the city centre either by bus or by taxi. Tourists are equally likely to be directed to the bus station or to the taxi rank. Each bus of the bus company holds 32 passengers, and the company currently runs 15 buses. The company makes a profit of \(\pounds\)1 for each passenger carried. It carries as many passengers as it can, with any excess being (eventually) transported by taxi. Show that the largest annual licence fee, in pounds, that the company should consider paying to be allowed to run an extra bus is approximately \[ 1600 \Phi(2) - \frac{800}{\sqrt{2\pi}}\big(1- \e^{-2}\big)\,, \] where \(\displaystyle \Phi(x) =\dfrac1{\sqrt{2\pi}} \int_{-\infty}^x \e^{-\frac12t^2}\d t\,\). You should not consider continuity corrections.


Solution: The number of people directed towards the buses (on each cruise) is \(X \sim B(1024, \tfrac12) \approx N(512, 256)\), so we may write \(X \approx 16Z + 512\) where \(Z \sim N(0,1)\) has density \(p_Z\). Without an extra bus, the expected profit per cruise is \(\mathbb{E}[\min(X, 15 \times 32)]\). With the extra bus, the expected profit per cruise is \(\mathbb{E}[\min(X, 16 \times 32)]\). Therefore the expected extra profit is \(\mathbb{E}[\min(X, 16 \times 32)]-\mathbb{E}[\min(X, 15 \times 32)] = \mathbb{E}[\min(X, 16 \times 32)-\min(X, 15 \times 32)]\), and \begin{align*} \text{Expected extra profit} &= \mathbb{E}[\min(X, 16 \times 32)-\min(X, 15 \times 32)] \\ &= \mathbb{E}[\min(16Z+512, 16 \times 32)-\min(16Z+512, 15 \times 32)] \\ &= 16\mathbb{E}[\min(Z+32, 32)-\min(Z+32, 30)] \\ &=16\int_{-\infty}^{\infty} \left (\min(z+32, 32)-\min(z+32, 30) \right)p_Z(z) \d z \\ &= 16 \left ( \int_{-2}^{0} (z+32-30) p_Z(z) \d z + \int_0^\infty (32-30)p_Z(z) \d z \right) \\ &= 16 \left ( \int_{-2}^{0} (z+2) p_Z(z) \d z + \int_0^\infty 2p_Z(z) \d z \right) \\ &= 16 \left ( \int_{-2}^{0} zp_Z(z) \d z + 2\int_{-2}^\infty p_Z(z) \d z \right) \\ &= 16 \left ( \int_{-2}^{0} z \frac{1}{\sqrt{2\pi}} e^{-\frac12 z^2} \d z + 2\Phi(2) \right) \\ &= 32\Phi(2) + \frac{16}{\sqrt{2\pi}} \left [ -e^{-\frac12z^2} \right]_{-2}^0 \\ &= 32\Phi(2) - \frac{16}{\sqrt{2\pi}} \left ( 1-e^{-2}\right) \end{align*} where we used \(\int_{-2}^\infty p_Z(z) \d z = 1 - \Phi(-2) = \Phi(2)\). Across the \(50\) cruises in a year, the expected extra profit, and hence the largest licence fee worth considering, is \[ 1600\Phi(2) - \frac{800}{\sqrt{2\pi}} \left ( 1-e^{-2}\right) \]
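As a sanity check on the normal approximation, the expected extra profit can also be computed directly from the binomial distribution. The following is a sketch under the same model as the question (the helper names are made up for illustration); it compares the exact per-cruise value with \(32\Phi(2) - \frac{16}{\sqrt{2\pi}}(1-\e^{-2})\).

```python
from math import comb, erf, exp, pi, sqrt

N_TOURISTS = 1024
SEATS_15, SEATS_16 = 15 * 32, 16 * 32


def binom_pmf(x):
    """P(X = x) for X ~ B(1024, 1/2)."""
    return comb(N_TOURISTS, x) / 2 ** N_TOURISTS


def phi_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(x / sqrt(2)))


# Exact expected extra passengers carried per cruise with a 16th bus
exact_per_cruise = sum(
    (min(x, SEATS_16) - min(x, SEATS_15)) * binom_pmf(x)
    for x in range(N_TOURISTS + 1)
)

# Normal approximation without continuity correction, as in the question
approx_per_cruise = 32 * phi_cdf(2) - 16 / sqrt(2 * pi) * (1 - exp(-2))

print(50 * exact_per_cruise, 50 * approx_per_cruise)
```

The annual figure from the approximation is a little under \(\pounds\)1290, and the exact binomial value should agree closely.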

2005 Paper 2 Q13
D: 1600.0 B: 1500.0

The number of printing errors on any page of a large book of \(N\) pages is modelled by a Poisson variate with parameter \(\lambda\) and is statistically independent of the number of printing errors on any other page. The number of pages in a random sample of \(n\) pages (where \(n\) is much smaller than \(N\) and \(n\ge2\)) which contain fewer than two errors is denoted by \(Y\). Show that \(\P(Y=k) = \binom n k p^kq^{n-k}\) where \(p=(1+\lambda)e^{-\lambda}\) and \(q=1-p\,\). Show also that, if \(\lambda\) is sufficiently small,

  1. \(q\approx \frac12 \lambda^2\,\);
  2. the largest value of \(n\) for which \(\P(Y=n)\ge 1-\lambda\) is approximately \(2/\lambda\,\);
  3. \(\P(Y>1 \;\vert\; Y>0) \approx 1-n(\lambda^2/2)^{n-1}\;.\)


Solution: First notice that the probability a page contains fewer than two errors is \(\mathbb{P}(X < 2)\) where \(X \sim Po(\lambda)\), i.e. \(\mathbb{P}(X<2) = e^{-\lambda} + \lambda e^{-\lambda} = (1+\lambda)e^{-\lambda} = p\). Since \(n\) is much smaller than \(N\), the sampled pages can be treated as independent trials, so the number of pages \(Y\) with fewer than two errors out of our sample of \(n\) is \(Bin(n, p)\), i.e. \(\mathbb{P}(Y = k) = \binom{n}{k} p^kq^{n-k}\).

  1. \(\,\) \begin{align*} && q &= 1- p = 1-(1+\lambda)e^{-\lambda} \\ &&&= 1 - (1+ \lambda)(1 - \lambda + \tfrac12 \lambda^2 + O(\lambda^3)) \\ &&&= 1 - 1+ \lambda - \lambda+\lambda^2 - \tfrac12 \lambda^2 + O(\lambda^3) \\ &&&= \tfrac12 \lambda^2 + O(\lambda^3) \end{align*}
  2. \(\,\) \begin{align*} && \mathbb{P}(Y = n) &= p^n \\ &&&= (1+\lambda)^ne^{-\lambda n} \\ &&&= (1 + n \lambda + \frac{n(n-1)}{2} \lambda^2 + \cdots)(1 - \lambda n + \frac{\lambda^2 n^2}{2} + \cdots) \\ &&&= 1 + 0 \lambda + \left ( \frac{n(n-1)}{2} + \frac{n^2}{2} - n^2 \right) \lambda^2 + O(\lambda^3) \\ &&&= 1 - \frac{n}{2} \lambda^2 + O(\lambda^3) \end{align*} So, to this order, \(\mathbb{P}(Y = n) \geq 1- \lambda\) when \(\frac{n}{2} \lambda^2 \leq \lambda\), i.e. when \(n \leq \frac{2}{\lambda}\). The largest such \(n\) is therefore approximately \(2/\lambda\).
  3. \(\,\) \begin{align*} && \mathbb{P}(Y > 1 | Y > 0) &= \frac{1-(q^n + npq^{n-1})}{1-q^n} \\ &&&= 1 - \frac{npq^{n-1}}{1-q^n} \\ &&&= 1 -n \frac{(1+ \lambda)e^{-\lambda} (\tfrac12 \lambda^2 + O(\lambda^3))^{n-1}}{1-(\tfrac12 \lambda^2 + O(\lambda^3))^n} \\ &&&= 1 - n \left (\frac{\lambda^2}{2} \right)^{n-1} \frac{(1+ \lambda)(1-\lambda + \lambda^2/2 - \cdots)(1+O(\lambda))^{n-1}}{1-(\tfrac12 \lambda^2 + O(\lambda^3))^n} \\ &&&= 1 - n \left (\frac{\lambda^2}{2} \right)^{n-1} (1 + O(\lambda)) \\ &&&\approx 1 - n \left (\frac{\lambda^2}{2} \right)^{n-1} \end{align*}
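The small-\(\lambda\) claims in parts 1 and 2 are easy to probe numerically. This is a rough sketch (the function names are illustrative): it prints the ratio \(q / (\tfrac12\lambda^2)\) and the exact largest \(n\) with \(p^n \ge 1-\lambda\) alongside \(2/\lambda\) for a few values of \(\lambda\).

```python
from math import exp, floor, log


def p_fewer_than_two(lam):
    """P(a page has fewer than two errors) = (1 + lam) * exp(-lam)."""
    return (1 + lam) * exp(-lam)


def largest_n(lam):
    """Largest n with p**n >= 1 - lam, found by taking logarithms
    (both logarithms are negative, so the inequality flips on division)."""
    return floor(log(1 - lam) / log(p_fewer_than_two(lam)))


for lam in (0.1, 0.01, 0.001):
    q = 1 - p_fewer_than_two(lam)
    print(lam, q / (lam ** 2 / 2), largest_n(lam), 2 / lam)
```

As \(\lambda\) shrinks, the ratio should approach \(1\) and the exact cutoff should sit within a few percent of \(2/\lambda\).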

2003 Paper 1 Q13
D: 1484.0 B: 1518.1

If a football match ends in a draw, there may be a "penalty shoot-out". Initially the teams each take 5 shots at goal. If one team scores more times than the other, then that team wins. If the scores are level, the teams take shots alternately until one team scores and the other team does not score, both teams having taken the same number of shots. The team that scores wins. Two teams, Team A and Team B, take part in a penalty shoot-out. Their probabilities of scoring when they take a single shot are \(p_A\) and \(p_B\) respectively. Explain why the probability \(\alpha\) of neither side having won at the end of the initial \(10\)-shot period is given by $$\alpha =\sum_{i=0}^5\binom{5}{i}^2(1-p_A)^i(1-p_B)^i\,p_A^{5-i}p_B^{5-i}.$$ Show that the expected number of shots taken is \(\displaystyle 10+ \frac{2\alpha}\beta\;,\) where \(\beta=p_A+p_B-2p_Ap_B\,.\)


Solution: Note that in the first \(10\)-shot period the number of goals scored by team \(i\) is \(B(5, p_i)\). For neither side to have won, both teams must have scored the same number of goals, i.e. \begin{align*} && \alpha &= \sum_{i=0}^5 \mathbb{P}(\text{both teams score }5-i) \\ &&&= \sum_{i=0}^5 \binom{5}{i} (1-p_A)^ip_A^{5-i} \binom{5}{i} (1-p_B)^i p_B^{5-i} \\ &&&= \sum_{i=0}^5 \binom{5}{i} ^2(1-p_A)^i (1-p_B)^i p_A^{5-i} p_B^{5-i} \end{align*} Suppose we reach the end of the initial \(10\) shots with the scores tied. Each subsequent round (one shot per team) is decisive with probability \(p_A(1-p_B) + p_B(1-p_A)\) (the probability that exactly one of \(A\) and \(B\) scores), which is \(p_A + p_B - 2p_Ap_B = \beta\). Therefore the number of additional rounds is geometric with parameter \(\beta\), and the expected number of rounds is \(\frac{1}{\beta}\). Each round has two shots, and there is a probability \(\alpha\) of the shoot-out reaching this stage, so the expected number of additional shots is \(\frac{2\alpha}{\beta}\). Adding the \(10\) guaranteed shots gives the desired result.
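The two ingredients of this argument can be verified separately. In the sketch below (function names and the trial values \(p_A = 0.7\), \(p_B = 0.6\) are arbitrary choices for illustration), \(\alpha\) is recomputed by brute force over all \(2^{10}\) outcomes of the initial shots, and the mean number of sudden-death rounds is summed directly from the geometric distribution rather than quoted as \(1/\beta\).

```python
from itertools import product
from math import comb


def alpha_formula(p_a, p_b):
    """alpha as given in the question."""
    return sum(
        comb(5, i) ** 2 * ((1 - p_a) * (1 - p_b)) ** i * (p_a * p_b) ** (5 - i)
        for i in range(6)
    )


def alpha_bruteforce(p_a, p_b):
    """Add up the probability of every outcome of the first ten shots
    (five per team) that leaves the scores level."""
    total = 0.0
    for shots in product((0, 1), repeat=10):
        a, b = shots[:5], shots[5:]
        if sum(a) == sum(b):
            prob = 1.0
            for s in a:
                prob *= p_a if s else 1 - p_a
            for s in b:
                prob *= p_b if s else 1 - p_b
            total += prob
    return total


def expected_shots(p_a, p_b, max_rounds=2000):
    """10 + 2 * alpha * E[rounds], with E[rounds] summed term by term."""
    beta = p_a + p_b - 2 * p_a * p_b
    rounds = sum(n * beta * (1 - beta) ** (n - 1) for n in range(1, max_rounds + 1))
    return 10 + 2 * alpha_formula(p_a, p_b) * rounds


p_a, p_b = 0.7, 0.6
beta = p_a + p_b - 2 * p_a * p_b
print(alpha_formula(p_a, p_b), alpha_bruteforce(p_a, p_b))
print(expected_shots(p_a, p_b), 10 + 2 * alpha_formula(p_a, p_b) / beta)
```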

2003 Paper 2 Q12
D: 1600.0 B: 1484.0

The life of a certain species of elementary particles can be described as follows. Each particle has a life time of \(T\) seconds, after which it disintegrates into \(X\) particles of the same species, where \(X\) is a random variable with binomial distribution \(\mathrm{B}(2,p)\,\). A population of these particles starts with the creation of a single such particle at \(t=0\,\). Let \(X_n\) be the number of particles in existence in the time interval \(nT < t < (n+1)T\,\), where \(n=1\,\), \(2\,\), \(\ldots\). Show that \(\P(X_1=2 \mbox { and } X_2=2) = 6p^4q^2\;\), where \(q=1-p\,\). Find the possible values of \(p\) if it is known that \(\P(X_1=2 \vert X_2=2) =9/25\,\). Explain briefly why \(\E(X_n) =2p\E(X_{n-1})\) and hence determine \(\E(X_n)\) in terms of \(p\). Show that for one of the values of \(p\) found above \(\lim_{n \to \infty}\E(X_n) = 0\) and that for the other \(\lim_{n \to \infty}\E(X_n) = + \infty\,\).


Solution: Notice that, given \(X_{n-1}\), the total number generated is \(X_n \sim B(2X_{n-1},p)\), since a binomial is a sum of independent Bernoullis, and there are two Bernoullis per particle. \begin{align*} && \mathbb{P}(X_1=2 \mbox { and } X_2=2) &= \underbrace{p^2}_{\text{two generated in first iteration}} \cdot \underbrace{\binom{4}{2}p^2q^2}_{\text{two generated from the first two}} \\ &&&= 6p^4q^2 \end{align*} The event \(X_2 = 2\) can also arise from \(X_1 = 1\), with probability \(2pq \cdot p^2\), so \begin{align*} && \mathbb{P}(X_1 = 2 |X_2 = 2) &= \frac{ \mathbb{P}(X_1=2 \mbox { and } X_2=2) }{ \mathbb{P}( X_2=2) } \\ &&&= \frac{6p^4q^2}{6p^4q^2+2pq \cdot p^2} \\ &&&= \frac{3pq}{3pq+1} \\ \Rightarrow && \frac{9}{25} &= \frac{3pq}{3pq+1} \\ \Rightarrow && 27pq + 9 &= 75pq \\ \Rightarrow && 9 &= 48pq \\ \Rightarrow && pq &= \frac{3}{16} \\ \Rightarrow && 0 &= p^2 - p + \frac3{16} \\ \Rightarrow && p &= \frac14, \frac34 \end{align*} By the same reasoning about the Bernoullis, we must have \(\E[X_n] = \E[\E[X_n | X_{n-1}]] = \E[2pX_{n-1}] = 2p \E[X_{n-1}]\), and therefore \(\E[X_n] = (2p)^n\). If \(p = \frac14\) then \(\E[X_n] = \frac1{2^n} \to 0\). If \(p = \frac34\) then \(\E[X_n] = \left(\frac32 \right)^n \to \infty\).
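The conditional probability can be confirmed with exact arithmetic. The following short sketch (helper names are illustrative) evaluates \(\mathbb{P}(X_1 = 2 \mid X_2 = 2)\) from the binomial model for both roots and lists the first few values of \((2p)^n\).

```python
from fractions import Fraction
from math import comb


def binom_pmf(n, j, p):
    """P(B(n, p) = j)."""
    return comb(n, j) * p ** j * (1 - p) ** (n - j)


def p_x1_2_given_x2_2(p):
    """P(X_1 = 2 | X_2 = 2), using X_1 ~ B(2, p) and X_2 | X_1 = x ~ B(2x, p)."""
    joint = {x: binom_pmf(2, x, p) * binom_pmf(2 * x, 2, p) for x in (1, 2)}
    return joint[2] / (joint[1] + joint[2])


for p in (Fraction(1, 4), Fraction(3, 4)):
    means = [(2 * p) ** n for n in range(1, 6)]
    print(p, p_x1_2_given_x2_2(p), means)
```

Only \(X_1 \in \{1, 2\}\) needs to be considered, since \(X_1 = 0\) forces \(X_2 = 0\).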

2002 Paper 2 Q12
D: 1600.0 B: 1500.6

On \(K\) consecutive days each of \(L\) identical coins is thrown \(M\) times. For each coin, the probability of throwing a head in any one throw is \(p\) (where \(0 < p < 1\)). Show that the probability that on exactly \(k\) of these days more than \(l\) of the coins will each produce fewer than \(m\) heads can be approximated by \[ {K \choose k}q^k(1-q)^{K-k}, \] where \[ q=\Phi\left( \frac{2h-2l-1}{2\sqrt{h} }\right), \ \ \ \ \ \ h=L\Phi\left( \frac{2m-1-2Mp}{2\sqrt{ Mp(1-p)}}\right) \] and \(\Phi(\cdot)\) is the cumulative distribution function of a standard normal variate. Would you expect this approximation to be accurate in the case \(K=7\), \(k=2\), \(L=500\), \(l=4\), \(M=100\), \(m=48\) and \(p=0.6\;\)?


Solution: Let \(H_i\) be the number of heads the \(i\)th coin throws on a given day. Then \(H_i \sim B(M,p)\), and the probability that a given coin produces fewer than \(m\) heads is \(p_h = \P(H_i < m)\). Let \(C\) be the number of coins producing fewer than \(m\) heads; then \(C \sim B(L, p_h)\). The probability that more than \(l\) of the coins produce fewer than \(m\) heads is therefore \(\P(C > l)\). Finally, the probability that on exactly \(k\) days more than \(l\) of the coins will produce fewer than \(m\) heads is: \[ \binom{K}{k} \cdot \P(C > l)^k \cdot (1-\P(C > l))^{K-k} \] Let's start by assuming that all our binomials can be approximated by a normal distribution. \(B(M,p) \approx N(Mp, Mp(1-p))\) and so: \begin{align*} p_h &= \P(H_i < m) \\ &\approx \P( \sqrt{Mp(1-p)}Z+Mp < m-\frac12) \\ &= \P \l Z < \frac{2m-2Mp-1}{2\sqrt{Mp(1-p)}} \r \\ &= \Phi\l\frac{2m-2Mp-1}{2\sqrt{Mp(1-p)}} \r \end{align*} \(B(L, p_h) \approx B \l L, \P \l Z < \frac{2m-2Mp-1}{2\sqrt{Mp(1-p)}} \r\r = B(L, \frac{h}{L}) \approx N(h, \frac{h(L-h)}{L})\). Therefore \begin{align*} \P(C > l) &= 1-\P(C \leq l) \\ &\approx 1- \P \l \sqrt{\frac{h(L-h)}{L}} Z + h \leq l+\frac12 \r \\ &= 1 - \P \l Z \leq \frac{2l-2h+1}{2\sqrt{\frac{h(L-h)}{L}}}\r \\ &= 1- \Phi\l \frac{2l-2h+1}{2\sqrt{\frac{h(L-h)}{L}}} \r \\ &= \Phi\l \frac{2h-2l-1}{2\sqrt{\frac{h(L-h)}{L}}} \r \end{align*} If we can approximate \(\sqrt{1-\frac{h}{L}}\) by \(1\) then we obtain the approximation in the question. Alternatively, \(B(L, \frac{h}{L}) \approx Po(h)\) and \(Po(h) \approx N(h,h)\) so we obtain: \begin{align*} \P(C > l) &= 1-\P(C \leq l) \\ &\approx 1 - \P(\sqrt{h} Z +h < l + \frac12) \\ &= 1 - \P \l Z < \frac{2l-2h+1}{2\sqrt{h}} \r \\ &= \Phi \l \frac{2h - 2l -1}{2\sqrt{h}}\r \end{align*} as required. [I think this is what the examiners expected.]
Considering the case \(K=7\), \(k=2\), \(L=500\), \(l=4\), \(M=100\), \(m=48\) and \(p=0.6\): the first normal approximation depends on \(Mp\) and \(M(1-p)\) being large. They are \(60\) and \(40\) respectively, so this is likely a good approximation in absolute terms. However, since \(m = 48\) is far from the mean (in a normalised sense), we might expect the percentage error to be large. The first approximation finds that \begin{align*} h &= 500 \cdot \Phi \l \frac{2 \cdot 48 - 2 \cdot 60 - 1}{2\sqrt{24}} \r \\ &= 500 \cdot \Phi \l \frac{-25}{2 \sqrt{24}} \r \\ &\approx 500 \cdot \Phi (-2.55) \\ &\approx 500 \cdot 0.0054 \\ &\approx 2.7 \end{align*} The normal approximation to the second binomial will be good if \(500 \cdot \frac{2.7}{500} = 2.7\) is large, but this is quite small. Therefore, we shouldn't expect this to be a good approximation. [Alternatively, using what I expect was the desired approach:] The approximation \(B(L, \frac{h}{L}) \approx Po(h)\) is acceptable since \(L>50\) and \(h < 5\). The approximation \(Po(h) \approx N(h,h)\) is not acceptable since \(h\) is small (in particular \(h < 15\)). Finally, we can compute all these values exactly using a modern calculator. \begin{array}{l|cc} & \text{correct} & \text{approx} \\ \hline p_h & 0.005760\ldots & 0.005362\ldots \\ \P(C > l) & 0.164522\ldots & 0.133319\ldots \\ \text{ans} & 0.231389\ldots & 0.182516\ldots \end{array} We can also see how the errors propagate, by doing the calculations assuming the previous steps are correct, and also including the Poisson step.
\begin{array}{lccc} & \text{correct} & \text{approx} & \text{using approx } p_h \\ \hline p_h & 0.005760\ldots & 0.005362\ldots & - \\ \P(C > l)\quad [Po(h)] & 0.164522\ldots & 0.165044\ldots & 0.134293\ldots \\ \P(C > l)\quad [N(h,h)] & 0.164522\ldots & 0.169953\ldots & 0.133319\ldots \\ \P(C > l)\quad [N(h,h(1-\frac{h}{L}))] & 0.164522\ldots & 0.169255\ldots & 0.132677\ldots \\ \text{ans} & 0.231389\ldots & 0.231389\ldots \end{array} By doing this, we discover that the largest error comes not from the second approximation but from the small absolute (but large relative) error in the first approximation. The accuracy of the second step is, in fact, a coincidence; we can observe this by investigating the specific values being used. The first approximation looks as follows:

TikZ diagram
You might not be able to tell, but there are actually two plots on this chart. However, let's zoom in on the area we are worried about:
TikZ diagram
We can see there are small differences, which could be large in percentage terms. (As we found when we computed them directly).
TikZ diagram
First, we can immediately see that if we just look at the distribution of \(B(L, p_h)\) and \(B(L, p_{h_\text{approx}})\) we get quite different results, even before we do any approximations.
TikZ diagram
If we plot the probability distribution of \(B(L, p_h)\) vs \(N(Lp_h, Lp_h(1-p_h))\) we find that it is not a great approximation.
TikZ diagram
However, the CDF happens to be a very good approximation *just* for the value we care about. Very lucky, but not possible for someone sitting STEP to know at the time!
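The exact and approximate values compared above can be recomputed with a few lines of standard-library Python. This is only a sketch (the helper names `binom_cdf` and `phi_cdf` are made up for illustration); it evaluates the exact chain of binomials and the approximation stated in the question for the given parameter values.

```python
from math import comb, erf, sqrt

K, k, L, l, M, m, p = 7, 2, 500, 4, 100, 48, 0.6


def binom_cdf(n, prob, x):
    """P(B(n, prob) <= x)."""
    return sum(comb(n, j) * prob ** j * (1 - prob) ** (n - j) for j in range(x + 1))


def phi_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(x / sqrt(2)))


# Exact chain: coin -> number of low-scoring coins -> number of such days
p_h = binom_cdf(M, p, m - 1)
p_day = 1 - binom_cdf(L, p_h, l)
exact = comb(K, k) * p_day ** k * (1 - p_day) ** (K - k)

# Approximation in the form given in the question
h = L * phi_cdf((2 * m - 1 - 2 * M * p) / (2 * sqrt(M * p * (1 - p))))
q = phi_cdf((2 * h - 2 * l - 1) / (2 * sqrt(h)))
approx = comb(K, k) * q ** k * (1 - q) ** (K - k)

print(p_h, h / L)
print(p_day, q)
print(exact, approx)
```

Printing the intermediate quantities side by side makes it easy to see at which step the two calculations drift apart.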

2002 Paper 3 Q14
D: 1700.0 B: 1500.0

Prove that, for any two discrete random variables \(X\) and \(Y\), \[ \mathrm{Var} \left(X + Y \right) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2 \, \mathrm{Cov}(X,Y), \] where \(\mathrm{Var}(X)\) is the variance of \(X\) and \(\mathrm{Cov}(X,Y)\) is the covariance of \(X\) and \(Y\). When a Grandmaster plays a sequence of \(m\) games of chess, she is, independently, equally likely to win, lose or draw each game. If the values of the random variables \(W\), \(L\) and \(D\) are the numbers of her wins, losses and draws respectively, justify briefly the following claims:

  1. \(W + L + D\) has variance \(0\,\);
  2. \(W + L\) has a binomial distribution.
Find the value of \(\displaystyle {\mathrm{Cov}(W,L) \over \sqrt{\mathrm{Var}(W) \mathrm{Var}(L)}}\;\).


Solution: \begin{align*} && \var[X+Y] &= \E\left [(X+Y-\E[X+Y])^2 \right] \\ &&&= \E \left [ (X - \E[X] + Y - \E[Y])^2 \right] \\ &&&= \E \left [(X - \E[X])^2 + (Y-\E[Y])^2 + 2(X-\E[X])(Y-\E[Y]) \right] \\ &&&= \E \left [(X - \E[X])^2 \right]+\E \left [(Y-\E[Y])^2 \right]+\E \left [2(X-\E[X])(Y-\E[Y]) \right] \\ &&&= \var[X] + \var[Y] + 2 \mathrm{Cov}(X,Y) \end{align*}

  1. \(W+L+D = m\) where \(m\) is the number of games, which has variance \(0\). Therefore \(W+L+D\) has variance \(0\).
  2. The probability of a decisive game is \(\frac23\) and \(W+L\) is the number of decisive games. Each game is independent so this meets the criteria for a binomial distribution.
Notice \(W+L \sim B(m, \tfrac23)\) and \(W, L, D \sim B(m, \tfrac13)\); in particular \(\var[W+L] = m \cdot \tfrac23 \cdot \tfrac13 = \tfrac29m\) and \(\var[W] = \var[L] = \var[D] = m \cdot \tfrac13 \cdot \tfrac23 = \tfrac29m\). Hence \begin{align*} && \var[W+L] &= \var[W] + \var[L] + 2\mathrm{Cov}(W,L) \\ \Rightarrow && \mathrm{Cov}(W,L) &= -\tfrac19m \\ \Rightarrow && \frac{\mathrm{Cov}(W,L) }{\sqrt{\var[W]\var[L]}} &= -\frac12 \end{align*}
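For a small number of games the correlation can be confirmed by brute force. The sketch below (the function name and the choice \(m = 5\) are illustrative) enumerates all \(3^m\) equally likely result sequences and computes the variances and covariance exactly.

```python
from fractions import Fraction
from itertools import product


def win_loss_moments(m):
    """Exact Var(W), Var(L) and Cov(W, L) over all 3**m equally likely
    sequences of wins (W), losses (L) and draws (D)."""
    outcomes = list(product("WLD", repeat=m))
    n = len(outcomes)
    wins = [o.count("W") for o in outcomes]
    losses = [o.count("L") for o in outcomes]
    mean_w, mean_l = Fraction(sum(wins), n), Fraction(sum(losses), n)
    var_w = Fraction(sum(w * w for w in wins), n) - mean_w ** 2
    var_l = Fraction(sum(x * x for x in losses), n) - mean_l ** 2
    cov = Fraction(sum(w * x for w, x in zip(wins, losses)), n) - mean_w * mean_l
    return var_w, var_l, cov


var_w, var_l, cov = win_loss_moments(5)
# Var(W) equals Var(L) by symmetry, so sqrt(Var(W) Var(L)) is just Var(W)
correlation = cov / var_w
print(var_w, cov, correlation)
```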

2001 Paper 1 Q13
D: 1500.0 B: 1500.0

Four students, one of whom is a mathematician, take turns at washing up over a long period of time. The number of plates broken by any student in this time obeys a Poisson distribution, the probability of any given student breaking \(n\) plates being \(\e^{-\lambda} \lambda^n/n!\) for some fixed constant \(\lambda\), independent of the number of breakages by other students. Given that five plates are broken, find the probability that three or more were broken by the mathematician.


Solution: Let \(X\) be the number of plates broken by the mathematician and \(Y\) the number broken by the other three students. Then \(X \sim Po(\lambda), Y \sim Po(3\lambda)\) and \(X+Y \sim Po(4\lambda)\). \begin{align*} && \mathbb{P}(X = k | X+Y = n) &= \frac{\mathbb{P}(X = k, Y = n-k)}{\mathbb{P}(X+Y=n)} \\ &&&= \frac{e^{-\lambda} \lambda^k/k! \cdot e^{-3\lambda} (3\lambda)^{n-k}/(n-k)!}{e^{-4\lambda}(4\lambda)^n/n!} \\ &&&= \binom{n}{k} \left ( \frac{1}{4} \right)^k \left ( \frac{3}{4} \right)^{n-k} \end{align*} Therefore \(X | X+Y = n \sim Binomial(n, \tfrac14)\), and with \(n = 5\): \begin{align*} \mathbb{P}(X \geq 3 | X + Y = 5) &= \binom{5}{3} \frac{3^2}{4^5} + \binom{5}{4} \frac{3}{4^5} + \binom{5}{5} \frac{1}{4^5} \\ &= \frac{1}{4^5} \left ( 90+ 15 + 1 \right) \\ &= \frac{106}{4^5} = \frac{53}{512} \approx \frac1{10} \end{align*}
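A quick check of the final number, as a sketch with illustrative names: the binomial sum is evaluated exactly, and the same conditional probability is recomputed directly from Poisson probabilities for an arbitrary value of \(\lambda\), to confirm that \(\lambda\) cancels.

```python
from fractions import Fraction
from math import comb, exp, factorial


def p_at_least(threshold, total=5, share=Fraction(1, 4)):
    """P(B(total, share) >= threshold)."""
    return sum(
        comb(total, j) * share ** j * (1 - share) ** (total - j)
        for j in range(threshold, total + 1)
    )


def poisson_pmf(mu, j):
    return exp(-mu) * mu ** j / factorial(j)


answer = p_at_least(3)

lam = 0.7  # arbitrary illustrative value; the result should not depend on it
direct = sum(
    poisson_pmf(lam, j) * poisson_pmf(3 * lam, 5 - j) for j in range(3, 6)
) / poisson_pmf(4 * lam, 5)

print(answer, float(answer), direct)
```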

2001 Paper 2 Q12
D: 1600.0 B: 1484.0

The national lottery of Ruritania is based on the positive integers from \(1\) to \(N\), where \(N\) is very large and fixed. Tickets cost \(\pounds1\) each. For each ticket purchased, the punter (i.e. the purchaser) chooses a number from \(1\) to \(N\). The winning number is chosen at random, and the jackpot is shared equally amongst those punters who chose the winning number. A syndicate decides to buy \(N\) tickets, choosing every number once to be sure of winning a share of the jackpot. The total number of tickets purchased in this draw is \(3.8N\) and the jackpot is \(\pounds W\). Assuming that the non-syndicate punters choose their numbers independently and at random, find the most probable number of winning tickets and show that the expected net loss of the syndicate is approximately \[ N\; - \; \frac{5 \big(1- e^{-2.8}\big)}{14} \;W\;. \]

2000 Paper 1 Q13
D: 1484.0 B: 1484.7

Every person carries two genes which can each be either of type \(A\) or of type \(B\). It is known that \(81\%\) of the population are \(AA\) (i.e. both genes are of type \(A\)), \(18\%\) are \(AB\) (i.e. there is one gene of type \(A\) and one of type \(B\)) and \(1\%\) are \(BB\). A child inherits one gene from each of its parents. If one parent is \(AA\), the child inherits a gene of type \(A\) from that parent; if the parent is \(BB\), the child inherits a gene of type \(B\) from that parent; if the parent is \(AB\), the inherited gene is equally likely to be \(A\) or \(B\).

  1. Given that two \(AB\) parents have four children, show that the probability that two of them are \(AA\) and two of them are \(BB\) is \(3/128\).
  2. My mother is \(AB\) and I am \(AA\). Find the probability that my father is \(AB\).

2000 Paper 3 Q12
D: 1700.0 B: 1553.7

In a lottery, any one of \(N\) numbers, where \(N\) is large, is chosen at random and independently for each player by machine. Each week there are \(2N\) players and one winning number is drawn. Write down an exact expression for the probability that there are three or fewer winners in a week, given that you hold a winning ticket that week. Using the fact that $$ {\biggl( 1 - {a \over n} \biggr) ^n \approx \e^{-a}}$$ for \(n\) much larger than \(a\), or otherwise, show that this probability is approximately \({2 \over 3}\) . Discuss briefly whether this probability would increase or decrease if the numbers were chosen by the players. Show that the expected number of winners in a week, given that you hold a winning ticket that week, is \( 3-N^{-1}\).

1999 Paper 1 Q13
D: 1500.0 B: 1484.0

Bar magnets are placed randomly end-to-end in a straight line. If adjacent magnets have ends of opposite polarities facing each other, they join together to form a single unit. If they have ends of the same polarity facing each other, they stand apart. Find the expectation and variance of the number of separate units in terms of the total number \(N\) of magnets.


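Solution: There are \(N-1\) junctions between adjacent magnets, and each is independently a gap (the magnets stand apart) or not (they join), with probability \(\frac12\) each. Therefore the total number of gaps is \(X \sim Binomial(N-1, \frac12)\) and \begin{align*} \mathbb{E}(X) &= \frac{N-1}{2} \\ \textrm{Var}(X) &= \frac{N-1}{4} \end{align*} The number of separate units is \(X+1\), so it has expectation \(\frac{N-1}{2} + 1 = \frac{N+1}{2}\) and variance \(\frac{N-1}{4}\).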
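For small \(N\) the result can be checked by listing every arrangement. This sketch assumes each magnet is independently placed in one of two orientations with equal probability (encoded here as 0 or 1, a hypothetical encoding): neighbours pointing the same way present unlike poles and join, while neighbours pointing opposite ways stand apart.

```python
from fractions import Fraction
from itertools import product


def unit_count_moments(n_magnets):
    """Exact mean and variance of the number of separate units,
    over all 2**n_magnets equally likely orientation patterns."""
    counts = []
    for orientation in product((0, 1), repeat=n_magnets):
        # a gap appears wherever two neighbours have different orientations
        gaps = sum(a != b for a, b in zip(orientation, orientation[1:]))
        counts.append(gaps + 1)
    n = len(counts)
    mean = Fraction(sum(counts), n)
    var = Fraction(sum(c * c for c in counts), n) - mean ** 2
    return mean, var


for n_magnets in (1, 2, 6):
    print(n_magnets, unit_count_moments(n_magnets))
```

The values should match \(\frac{N+1}{2}\) and \(\frac{N-1}{4}\) for each \(N\) tried.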

1999 Paper 3 Q13
D: 1700.0 B: 1484.0

The cakes in our canteen each contain exactly four currants, each currant being randomly placed in the cake. I take a proportion \(X\) of a cake where \(X\) is a random variable with density function \[{\mathrm f}(x)=Ax\] for \(0\leqslant x\leqslant 1\) where \(A\) is a constant.

  1. What is the expected number of currants in my portion?
  2. If I find all four currants in my portion, what is the probability that I took more than half the cake?