Problems

12 problems found

2022 Paper 2 Q11
D: 1500.0 B: 1500.0

A batch of \(N\) USB sticks is to be used on a network. Each stick has the same unknown probability \(p\) of being infected with a virus. Each stick is infected, or not, independently of the others. The network manager decides on an integer value of \(T\) with \(0 \leqslant T < N\). If \(T = 0\) no testing takes place and the \(N\) sticks are used on the network, but if \(T > 0\), the batch is subject to the following procedure.

  • Each of \(T\) sticks, chosen at random from the batch, undergoes a test during which it is destroyed.
  • If any of these \(T\) sticks is infected, all the remaining \(N - T\) sticks are destroyed.
  • If none of the \(T\) sticks is infected, the remaining \(N - T\) sticks are used on the network.
If any stick used on the network is infected, the network has to be disinfected at a cost of \(\pounds D\), where \(D > 0\). If no stick used on the network is infected, there is a gain of \(\pounds 1\) for each of the \(N - T\) sticks. There is no cost to testing or destroying a stick.
  1. Find an expression in terms of \(N\), \(T\), \(D\) and \(q\), where \(q = 1 - p\), for the expected net loss.
  2. Let \(\alpha = \dfrac{DT}{N(N - T + D)}\). Show that \(0 \leqslant \alpha < 1\). Show that, for fixed values of \(N\), \(D\) and \(T\), the greatest value of the expected net loss occurs when \(q\) satisfies the equation \(q^{N-T} = \alpha\). Show further that this greatest value is \(\pounds\dfrac{D(N-T)\,\alpha^k}{N}\), where \(k = \dfrac{T}{N-T}\).
  3. For fixed values of \(N\) and \(D\), show that there is some \(\beta > 0\) so that for all \(p < \beta\), the expression for the expected loss found in part (i) is an increasing function of \(T\). Deduce that, for small enough values of \(p\), testing no sticks minimises the expected net loss.

2008 Paper 3 Q12
D: 1700.0 B: 1516.0

Let \(X\) be a random variable with a Laplace distribution, so that its probability density function is given by \[ \f(x) = \frac12 \e^{-\vert x \vert }\;, \text{ \(-\infty < x < \infty \)}. \tag{\(*\)} \] Sketch \(\f(x)\). Show that its moment generating function \({\rm M}_X(\theta)\) is given by \({\rm M}_X(\theta)= (1-\theta^2)^{-1}\) and hence find the variance of \(X\). A frog is jumping up and down, attempting to land on the same spot each time. In fact, in each of \(n\) successive jumps he always lands on a fixed straight line but when he lands from the \(i\)th jump (\(i=1\,,2\,,\ldots\,,n\)) his displacement from the point from which he jumped is \(X_i\,\)cm, where \(X_i\) has the distribution \((*)\). His displacement from his starting point after \(n\) jumps is \(Y\,\)cm (so that \(Y=\sum\limits_{i=1}^n X_i\)). Each jump is independent of the others. Obtain the moment generating function for \(Y/ \sqrt {2n}\) and, by considering its logarithm, show that this moment generating function tends to \(\exp(\frac12\theta^2)\) as \(n\to\infty\). Given that \(\exp(\frac12\theta^2)\) is the moment generating function of the standard Normal random variable, estimate the least number of jumps such that there is a \(5\%\) chance that the frog lands 25 cm or more from his starting point.


Solution:

TikZ diagram
\begin{align*} && M_X(\theta) &= \E \left [ e^{\theta X} \right] \\ &&&= \int_{-\infty}^{\infty} e^{\theta x} f(x) \d x \\ &&&= \int_{-\infty}^0 e^{\theta x}\frac12 e^{x} \d x+ \int_0^{\infty} e^{\theta x} \frac12 e^{-x} \d x \\ &&&= \frac12 \left [ \frac{1}{1+\theta}e^{(1+\theta)x} \right]_{-\infty}^0 +\frac12 \left [ \frac{1}{\theta-1}e^{(\theta-1)x} \right]_{0}^{\infty} \\ &&&= \frac12 \left ( \frac{1}{1+ \theta} + \frac{1}{1-\theta} \right) \\ &&&= \frac{1}{1-\theta^2} = (1-\theta^2)^{-1} \\ &&&= 1 + \theta^2 + \theta^4 + \cdots \\ \\ \Rightarrow && \E[X] &= 0 \\ && \E[X^2] &= 2 \\ \Rightarrow && \var[X] &= 2 \end{align*} \begin{align*} && M_{Y/\sqrt{2n}}(\theta) &= \E \left [ \exp \left ( \theta \frac{Y}{\sqrt{2n}} \right) \right] \\ &&&= \E \left [ \exp \left ( \frac{\theta }{\sqrt{2n}} \sum_{i=1}^n X_i \right) \right] \\ &&&= \E \left [ \prod_{i=1}^n \exp \left ( \frac{\theta }{\sqrt{2n}} X_i \right) \right] \\ &&&= \prod_{i=1}^n \E \left [\exp \left ( \frac{\theta }{\sqrt{2n}} X_i \right) \right] \tag{independence}\\ &&&= \prod_{i=1}^n M_{X_i} \left ( \frac{\theta }{\sqrt{2n}} \right)\\ &&&= \prod_{i=1}^n M_{X} \left ( \frac{\theta }{\sqrt{2n}} \right)\\ &&&= M_{X} \left ( \frac{\theta }{\sqrt{2n}} \right)^n\\ &&&= \left (1 - \frac{\theta^2}{2n} \right)^{-n} \to \exp(\tfrac12 \theta^2) \end{align*} Given that \(M_{Y/\sqrt{2n}} \to M_Z\) we assume that \(Y/\sqrt{2n} \to Z\) or \(Y/\sqrt{2n} \approx Z\). \begin{align*} && 5\% &\approx \mathbb{P}(|Z| > 2) \\ &&&\approx \mathbb{P} \left (|Y| > 2\sqrt{2n} \right) \end{align*} So we wish to choose \(n\) such that \(2\sqrt{2n} = 25\) or \(n = \frac{625}8 \approx 78\) so take \(n = 79\)
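The limit and the final estimate can be checked numerically. The sketch below is illustrative only: the function names `mgf_scaled_sum` and `least_jumps` are hypothetical, and it uses the unrounded two-sided normal quantile rather than the value \(2\) used above, so it returns a slightly larger number of jumps.

```python
import math
from statistics import NormalDist


def mgf_scaled_sum(theta: float, n: int) -> float:
    """MGF of Y/sqrt(2n) for n i.i.d. Laplace jumps: (1 - theta^2/(2n))^(-n)."""
    return (1.0 - theta**2 / (2.0 * n)) ** (-n)


def least_jumps(distance: float, tail_prob: float) -> int:
    """Smallest n with z*sqrt(2n) >= distance, where P(|Z| > z) = tail_prob."""
    z = NormalDist().inv_cdf(1.0 - tail_prob / 2.0)
    return math.ceil((distance / z) ** 2 / 2.0)


theta = 0.7  # arbitrary test value with |theta| < 1
limit = math.exp(0.5 * theta**2)
for n in (10, 100, 10_000):
    # The middle column should approach the last column as n grows.
    print(n, mgf_scaled_sum(theta, n), limit)

print(least_jumps(25.0, 0.05))
```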

2005 Paper 2 Q13
D: 1600.0 B: 1500.0

The number of printing errors on any page of a large book of \(N\) pages is modelled by a Poisson variate with parameter \(\lambda\) and is statistically independent of the number of printing errors on any other page. The number of pages in a random sample of \(n\) pages (where \(n\) is much smaller than \(N\) and \(n\ge2\)) which contain fewer than two errors is denoted by \(Y\). Show that \(\P(Y=k) = \binom n k p^kq^{n-k}\) where \(p=(1+\lambda)e^{-\lambda}\) and \(q=1-p\,\). Show also that, if \(\lambda\) is sufficiently small,

  1. \(q\approx \frac12 \lambda^2\,\);
  2. the largest value of \(n\) for which \(\P(Y=n)\ge 1-\lambda\) is approximately \(2/\lambda\,\);
  3. \(\P(Y>1 \;\vert\; Y>0) \approx 1-n(\lambda^2/2)^{n-1}\;.\)


Solution: First notice that the probability a page contains fewer than two errors is \(\mathbb{P}(X < 2)\) where \(X \sim Po(\lambda)\), i.e. \(\mathbb{P}(X<2) = e^{-\lambda} + \lambda e^{-\lambda} = (1+\lambda)e^{-\lambda}\). Since \(n\) is much smaller than \(N\), the sampled pages can be treated as independent trials, so the number of pages \(Y\) with fewer than two errors out of our sample of \(n\) is \(Bin(n, p)\) with \(p = (1+\lambda)e^{-\lambda}\), i.e. \(\mathbb{P}(Y = k) = \binom{n}{k} p^kq^{n-k}\).

  1. \(\,\) \begin{align*} && q &= 1- p = 1-(1+\lambda)e^{-\lambda} \\ &&&= 1 - (1+ \lambda)(1 - \lambda + \tfrac12 \lambda^2 + O(\lambda^3)) \\ &&&= 1 - 1+ \lambda - \lambda+\lambda^2 - \tfrac12 \lambda^2 + O(\lambda^3) \\ &&&= \tfrac12 \lambda^2 + O(\lambda^3) \end{align*}
  2. \(\,\) \begin{align*} && \mathbb{P}(Y = n) &= p^n \\ &&&= (1+\lambda)^ne^{-\lambda n} \\ &&&= (1 + n \lambda + \frac{n(n-1)}{2} \lambda^2 + \cdots)(1 - \lambda n + \frac{\lambda^2 n^2}{2} + \cdots) \\ &&&= 1 + 0 \lambda + \left ( \frac{n(n-1)}{2} + \frac{n^2}{2} - n^2 \right) \lambda^2 + O(\lambda^3) \\ &&&= 1 - \frac{n}{2} \lambda^2 + O(\lambda^3) \end{align*} So \(\mathbb{P}(Y = n) \geq 1- \lambda\) approximately when \(\frac{n}{2} \lambda^2 \leq \lambda\), i.e. when \(\frac{n}{2} \lambda \leq 1\) or \(n \leq \frac{2}{\lambda}\). The largest such \(n\) is therefore approximately \(2/\lambda\).
  3. \(\,\) \begin{align*} && \mathbb{P}(Y > 1 | Y > 0) &= \frac{1-(q^n + npq^{n-1})}{1-q^n} \\ &&&= 1 - \frac{npq^{n-1}}{1-q^n} \\ &&&= 1 -n \frac{(1+ \lambda)e^{-\lambda} (\tfrac12 \lambda^2 + O(\lambda^3))^{n-1}}{1-(\tfrac12 \lambda^2 + O(\lambda^3))^n} \\ &&&= 1 - n \left (\frac{\lambda^2}{2} \right)^{n-1} \frac{(1+ \lambda)(1-\lambda + \lambda^2/2 - \cdots)(1+O(\lambda))^{n-1}}{1-(\tfrac12 \lambda^2 + O(\lambda^3))^n} \\ &&&= 1 - n \left (\frac{\lambda^2}{2} \right)^{n-1} (1 + O(\lambda)) \\ &&&\approx 1 - n \left (\frac{\lambda^2}{2} \right)^{n-1} \end{align*}
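As a rough numerical sanity check of parts 1 and 2, the sketch below compares \(q\) with \(\tfrac12\lambda^2\) and the largest admissible \(n\) with \(2/\lambda\) for a few small values of \(\lambda\). The helper names `p_few_errors` and `largest_n` are hypothetical and chosen only for this illustration.

```python
import math


def p_few_errors(lam: float) -> float:
    """P(page has fewer than two errors) = (1 + lam) * exp(-lam)."""
    return (1.0 + lam) * math.exp(-lam)


def largest_n(lam: float) -> int:
    """Largest n with p^n >= 1 - lam, from n <= ln(1 - lam) / ln(p)."""
    log_p = math.log1p(lam) - lam  # ln p, written to limit rounding error
    return math.floor(math.log1p(-lam) / log_p)


for lam in (0.1, 0.01, 0.001):
    q = 1.0 - p_few_errors(lam)
    # Columns: lambda, q, lambda^2/2, largest n, 2/lambda
    print(lam, q, lam**2 / 2, largest_n(lam), 2 / lam)
```

The agreement improves as \(\lambda\) decreases, which is what "sufficiently small" in the question suggests.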

2004 Paper 1 Q13
D: 1500.0 B: 1458.1

  1. Three real numbers are drawn independently from the continuous rectangular distribution on \([ 0, 1 ]\,\). The random variable \(X\) is the maximum of the three numbers. Show that the probability that \(X \le 0.8\) is \(0.512\,\), and calculate the expectation of \(X\).
  2. \(N\) real numbers are drawn independently from a continuous rectangular distribution on \([ 0, a ]\,\). The random variable \(X\) is the maximum of the \(N\) numbers. A hypothesis test with a significance level of 5\% is carried out using the value, \(x\), of \(X \). The null hypothesis is that \(a=1\) and the alternative hypothesis is that \(a<1 \,\). The form of the test is such that \(H_0\) is rejected if \(x < c\,\), for some chosen number \(c\,\). Using the approximation \(2^{10} \approx 10^3\,\), determine the smallest integer value of \(N\) such that if \(x \le 0.8\) the null hypothesis will be rejected. With this value of \(N\), write down the probability that the null hypothesis is rejected if \(a = 0.8\,\), and find the probability that the null hypothesis is rejected if \(a = 0.9\,\).


Solution: \begin{align*} \P(X \leq 0.8) &= \P(X_1 \leq 0.8,X_2 \leq 0.8,X_3 \leq 0.8) \\ &= 0.8^3 \\ &= 0.512 \end{align*} \begin{align*} && \P(X < c) &= c^3 \\ \Rightarrow && f_X(x) &= 3x^2 \\ \Rightarrow && \E[X] &= \int_0^1 x \cdot (3x^2) \, dx \\ && &= \left [ \frac{3}{4}x^4 \right]_0^1 \\ &&&= \frac{3}{4} \end{align*} For the second part, \(X\) is distributed as the maximum of \(N\) numbers drawn uniformly from \([0,a]\), and the hypotheses are \begin{align*} H_0 : & a= 1 \\ H_1 : & a < 1 \end{align*} Under \(H_0\), a test at the \(5\%\) level requires \begin{align*} &&\P(X < c) &= c^N \\ &&&= \frac1{20} \\ \Rightarrow && N &= -\frac{\log(20)}{\log(c)} \end{align*} Taking \(c = 0.8\), we have \begin{align*} N &= \frac{\log(20)}{\log(5/4)} \\ &= \frac{\log(5)+\log(4)}{\log(5)-\log(4)} \\ &= \frac{ \frac{\log(5)}{\log(4)}+1}{\frac{\log(5)}{\log(4)} - 1} \end{align*} Using the given approximation, \begin{align*} && 2^{10} &\approx 10^{3} \\ && 10\log(2) &\approx 3 (\log(5) + \log(2)) \\ && 7\log(2) &\approx 3 \log(5) \\ && \frac{\log(5)}{2\log(2)} &\approx \frac{7}{6} \end{align*} so \begin{align*} N &= \frac{ \frac{\log(5)}{\log(4)}+1}{\frac{\log(5)}{\log(4)} - 1} \\ &\approx \frac{\frac{7}{6} + 1}{\frac{7}{6} -1} \\ &= 13 \end{align*} Since \(2^{10} > 10^3\), the true value of \(\frac{\log(5)}{\log(4)}\) is slightly less than \(\frac76\), so the exact value of the expression is slightly more than \(13\) and \(N=14\) is the value we seek. \(\P(X < 0.8 | a= 0.8) = 1\) and \(\P(X < 0.8 | a= 0.9, N=14) = \frac{8^{14}}{9^{14}}\).
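The value of \(N\) can also be confirmed by direct search, without the \(2^{10} \approx 10^3\) approximation. This is only a sketch; `smallest_sample_size` is a hypothetical helper name.

```python
def smallest_sample_size(c: float, level: float) -> int:
    """Smallest N with c^N <= level, i.e. P(X <= c | a = 1) <= level."""
    n = 1
    while c**n > level:
        n += 1
    return n


n = smallest_sample_size(0.8, 0.05)
# Columns: N, P(X <= 0.8 | a = 1), P(X <= 0.8 | a = 0.9)
print(n, 0.8**n, (0.8 / 0.9) ** n)
```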

2004 Paper 3 Q14
D: 1700.0 B: 1488.4

In this question, \(\Phi(z)\) is the cumulative distribution function of a standard normal random variable. A random variable is known to have a Normal distribution with mean \(\mu\) and standard deviation either \(\sigma_0\) or \(\sigma_1\), where \(\sigma_0 < \sigma_1\,\). The mean, \(\overline{X}\), of a random sample of \(n\) values of \(X\) is to be used to test the hypothesis \(\mathrm{H}_0: \sigma = \sigma_0\) against the alternative \(\mathrm{H}_1: \sigma = \sigma_1\,\). Explain carefully why it is appropriate to use a two sided test of the form: accept \(\mathrm{H}_0\) if \(\mu - c < \overline{X} < \mu+c\,\), otherwise accept \(\mathrm{H}_1\). Given that the probability of accepting \(\mathrm{H}_1\) when \(\mathrm{H}_0\) is true is \(\alpha\), determine \(c\) in terms of \(n\), \(\sigma_0\) and \(z_{\alpha}\), where \(z_\alpha \) is defined by \(\displaystyle\Phi(z_{\alpha}) = 1 - \tfrac{1}{2}\alpha\). The probability of accepting \(\mathrm{H}_0\) when \(\mathrm{H}_1\) is true is denoted by \(\beta\). Show that \(\beta\) is independent of \(n\). Given that \(\Phi(1.960)\approx 0.975\) and that \(\Phi(0.063) \approx 0.525\,\), determine, approximately, the minimum value of \(\displaystyle \frac{\sigma_1}{\sigma_0}\) if \(\alpha\) and \(\beta\) are both to be less than \(0.05\,\).


Solution: If \(\sigma\) is smaller we should expect the sample mean to be closer to the true mean \(\mu\). Therefore we should use a test which accepts \(\mathrm{H}_0\) if the sample mean is close to \(\mu\), and since the distribution of \(\overline{X}\) is symmetric about \(\mu\) under both hypotheses, the acceptance region should be symmetric about \(\mu\) as well. Suppose \(\mathrm{H}_0\) is true, i.e. \(\sigma = \sigma_0\), then note that \(\overline{X} \sim N(\mu, \frac{\sigma_0^2}{n})\) \begin{align*} && 1-\alpha &= \mathbb{P}(\mu - c < \overline{X} < \mu + c) \\ &&&= \mathbb{P}(\mu - c < \frac{\sigma_0}{\sqrt{n}} Z + \mu < \mu + c) \\ &&&= \mathbb{P}(- \frac{c\sqrt{n}}{\sigma_0} < Z<\frac{\sqrt{n}c}{\sigma_0}) \\ &&&= \mathbb{P}(Z<\frac{\sqrt{n}c}{\sigma_0}) -\mathbb{P}( Z<-\frac{\sqrt{n}c}{\sigma_0}) \\ &&&= \mathbb{P}(Z<\frac{\sqrt{n}c}{\sigma_0}) -(1-\mathbb{P}( Z<\frac{\sqrt{n}c}{\sigma_0})) \\ &&&= 2\mathbb{P}(Z<\frac{\sqrt{n}c}{\sigma_0})-1 \\ \Rightarrow && \Phi(\frac{\sqrt{n}c}{\sigma_0})&=1 - \tfrac12 \alpha \\ \Rightarrow && \frac{\sqrt{n}c}{\sigma_0} &= z_{\alpha} \\ && c &= \frac{\sigma_0 z_{\alpha}}{\sqrt{n}} \end{align*} Under \(\mathrm{H}_1\), \(\sigma = \sigma_1\) so \begin{align*} && \beta &= \mathbb{P}(\mu - c < \overline{X} < \mu + c) \\ &&&= \mathbb{P}(-\frac{c\sqrt{n}}{\sigma_1} < Z < \frac{\sqrt{n}c}{\sigma_1}) \\ &&&= \mathbb{P}(-\frac{\sigma_0}{\sigma_1} z_{\alpha}< Z < \frac{\sigma_0}{\sigma_1} z_{\alpha}) \\ &&&= 2\Phi(\frac{\sigma_0}{\sigma_1} z_{\alpha})-1 \end{align*} which does not depend on \(n\). Suppose both \(\alpha<0.05\) and \(\beta<0.05\), then \(z_{\alpha} > 1.96\) and \(\Phi(\frac{\sigma_0}{\sigma_1}z_{\alpha})<0.525 \Rightarrow \frac{\sigma_0}{\sigma_1}z_{\alpha} < 0.063 \Rightarrow \frac{\sigma_1}{\sigma_0} > \frac{z_{\alpha}}{0.063} > \frac{1.96}{0.063} \approx 31.1 \), so the ratio of standard deviations \(\sigma_1/\sigma_0\) needs to be larger than about \(31.1\).
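The threshold can be reproduced with exact normal quantiles instead of the rounded table values \(1.960\) and \(0.063\). The sketch below uses Python's standard `statistics.NormalDist`; the function names `beta` and `min_ratio` are hypothetical, and `ratio` stands for \(\sigma_1/\sigma_0\).

```python
from statistics import NormalDist

std_normal = NormalDist()


def beta(ratio: float, alpha: float) -> float:
    """P(accept H0 | H1) = 2*Phi(z_alpha / ratio) - 1, ratio = sigma1/sigma0."""
    z_alpha = std_normal.inv_cdf(1.0 - alpha / 2.0)
    return 2.0 * std_normal.cdf(z_alpha / ratio) - 1.0


def min_ratio(alpha: float, beta_target: float) -> float:
    """Ratio sigma1/sigma0 at which beta equals beta_target for the given alpha."""
    z_alpha = std_normal.inv_cdf(1.0 - alpha / 2.0)
    z_beta = std_normal.inv_cdf((1.0 + beta_target) / 2.0)
    return z_alpha / z_beta


r = min_ratio(0.05, 0.05)
print(r, beta(r, 0.05))
```

Because the quantiles are not rounded here, the printed ratio differs slightly from \(1.96/0.063\).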

2003 Paper 2 Q14
D: 1600.0 B: 1484.8

The probability of throwing a 6 with a biased die is \(p\,\). It is known that \(p\) is equal to one or other of the numbers \(A\) and \(B\) where \(0 < A < B < 1 \,\). Accordingly the following statistical test of the hypothesis \(H_0: \,p=B\) against the alternative hypothesis \(H_1: \,p=A\) is performed. The die is thrown repeatedly until a 6 is obtained. Then if \(X\) is the total number of throws, \(H_0\) is accepted if \(X \le M\,\), where \(M\) is a given positive integer; otherwise \(H_1\) is accepted. Let \({\alpha}\) be the probability that \(H_1\) is accepted if \(H_0\) is true, and let \({\beta}\) be the probability that \(H_0\) is accepted if \(H_1\) is true. Show that \({\beta} = 1- {\alpha}^K,\) where \(K\) is independent of \(M\) and is to be determined in terms of \(A\) and \(B\,\). Sketch the graph of \({\beta}\) against \({\alpha}\,\).


Solution: \(X \sim Geo(p)\). \(\alpha = \mathbb{P}(X > M | p = B) = (1-B)^{M}\) \(\beta = \mathbb{P}(X \leq M | p = A) = 1 - \mathbb{P}(X > M | p = A) = 1 - (1-A)^{M}\) \begin{align*} \ln \alpha &= M \ln(1-B) \\ \ln (1-\beta) &= M \ln(1-A) \\ \frac{\ln \alpha}{\ln (1-\beta)} &= \frac{\ln(1-B)}{\ln(1-A)} \\ \ln(1-\beta) &= \ln \alpha \frac{\ln (1-A)}{\ln(1-B)} \\ \beta &= 1- \alpha^{ \frac{\ln (1-A)}{\ln(1-B)} } \end{align*} and \(K = \frac{\ln (1-A)}{\ln(1-B)} \) Since \(0 < A < B < 1\) we must have that \(0 < 1 - B < 1-A < 1\) and \(\ln(1-B) < \ln(1-A) < 0\) so \(0 < K < 1\)

TikZ diagram
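The relation \(\beta = 1 - \alpha^K\) can be checked numerically for any particular die. In the sketch below the values \(A = 0.1\) and \(B = 0.3\) are arbitrary illustrative choices, and the function names are hypothetical.

```python
import math


def error_probs(a: float, b: float, m: int) -> tuple[float, float]:
    """Return (alpha, beta) for the test 'accept H0: p = b if X <= m'."""
    alpha = (1.0 - b) ** m        # P(X > m | p = b)
    beta = 1.0 - (1.0 - a) ** m   # P(X <= m | p = a)
    return alpha, beta


def exponent_k(a: float, b: float) -> float:
    """K = ln(1 - a) / ln(1 - b), which does not involve m."""
    return math.log(1.0 - a) / math.log(1.0 - b)


a, b = 0.1, 0.3  # illustrative values with 0 < a < b < 1
k = exponent_k(a, b)
for m in range(1, 6):
    alpha, beta = error_probs(a, b, m)
    # The last two columns should agree for every m.
    print(m, alpha, beta, 1.0 - alpha**k)
```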

1995 Paper 2 Q14
D: 1600.0 B: 1500.0

Suppose \(X\) is a random variable with probability density \[ \mathrm{f}(x)=Ax^{2}\exp(-x^{2}/2) \] for \(-\infty < x < \infty.\) Find \(A\). You belong to a group of scientists who believe that the outcome of a certain experiment is a random variable with the probability density just given, while other scientists believe that the probability density is the same except with different mean (i.e. the probability density is \(\mathrm{f}(x-\mu)\) with \(\mu\neq0\)). In each of the following two cases decide whether the result given would shake your faith in your hypothesis, and justify your answer.

  1. A single trial produces the result 87.3.
  2. 1000 independent trials produce results having a mean value \(0.23.\)
[Great weight will be placed on clear statements of your reasons and none on the mere repetition of standard tests, however sophisticated, if unsupported by argument. There are several possible approaches to this question. For some of them it is useful to know that if \(Z\) is normal with mean 0 and variance 1 then \(\mathrm{E}(Z^{4})=3.\)]


Solution: Let \(Z \sim N(0,1)\), with a pdf of \(f(x) = \frac{1}{\sqrt{2\pi}} \exp(-x^2/2)\) \begin{align*} && 1 &= \int_{-\infty}^\infty Ax^2 \exp(-x^2/2) \d x \\ &&&= A\sqrt{2\pi} \int_{-\infty}^\infty x^2 \frac{1}{\sqrt{2\pi}} \exp(-x^2/2) \d x \\ &&&= A\sqrt{2\pi} \E[Z^2] = A\sqrt{2\pi} \\ \Rightarrow && A &= \frac{1}{\sqrt{2\pi}} \end{align*}

  1. The probability of seeing a result as extreme as \(87.3\) is \begin{align*} \mathbb{P}(X > 87.3) &= \frac{1}{\sqrt{2\pi}}\int_{87.3}^{\infty} x^2 \exp(-x^2/2) \d x \\ &= \left [ -\frac{1}{\sqrt{2\pi}}x \exp(-x^2/2)\right]_{87.3}^{\infty}+\int_{87.3}^{\infty}\frac{1}{\sqrt{2\pi}} \exp(-x^2/2) \d x \\ &\approx 0 +(1- \Phi(87.3)) \\ &\approx 0 \end{align*} It is very unlikely this data point has come from our distribution rather than one with a higher mean, therefore our faith is very shaken.
  2. If there are 1000 trials of this, we would expect the sample mean \(S\) to be approximately normally distributed by the CLT. Under our hypothesis each observation has mean \(0\) (the density is even) and variance \(\E[X^2] = \int_{-\infty}^\infty x^4 \frac{1}{\sqrt{2\pi}} \exp(-x^2/2) \d x = \E[Z^4] = 3\), therefore the sample mean is approximately \(N(0, 3/1000)\). Therefore the probability of being at least \(0.23\) above the mean is \begin{align*} && \mathbb{P}(S > 0.23) &= \mathbb{P}\left (Z > \frac{0.23}{\sqrt{3/1000}} \right) \\ &&&= \mathbb{P}\left (Z > \frac{0.23}{\sqrt{30}/100} \right) \\ &&&\approx \mathbb{P}\left (Z > \frac{0.23}{0.055} \right) \\ &&& \approx 0 \end{align*} since the observed mean is more than \(4\) standard deviations from \(0\). Again our faith should be shaken.
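For part 2, the size of the standardised deviation can be computed directly. This is a small illustrative calculation under the normal approximation used above; the variable names are hypothetical.

```python
import math
from statistics import NormalDist

n_trials = 1000
var_x = 3.0           # E[X^2] = E[Z^4] = 3 under our hypothesis
observed_mean = 0.23

std_error = math.sqrt(var_x / n_trials)   # standard deviation of the sample mean
z_score = observed_mean / std_error       # standardised distance from 0
tail = 1.0 - NormalDist().cdf(z_score)    # one-sided normal tail probability

print(std_error, z_score, tail)
```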

1992 Paper 1 Q14
D: 1500.0 B: 1484.8

The average number of pedestrians killed annually in road accidents in Poldavia during the period 1974-1989 was 1080 and the average number killed annually in commercial flight accidents during the same period was 180. Discuss the following newspaper headlines which appeared in 1991. (The percentage figures in square brackets give a rough indication of the weight of marks attached to each discussion.)

  1. [\(10\%\)] Six Times Safer To Fly Than To Walk. 1974-1989 Figures Prove It.
  2. [\(10\%\)] Our Skies Are Safer. Only 125 People Killed In Air Accidents In 1990.
  3. [\(30\%\)] Road Carnage Increasing. 7 People Killed On Tuesday.
  4. [\(50\%\)] Alarming Rise In Pedestrian Casualties. 1350 Pedestrians Killed In Road Accidents During 1990.


Solution:

  1. We cannot say this, since we do not know how many people were flying or walking each year.
  2. This is difficult to say without knowing the variance. We might expect this to have quite a skewed distribution (one big air crash causes lots of deaths infrequently) so it's impossible to know, although it is substantially lower.
  3. If we have 1080 deaths annually, we should expect ~3 deaths per day. While a day with \(7\) deaths might seem unlikely, over the course of a year it is very likely to occur. (Perhaps the weather was bad.) It is also probably a case of selective reporting: we are seeing this data point because it's notable and being reported rather than because it is significant.
  4. This is certainly the most alarming: a ~25% increase is very unlikely without something else going on. (We'd expect the annual count to be ~Po(1080), approximately N(1080, 1080), and 1350 is many standard deviations above the mean.) However we also know that other factors could drive this (more walking, more people, change in reporting, etc.).
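The rough claims in headlines 3 and 4 can be quantified under a simple model. The sketch below assumes, purely for illustration, that daily deaths are independent Poisson counts with constant rate \(1080/365\); that modelling assumption and the helper name `poisson_tail` are not part of the original discussion.

```python
import math


def poisson_tail(lam: float, k: int) -> float:
    """P(X >= k) for X ~ Poisson(lam)."""
    below = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return 1.0 - below


daily_rate = 1080 / 365                        # about 3 deaths per day
p_day = poisson_tail(daily_rate, 7)            # a given day has 7 or more deaths
p_some_day = 1.0 - (1.0 - p_day) ** 365        # at least one such day in a year
z_1990 = (1350 - 1080) / math.sqrt(1080)       # standardised 1990 total

print(p_day, p_some_day, z_1990)
```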

1990 Paper 1 Q16
D: 1500.0 B: 1486.1

A bus is supposed to stop outside my house every hour on the hour. From long observation I know that a bus will always arrive some time between 10 minutes before and ten minutes after the hour. The probability it arrives at a given instant increases linearly (from zero at 10 minutes before the hour) up to a maximum value at the hour, and then decreases linearly at the same rate after the hour. Obtain the probability density function of \(T\), the time in minutes after the scheduled time at which a bus arrives. If I get up when my alarm clock goes off, I arrive at the bus stop at 7.55am. However, with probability 0.5, I doze for 3 minutes before it rings again. In that case with probability 0.8 I get up then and reach the bus stop at 7.58am, or, with probability 0.2, I sleep a little longer, not reaching the stop until 8.02am. What is the probability that I catch a bus by 8.10am? I buy a louder alarm clock which ensures that I reach the stop at exactly the same time each morning. This clock keeps perfect time, but may be set to an incorrect time. If it is correct, the alarm goes off so that I should reach the stop at 7.55am. After 100 mornings I find that I have had to wait for a bus until after 9am (according to the new clock) on 5 occasions. Is this evidence that the new clock is incorrectly set? [The times of arrival of different buses are independent of each other.]


Solution: The probability density function will look like a triangle with base \(20\) minutes and therefore height \(\frac{1}{10}\) per minute, i.e.: \begin{align*} f_T(t) &= \begin{cases} \frac{1}{100}(t+10) & \text{if } -10 \leq t \leq 0 \\ \frac{1}{100}(10-t) & \text{if } 0 \leq t \leq 10 \\ 0 & \text{otherwise} \end{cases} \end{align*} \begin{align*} \mathbb{P}(\text{catch bus}) &=0.5 \mathbb{P}(\text{bus arrives after 7:55})+0.4 \mathbb{P}(\text{bus arrives after 7:58}) + 0.1 \mathbb{P}(\text{bus arrives after 8:02}) \\ &= \frac12 \cdot \left (1 - \frac18 \right) + \frac{2}{5} \cdot \left ( 1 - \frac{4^2}{5^2} \cdot \frac{1}{2} \right) + \frac{1}{10} \cdot \frac{4^2}{5^2} \cdot \frac12 \\ &= \frac{1\,483}{2\,000} \\ &\approx 74\% \end{align*} If the new clock is set correctly, I reach the stop at 7.55am and have to wait until after 9am only if the 8am bus has already gone and the 9am bus arrives after the hour: \begin{align*} \mathbb{P}(\text{bus caught by 9:00}) &= \mathbb{P}(\text{8am bus arrives after 7:55}) + \mathbb{P}(\text{8am bus arrives before 7:55}) \, \mathbb{P}(\text{9am bus arrives by 9:00}) \\ &= \frac78 + \frac18 \cdot \frac12 \\ &= \frac{15}{16} \end{align*} Over \(100\) mornings we should therefore expect about \(6.25\) waits until after 9am, so \(5\) such waits seems about right. (Using a binomial calculation, the probability of \(5\) or fewer such waits is ~\(40\%\), which isn't suspicious.)
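The arithmetic above can be verified exactly with rational numbers, and the binomial tail quoted at the end computed directly. This is a sketch only; `p_bus_after` and `binom_cdf` are hypothetical helper names, and the model is the triangular density derived above.

```python
from fractions import Fraction
from math import comb


def p_bus_after(t) -> Fraction:
    """P(T > t) for the triangular density on [-10, 10], t in minutes."""
    t = Fraction(t)
    if t <= -10:
        return Fraction(1)
    if t >= 10:
        return Fraction(0)
    if t <= 0:
        return 1 - (t + 10) ** 2 / 200
    return (10 - t) ** 2 / 200


def binom_cdf(k: int, n: int, p: Fraction) -> Fraction:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Old alarm clock: arrive at 7:55, 7:58 or 8:02 with probabilities 0.5, 0.4, 0.1.
p_catch = (Fraction(1, 2) * p_bus_after(-5)
           + Fraction(2, 5) * p_bus_after(-2)
           + Fraction(1, 10) * p_bus_after(2))

# New clock set correctly: a wait past 9am needs the 8am bus gone and a late 9am bus.
p_long_wait = (1 - p_bus_after(-5)) * p_bus_after(0)
p_at_most_5 = float(binom_cdf(5, 100, p_long_wait))

print(p_catch, p_long_wait, 100 * p_long_wait, p_at_most_5)
```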

1989 Paper 1 Q14
D: 1516.0 B: 1453.5

The prevailing winds blow in a constant southerly direction from an enchanted castle. Each year, according to an ancient tradition, a princess releases 96 magic seeds from the castle, which are carried south by the wind before falling to rest. South of the castle lies one league of grassy parkland, then one league of lake, then one league of farmland, and finally the sea. If a seed falls on land it will immediately grow into a fever tree. (Fever trees do not grow in water). Seeds are blown independently of each other. The random variable \(L\) is the distance in leagues south of the castle at which a seed falls to rest (either on land or water). It is known that the probability density function \(\mathrm{f}\) of \(L\) is given by \[ \mathrm{f}(x)=\begin{cases} \frac{1}{2}-\frac{1}{8}x & \mbox{ for }0\leqslant x\leqslant4,\\ 0 & \mbox{ otherwise.} \end{cases} \] What is the mean number of fever trees which begin to grow each year?

  1. The random variable \(Y\) is defined as the distance in leagues south of the castle at which a new fever tree grows from a seed carried by the wind. Sketch the probability density function of \(Y\), and find the mean of \(Y\).
  2. One year messengers bring the king the news that 23 new fever trees have grown in the farmland. The wind never varies, and so the king suspects that the ancient tradition has not been followed properly. Is he justified in his suspicions?


Solution: \begin{align*} \mathbb{P}(\text{fever tree grows}) &= \mathbb{P}(0 \leq L \leq 1) + \mathbb{P}(2 \leq L \leq 3) \\ &= \int_0^1 \frac12 -\frac18 x \d x + \int_2^3 \frac12 - \frac18 x \d x \\ &= \left [\frac12 x - \frac1{16}x^2 \right]_0^1+ \left [\frac12 x - \frac1{16}x^2 \right]_2^3 \\ &= \frac12 - \frac1{16}+\frac32-\frac9{16} - 1 + \frac{4}{16} \\ &= \frac58 \end{align*} The expected number of fever trees is just \(96 \cdot \frac58 = 60\).
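The integrals above can be checked exactly with rational arithmetic. In this sketch `cdf_L` is a hypothetical helper that evaluates the antiderivative \(\tfrac12 x - \tfrac1{16}x^2\) used in the working.

```python
from fractions import Fraction


def cdf_L(x) -> Fraction:
    """P(L <= x) for 0 <= x <= 4: the integral of 1/2 - t/8 from 0 to x."""
    x = Fraction(x)
    return x / 2 - x**2 / 16


p_park = cdf_L(1) - cdf_L(0)   # seed lands in the parkland
p_farm = cdf_L(3) - cdf_L(2)   # seed lands in the farmland
p_tree = p_park + p_farm       # seed lands on land and grows
mean_trees = 96 * p_tree       # expected number of new fever trees

print(p_park, p_farm, p_tree, mean_trees)
```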

  1. \(f_Y(t)\) must be proportional to the density of \(L\), but restricted to the points on land and rescaled by \(\frac58\) so that it integrates to \(1\), therefore it should be: \[ f_Y(t) = \begin{cases} \frac45 - \frac15t & \text{if } t \in [0,1]\cup[2,3] \\ 0 & \text{otherwise} \end{cases} \]
    TikZ diagram
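    \begin{align*} \mathbb{E}(Y) &= \int_0^1 t \left( \frac45 - \frac15 t \right) \d t + \int_2^3 t \left( \frac45 - \frac15 t \right) \d t \\ &= \left[ \frac25 t^2 - \frac1{15} t^3 \right]_0^1 + \left[ \frac25 t^2 - \frac1{15} t^3 \right]_2^3 \\ &= \frac13 + \frac{11}{15} \\ &= \frac{16}{15} \end{align*}
  2. Given the seeds are blown independently and the wind hasn't changed, each seed lands in the farmland with probability \(\mathbb{P}(2 \leq L \leq 3) = \frac3{16}\), so it is reasonable to model the number of new fever trees in the farmland as \(B(96, \frac{3}{16})\), which has mean \(18\) and variance \(\frac{117}{8} \approx 14.6\). Approximating this by \(N(18, 14.6)\) with a continuity correction, an observation of \(23\) lies \(\frac{22.5-18}{\sqrt{14.6}} \approx 1.2\) standard deviations above the mean, and a value at least this large occurs with probability roughly \(12\%\). This is not unusual, so on this evidence alone the king is not justified in his suspicions.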

1989 Paper 3 Q16
D: 1700.0 B: 1484.0

It is believed that the population of Ruritania can be described as follows:

  1. \(25\%\) are fair-haired and the rest are dark-haired;
  2. \(20\%\) are green-eyed and the rest hazel-eyed;
  3. the population can also be divided into narrow-headed and broad-headed;
  4. no narrow-headed person has green eyes and fair hair;
  5. those who are green-eyed are as likely to be narrow-headed as broad-headed;
  6. those who are green-eyed and broad-headed are as likely to be fair-haired as dark-haired;
  7. half of the population is broad-headed and dark-haired;
  8. a hazel-eyed person is as likely to be fair-haired and broad-headed as dark-haired and narrow-headed.
Find the proportion believed to be narrow-headed. I am acquainted with only six Ruritanians, all of whom are broad-headed. Comment on this observation as evidence for or against the given model. A random sample of 200 Ruritanians is taken and is found to contain 50 narrow-heads. On the basis of the given model, calculate (to a reasonable approximation) the probability of getting 50 or fewer narrow-heads. Comment on the result.


Solution:

TikZ diagram
Conditions tell us: \begin{align*} && a+b+d+e &= 0.25 \\ && b+c+e+f &= 0.2 \\ && e &= 0 \\ && b+c &= e + f \\ && b &= c \\ && c+h &= 0.5 \\ && a &= g \\ \end{align*}
TikZ diagram
So \(4b = 0.2 \Rightarrow b = 0.05\)
TikZ diagram
And \begin{align*} && 0.25 &= a + d + 0.05 \\ && 1 &= 2a + d + 0.65 \\ \Rightarrow && a &= 0.15 \\ && d &= 0.05 \end{align*}
TikZ diagram
So the proportion who are narrow-headed is \(30\%\). It's obviously relatively unlikely for your six Ruritanian friends to all be broad-headed if they are a random sample (the probability is \(0.7^6 \approx 12\%\)), but friendship groups are likely to be biased so it's not too surprising. Assuming there is a sufficiently large number of Ruritanians, we might model the number of narrow-headed Ruritanians from a sample of \(200\) as \(X \sim B(200, 0.3)\). Computing \(\mathbb{P}(X \leq 50)\) by hand is tricky, so let's use a normal approximation: \(X \approx N(60, 42)\) and \begin{align*} \mathbb{P}(X \leq 50) &\approx \mathbb{P} \left (Z \leq \frac{50 - 60+0.5}{\sqrt{42}} \right) \\ &\approx \mathbb{P} \left (Z \leq -\frac{9.5}{6.5} \right) \\ &\approx \mathbb{P} \left (Z \leq -\frac{3}{2} \right) \\ &\approx 7\% \end{align*} (more precisely, this approximation gives \(7.1\%\) and the exact binomial value is \(7.0\%\)). This also seems somewhat surprising.
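The probabilities quoted for the two samples can be computed directly. The following is a minimal sketch under the model \(B(200, 0.3)\) used above; the variable names are hypothetical.

```python
import math
from statistics import NormalDist

n, p = 200, 0.3
mean = n * p                  # 60
var = n * p * (1 - p)         # 42

# Normal approximation with continuity correction: P(X <= 50) ~ P(N <= 50.5).
normal_approx = NormalDist(mean, math.sqrt(var)).cdf(50.5)

# Exact binomial probability P(X <= 50).
exact = sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(51))

# Probability that six randomly chosen Ruritanians are all broad-headed.
p_six_broad = (1 - p) ** 6

print(normal_approx, exact, p_six_broad)
```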

1988 Paper 1 Q15
D: 1500.0 B: 1484.0

In Fridge football, each team scores two points for a goal and one point for a foul committed by the opposing team. In each game, for each team, the probability that the team scores \(n\) goals is \(\left(3-\left|2-n\right|\right)/9\) for \(0\leqslant n\leqslant4\) and zero otherwise, while the number of fouls committed against it will with equal probability be one of the numbers from \(0\) to \(9\) inclusive. The numbers of goals and fouls of each team are mutually independent. What is the probability that in some game a particular team gains more than half its points from fouls? In response to criticisms that the game is boring and violent, the ruling body increases the number of penalty points awarded for a foul, in the hope that this will cause large numbers of fouls to be less probable. During the season following the rule change, 150 games are played and on 12 occasions (out of 300) a team committed 9 fouls. Is this good evidence of a change in the probability distribution of the number of fouls? Justify your answer.


Solution: A team gains more than half its points from fouls when the number of fouls committed against it exceeds twice its number of goals. \[ \begin{array}{c|c|c|c} k & \P(k \text{ goals}) & \P(\geq 2k+1 \text{ fouls}) & \P(k \text{ goals and } \geq 2k+1 \text{ fouls}) \\ \hline 0 & \frac{3-|2|}{9} = \frac19 & \frac{9}{10} & \frac{9}{90}\\ 1 & \frac{3-|2-1|}{9} = \frac29 & \frac{7}{10} & \frac{14}{90} \\ 2 & \frac{3-|2-2|}{9} = \frac39 & \frac{5}{10} & \frac{15}{90} \\ 3 & \frac{3-|2-3|}{9} = \frac29 & \frac{3}{10} & \frac{6}{90} \\ 4 & \frac{3-|2-4|}{9} = \frac19 & \frac{1}{10} & \frac{1}{90} \\ \hline &&& \frac{9+14+15+6+1}{90} = \frac12 \end{array} \] The probability a team scores more than half its points from fouls is \(\frac12\). Let \(X\) be the number of occasions (out of \(300\)) on which a team committed \(9\) fouls, so \(X \sim B(300, p)\). Consider two hypotheses: \(H_0: p = \frac1{10}\) and \(H_1: p < \frac1{10}\). Under \(H_0\), we are interested in \(\P(X \leq 12)\). Since \(300 \cdot \frac{1}{10} = 30 > 5\) it is appropriate to use a normal approximation, \(N(30, 27)\). Therefore, \begin{align*} && \P(X \leq 12) &\approx \P(3\sqrt{3}Z + 30 \leq 12.5) \\ &&&= \P( Z \leq \frac{12.5-30}{3\sqrt{3}}) \\ &&&= \P(Z \leq \frac{-17.5}{3\sqrt{3}}) \\ &&&< \P(Z \leq -3) \end{align*} which is very small (about \(0.1\%\)). Therefore there is good evidence to believe there has been a change in the distribution of the number of fouls.
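Both parts can be checked by brute force. The sketch below enumerates every (goals, fouls) pair for the first part and, for the second, computes an exact binomial lower tail using the observed count of 12 stated in the question; `p_goals` and `binom_cdf` are hypothetical helper names.

```python
from fractions import Fraction
from math import comb


def p_goals(n: int) -> Fraction:
    """P(team scores n goals) = (3 - |2 - n|) / 9 for 0 <= n <= 4."""
    return Fraction(3 - abs(2 - n), 9)


def binom_cdf(k: int, n: int, p: Fraction) -> Fraction:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Fouls are worth 1 point and goals 2 points, so fouls give more than
# half the points exactly when fouls > 2 * goals.
p_foul_majority = sum(
    p_goals(g) * Fraction(1, 10)
    for g in range(5)
    for f in range(10)
    if f > 2 * g
)

# Under H0 each of the 300 team-games shows 9 fouls with probability 1/10.
p_tail = float(binom_cdf(12, 300, Fraction(1, 10)))

print(p_foul_majority, p_tail)
```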