Problems

17 problems found

2025 Paper 3 Q11
D: 1500.0 B: 1500.0

  1. Let \(\lambda > 0\). The independent random variables \(X_1, X_2, \ldots, X_n\) all have probability density function $$f(t) = \begin{cases} \lambda e^{-\lambda t} & t \geq 0 \\ 0 & t < 0 \end{cases}$$ and cumulative distribution function \(F(x)\). The value of random variable \(Y\) is the largest of the values \(X_1, X_2, \ldots, X_n\). Show that the cumulative distribution function of \(Y\) is given, for \(y \geq 0\), by $$G(y) = (1 - e^{-\lambda y})^n$$
  2. The values \(L(\alpha)\) and \(U(\alpha)\), where \(0 < \alpha \leq \frac{1}{2}\), are such that $$P(Y < L(\alpha)) = \alpha \text{ and } P(Y > U(\alpha)) = \alpha$$ Show that $$L(\alpha) = -\frac{1}{\lambda}\ln(1 - \alpha^{1/n})$$ and write down a similar expression for \(U(\alpha)\).
  3. Use the approximation \(e^t \approx 1 + t\), for \(|t|\) small, to show that, for sufficiently large \(n\), $$\lambda L(\alpha) \approx \ln(n) - \ln\left(\ln\left(\frac{1}{\alpha}\right)\right)$$
  4. Hence show that the median of \(Y\) tends to infinity as \(n\) increases, but that the width of the interval \(U(\alpha) - L(\alpha)\) tends to a value which is independent of \(n\).
  5. You are given that, for \(|t|\) small, \(\ln(1 + t) \approx t\) and that \(e^3 \approx 20\). Show that, for sufficiently large \(n\), there is an interval of width approximately \(4\lambda^{-1}\) in which \(Y\) lies with probability \(0.9\).


Solution:

  1. Note that \(\displaystyle F(y) = \mathbb{P}(X_i < y) = \int_0^y \lambda e^{-\lambda t} \d t = 1-e^{-\lambda y}\). Notice also that \begin{align*} G(y) &= \mathbb{P}(Y < y) \\ &= \mathbb{P}(\max_i(X_i) < y) \\ &= \mathbb{P}(X_i < y \text{ for all }i) \\ &= \prod_{i=1}^n \mathbb{P}(X_i < y) \\ &= \prod_{i=1}^n (1-e^{-\lambda y})\\ &= (1-e^{-\lambda y})^n \end{align*} as required.
  2. \begin{align*} && \mathbb{P}(Y < L(\alpha)) &= \alpha \\ \Rightarrow && (1-e^{-\lambda L(\alpha)})^n &= \alpha \\ \Rightarrow && 1-e^{-\lambda L(\alpha)} &= \alpha^{\tfrac1n} \\ \Rightarrow && L(\alpha) &= -\frac{1}{\lambda}\ln \left (1-\alpha^{\tfrac1n} \right) \end{align*} Notice also: \begin{align*} && \mathbb{P}(Y > U(\alpha)) &= \alpha \\ \Rightarrow && 1 - (1-e^{-\lambda U(\alpha)})^n &= \alpha \\ \Rightarrow && U(\alpha) &= -\frac{1}{\lambda}\ln \left ( 1-(1-\alpha)^{\tfrac1n} \right) \end{align*}
  3. \begin{align*} \lambda L(\alpha) &= -\ln \left (1-\alpha^{\tfrac1n} \right) \\ &= -\ln \left (1-e^{\tfrac1n \ln \alpha} \right) \\ &\approx - \ln \left ( 1 - 1 - \frac1n \ln \alpha\right) \tag{\(e^t \approx 1 + t\)} \\ &= -\ln \left ( \frac{1}{n} \ln \frac{1}\alpha \right) \\ &= - \ln \frac{1}{n} - \ln \left ( \ln \frac{1}{\alpha} \right )\\ &= \ln n - \ln \left ( \ln \left ( \frac{1}{\alpha} \right ) \right) \end{align*} since if \(n\) is large, \(\frac{\ln \alpha}{n}\) is small.
  4. The median is the value where \(\mathbb{P}(Y < M) = \frac12\), or in other words \(L(\frac12)\), but this is \(\approx \frac{\ln n - \ln (\ln 2)}{\lambda} \to \infty\). \begin{align*} && \lambda U(\alpha) &\approx \ln n - \ln \left ( \ln \left ( \frac{1}{1-\alpha} \right ) \right) \\ \Rightarrow && \lambda(U(\alpha) - L(\alpha)) &\approx -\ln \left ( \ln \left ( \frac{1}{1-\alpha} \right ) \right)+ \ln \left ( \ln \left ( \frac{1}{\alpha} \right ) \right) \\ \Rightarrow && U(\alpha) - L(\alpha) &\to \frac{1}{\lambda} \left ( \ln \left ( \ln \left ( \frac{1}{\alpha} \right ) \right)-\ln \left ( \ln \left ( \frac{1}{1-\alpha} \right ) \right ) \right) \end{align*} which doesn't depend on \(n\).
  5. Suppose \(\alpha = \frac{1}{20}\) then \begin{align*} U(\alpha) - L(\alpha) &\approx \frac{1}{\lambda} \left (\ln \ln 20 - \ln \ln \frac{20}{19} \right) \\ &= \lambda^{-1} \left (\ln \ln 20 - \ln \ln (1 + \tfrac{1}{19}) \right) \\ &\approx \lambda^{-1} \left (\ln 3 - \ln \frac{1}{19} \right) \tag{\(\ln(1+t) \approx t\), \(\ln 20 \approx 3\)} \\ &= \lambda^{-1} \left (\ln 3 + \ln 19 \right) \\ &\approx \lambda^{-1} (1 + 3) \tag{\(\ln 3 \approx 1\), \(\ln 19 \approx \ln 20 \approx 3\)} \\ &= 4\lambda^{-1} \end{align*} so \(Y\) lies in the interval \((L(\alpha), U(\alpha))\), of width approximately \(4\lambda^{-1}\), with probability \(1 - 2\alpha = 0.9\). [Note that \(\ln \ln 20 - \ln \ln \frac{20}{19} = 4.0673\ldots\)]
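[As a further check, the quantile formulas can be verified by simulation. A rough sketch (Python; \(\lambda = 1\), \(n = 10^6\) and the sample size are arbitrary choices), sampling \(Y\) directly by inverting \(G\):]

```python
import math
import random

random.seed(0)
lam, n, trials = 1.0, 10**6, 100_000
alpha = 1 / 20

# Quantiles from the formulas derived above, computed stably via expm1:
# L = -ln(1 - alpha^(1/n))/lam, and similarly for U with alpha -> 1 - alpha
L = -math.log(-math.expm1(math.log(alpha) / n)) / lam
U = -math.log(-math.expm1(math.log(1 - alpha) / n)) / lam
width = U - L  # should be near (ln ln 20 - ln ln(20/19))/lam = 4.0673.../lam

# Sample Y by inverse CDF: G(y) = (1 - e^{-lam*y})^n  =>  y = -ln(1 - v^(1/n))/lam
def sample_max():
    v = random.random()
    return -math.log(-math.expm1(math.log(v) / n)) / lam

hits = sum(L < sample_max() < U for _ in range(trials))
print(width, hits / trials)  # width ~ 4.07, coverage ~ 0.90
```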

2024 Paper 3 Q12
D: 1500.0 B: 1500.0

  1. A point is chosen at random in the square \(0 \leqslant x \leqslant 1\), \(0 \leqslant y \leqslant 1\), so that the probability that a point lies in any region is equal to the area of that region. \(R\) is the random variable giving the distance of the point from the origin. Show that the cumulative distribution function of \(R\) is given by \[\mathrm{P}(R \leqslant r) = \sqrt{r^2 - 1} + \tfrac{1}{4}\pi r^2 - r^2 \cos^{-1}(r^{-1}),\] when \(1 \leqslant r \leqslant \sqrt{2}\). What is the cumulative distribution function when \(0 \leqslant r \leqslant 1\)?
  2. Show that \(\displaystyle\mathrm{E}(R) = \frac{2}{3}\int_1^{\sqrt{2}} \frac{r^2}{\sqrt{r^2-1}}\,\mathrm{d}r\).
  3. Show further that \(\mathrm{E}(R) = \frac{1}{3}\Bigl(\sqrt{2} + \ln\bigl(\sqrt{2}+1\bigr)\Bigr)\).
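[No written solution for this one yet, but the closed form in the final part is easy to sanity-check by Monte Carlo. A rough sketch (sample size arbitrary):]

```python
import math
import random

random.seed(1)
N = 400_000

# Monte Carlo estimate of E(R) for a uniformly random point in the unit square
total = 0.0
for _ in range(N):
    total += math.hypot(random.random(), random.random())
est = total / N

exact = (math.sqrt(2) + math.log(math.sqrt(2) + 1)) / 3  # = 0.76520...
print(est, exact)
```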

2023 Paper 2 Q12
D: 1500.0 B: 1500.0

Each of the independent random variables \(X_1, X_2, \ldots, X_n\) has the probability density function \(\mathrm{f}(x) = \frac{1}{2}\sin x\) for \(0 \leqslant x \leqslant \pi\) (and zero otherwise). Let \(Y\) be the random variable whose value is the maximum of the values of \(X_1, X_2, \ldots, X_n\).

  1. Explain why \(\mathrm{P}(Y \leqslant t) = \big[\mathrm{P}(X_1 \leqslant t)\big]^n\) and hence, or otherwise, find the probability density function of \(Y\).
Let \(m(n)\) be the median of \(Y\) and \(\mu(n)\) be the mean of \(Y\).
  2. Find an expression for \(m(n)\) in terms of \(n\). How does \(m(n)\) change as \(n\) increases?
  3. Show that \[\mu(n) = \pi - \frac{1}{2^n}\int_0^{\pi} (1-\cos x)^n\,\mathrm{d}x\,.\]
    1. Show that \(\mu(n)\) increases with \(n\).
    2. Show that \(\mu(2) < m(2)\).
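[A numerical aside, not part of the required solution: here \(\mathrm{F}(x) = \frac12(1-\cos x)\), so the median solves \(\left(\frac{1-\cos m}{2}\right)^n = \frac12\), and the claim \(\mu(2) < m(2)\) can be checked directly:]

```python
import math

n = 2
# Median: ((1 - cos m)/2)^n = 1/2  =>  cos m = 1 - 2^(1 - 1/n)
m = math.acos(1 - 2 ** (1 - 1 / n))

# Mean via the stated identity mu(n) = pi - 2^(-n) * integral of (1 - cos x)^n
steps = 100_000
h = math.pi / steps
integral = sum((1 - math.cos((j + 0.5) * h)) ** n for j in range(steps)) * h
mu = math.pi - integral / 2**n

print(mu, m)  # mu(2) = 5*pi/8 = 1.9635... is indeed below m(2) = 1.9979...
```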

2021 Paper 3 Q11
D: 1500.0 B: 1500.0

The continuous random variable \(X\) has probability density function \[ f(x) = \begin{cases} \lambda e^{-\lambda x} & \text{for } x \geqslant 0, \\ 0 & \text{otherwise,} \end{cases} \] where \(\lambda\) is a positive constant. The random variable \(Y\) is the greatest integer less than or equal to \(X\), and \(Z = X - Y\).

  1. Show that, for any non-negative integer \(n\), \[ \mathrm{P}(Y = n) = (1 - e^{-\lambda})\,e^{-n\lambda}. \]
  2. Show that \[ \mathrm{P}(Z < z) = \frac{1 - e^{-\lambda z}}{1 - e^{-\lambda}} \qquad \text{for } 0 \leqslant z \leqslant 1. \]
  3. Evaluate \(\mathrm{E}(Z)\).
  4. Obtain an expression for \[ \mathrm{P}(Y = n \text{ and } z_1 < Z < z_2), \] where \(0 \leqslant z_1 < z_2 \leqslant 1\) and \(n\) is a non-negative integer. Determine whether \(Y\) and \(Z\) are independent.


Solution:

  1. \(\,\) \begin{align*} && \mathbb{P}(Y = n) &= \mathbb{P}(X \in [n, n+1)) \\ &&&= \int_n^{n+1} \lambda e^{-\lambda x} \d x \\ &&&= \left [-e^{-\lambda x} \right]_n^{n+1} \\ &&&= e^{-\lambda n} - e^{-\lambda(n+1)} \\ &&&= e^{-\lambda n}(1- e^{-\lambda}) \end{align*}
  2. \(\,\) \begin{align*} && \mathbb{P}(Z < z) &= \sum_{n=0}^{\infty} \mathbb{P}(X \in (n, n+z)) \\ &&&= \sum_{n=0}^{\infty} \int_{n}^{n+z} \lambda e^{-\lambda x} \d x \\ &&&= \sum_{n=0}^{\infty} [-e^{-\lambda x}]_{n}^{n+z} \\ &&&= \sum_{n=0}^{\infty} (1-e^{-\lambda z})e^{-\lambda n} \\ &&&= \frac{1-e^{-\lambda z}}{1-e^{-\lambda}} \end{align*}
  3. Given the cdf of \(Z\), we see that \(f_Z(z) = \frac{\lambda e^{-\lambda z}}{1-e^{-\lambda}}\) so \begin{align*} && \E[Z] &= \int_0^1 z \frac{\lambda e^{-\lambda z}}{1-e^{-\lambda}} \d z \\ &&&= \frac{\lambda}{1-e^{-\lambda}} \int_0^1 ze^{-\lambda z} \d z \\ &&&= \frac{\lambda}{1-e^{-\lambda}} \left ( \left [-\frac{1}{\lambda} ze^{-\lambda z} \right]_0^1+\int_0^1 \frac{1}{\lambda} e^{-\lambda z} \d z \right) \\ &&&= \frac{\lambda}{1-e^{-\lambda}} \left ( -\frac{e^{-\lambda}}{\lambda} + \frac{1-e^{-\lambda}}{\lambda^2} \right) \\ &&&= \frac{1-e^{-\lambda}(1+\lambda)}{\lambda (1-e^{-\lambda})} \end{align*}
  4. \(\,\) \begin{align*} && \mathbb{P}(Y = n \text{ and }z_1 < Z < z_2)&= \mathbb{P}(X \in (n+z_1, n+z_2) ) \\ &&&= \int_{n+z_1}^{n+z_2} \lambda e^{-\lambda x} \d x \\ &&&= e^{-n\lambda}(e^{-\lambda z_1} - e^{-\lambda z_2}) \end{align*} Note that \(\mathbb{P}(z_1 < Z < z_2) = \mathbb{P}( Z < z_2) -\mathbb{P}(Z< z_1) =\frac{e^{-\lambda z_1} - e^{-\lambda z_2}}{1-e^{-\lambda}}\) Therefore \begin{align*} && \mathbb{P}(Y = n \text{ and }z_1 < Z < z_2) &= e^{-n\lambda}(e^{-\lambda z_1} - e^{-\lambda z_2}) \\ &&&= e^{-\lambda n}(1-e^{-\lambda}) \frac{e^{-\lambda z_1} - e^{-\lambda z_2}}{1-e^{-\lambda}} \\ &&&= \mathbb{P}(Y=n) \mathbb{P}(z_1 < Z < z_2) \end{align*} So they are independent, which is to be expected from the memorylessness property of the exponential distribution.
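[The closed form for \(\E(Z)\) and the independence claim can both be checked by simulation. A quick sketch with an arbitrary \(\lambda = 0.7\):]

```python
import math
import random

random.seed(2)
lam, N = 0.7, 300_000

# Closed form for E(Z) derived above
exact = (1 - math.exp(-lam) * (1 + lam)) / (lam * (1 - math.exp(-lam)))

z_sum, y0, zhalf, both = 0.0, 0, 0, 0
for _ in range(N):
    x = random.expovariate(lam)
    y, z = int(x), x - int(x)  # integer part and fractional part of X
    z_sum += z
    y0 += (y == 0)
    zhalf += (z < 0.5)
    both += (y == 0 and z < 0.5)

print(z_sum / N, exact)
# Independence: P(Y=0 and Z<1/2) should factor as P(Y=0) * P(Z<1/2)
print(both / N, (y0 / N) * (zhalf / N))
```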

2015 Paper 3 Q13
D: 1700.0 B: 1500.0

Each of the two independent random variables \(X\) and \(Y\) is uniformly distributed on the interval~\([0,1]\).

  1. By considering the lines \(x+y =\) \(\mathrm{constant}\) in the \(x\)-\(y\) plane, find the cumulative distribution function of \(X+Y\).
  2. Hence show that the probability density function \(f\) of \((X+Y)^{-1}\) is given by \[ \f(t) = \begin{cases} 2t^{-2} -t^{-3} & \text{for \( \tfrac12 \le t \le 1\)} \\ t^{-3} & \text{for \(1\le t <\infty\)}\\ 0 & \text{otherwise}. \end{cases} \] Evaluate \(\E\Big(\dfrac1{X+Y}\Big)\,\).
  3. Find the cumulative distribution function of \(Y/X\) and use this result to find the probability density function of \(\dfrac X {X+Y}\). Write down \(\E\Big( \dfrac X {X+Y}\Big)\) and verify your result by integration.


Solution:

  1. \(\mathbb{P}(X + Y \leq c) \) is the area between the \(x\)-axis, \(y\)-axis and the line \(x + y = c\). There are two non-trivial cases for this: \[\mathbb{P}(X + Y \leq c) = \begin{cases} 0 & \text{ if } c \leq 0 \\ \frac{c^2}{2} & \text{ if } 0 \leq c \leq 1 \\ 1- \frac{(2-c)^2}{2} & \text{ if } 1 \leq c \leq 2 \\ 1 & \text{ otherwise} \end{cases}\]
  2. \begin{align*} && \mathbb{P}((X + Y)^{-1} \leq t) &= 1- \mathbb{P}(X + Y \leq \frac1{t}) \\ \Rightarrow && f_{(X+Y)^{-1}}(t) &= 0 -\begin{cases} 0 & \text{ if } \frac1{t} \leq 0 \\ \frac{\d}{\d t}\frac{1}{2t^2} & \text{ if } \frac{1}{t} \leq 1 \\ \frac{\d}{\d t} \l 1- \frac{(2-\frac1t)^2}{2} \r & \text{ if } 1 \leq \frac{1}{t} \leq 2 \\ 0 & \text{ otherwise}\end{cases} \\ && &= \begin{cases} t^{-3} & \text{ if } t \geq 1 \\ (2-\frac1t)t^{-2} & \text{ if } \frac12 \leq t \leq 1\\ 0 & \text{ otherwise}\end{cases} \\ && &= \begin{cases} t^{-3} & \text{ if } t \geq 1 \\ 2t^{-2}-t^{-3} & \text{ if } \frac12 \leq t \leq 1\\ 0 & \text{ otherwise}\end{cases} \end{align*} Therefore, \begin{align*} \E \Big(\dfrac1{X+Y}\Big) &= \int_{\frac12}^{\infty} t f_{(X+Y)^{-1}}(t) \, \d t \\ &= \int_{\frac12}^{1} t f_{(X+Y)^{-1}}(t) \, \d t + \int_{1}^{\infty} t f_{(X+Y)^{-1}}(t) \d t\\ &= \int_{\frac12}^{1} \l 2t^{-1} - t^{-2} \r \, \d t + \int_{1}^{\infty} t^{-2} \d t\\ &= \left [ 2 \ln (t) + t^{-1} \right]_{\frac12}^{1} + \left [ -t^{-1} \right ]_{1}^{\infty} \\ &= 1 + 2 \ln 2 -2 + 1 \\ &= 2 \ln 2 \end{align*}
  3. \begin{align*} &&\mathbb{P} \l \frac{Y}{X} \leq c \r &= \mathbb{P}( Y \leq c X) \\ &&&= \begin{cases} 0 & \text{if } c \leq 0 \\ \frac{c}{2} & \text{if } 0 \leq c \leq 1 \\ 1-\frac{1}{2c} & \text{if } 1 \leq c \end{cases} \\ \\ \Rightarrow && \mathbb{P} \l \frac{X}{X+Y} \leq t\r &= \mathbb{P} \l \frac{1}{1+\frac{Y}{X}} \leq t\r \\ &&&= \mathbb{P} \l \frac{1}{t} \leq 1+\frac{Y}{X}\r \\ &&&= \mathbb{P} \l \frac{1}{t} - 1\leq \frac{Y}{X}\r \\ &&&= 1- \mathbb{P} \l \frac{Y}{X} \leq \frac{1}{t} - 1\r \\ &&&= 1 - \begin{cases} 0 & \text{if } t \geq 1 \\ \frac{1}{2t} - \frac{1}{2} & \text{if } \frac12 \leq t \leq 1 \\ 1-\frac{t}{2-2t} & \text{if } 0 \leq t \leq \frac12 \end{cases} \\ && f_{\frac{X}{X+Y}}(t) &= \begin{cases} \frac{1}{2(1-t)^2} & \text{if } 0 \leq t \leq \frac12 \\ \frac{1}{2t^2} & \text{if } \frac12 \leq t \leq 1 \\ 0 & \text{otherwise} \end{cases} \\ \Rightarrow && \mathbb{E} \l \frac{X}{X+Y} \r &= \int_0^1 t f(t) \d t \\ &&&= \int_0^{\frac12} \frac{t}{2(1-t)^2} \d t + \int_{\frac12}^1 \frac{1}{2t} \d t \\ &&&= \left [ \frac{1}{2(1-t)} + \frac12 \ln(1-t) \right]_0^{\frac12} + \left [ \frac12 \ln t \right]_{\frac12}^1 \\ &&&= \frac{1-\ln 2}{2} + \frac{\ln 2}{2} = \frac{1}{2} \\ \\ && \mathbb{E} \l \frac{X}{X+Y} \r &= \int_0^1 \int_0^1 \frac{x}{x+y} \d y\d x \\ &&&= \int_0^1 \l x \ln (x+1) - x \ln x \r \d x \\ &&&= \left [\frac{x^2}2 \ln(x+1) - \frac{x^2}{2} \ln(x) \right]_0^1 -\int_0^1 \l \frac{x^2}{2(x+1)} - \frac{x}{2} \r \d x \\ &&&= \frac{\ln 2}{2} + \frac{1}{4} - \int_0^1 \frac{x^2-1+1}{2(x+1)}\d x \\ &&&= \frac{\ln 2}{2} + \frac{1}{4} - \int_0^1 \frac{x -1}{2} + \frac{1}{2(x+1)}\d x \\ &&&= \frac{\ln 2}{2} + \frac{1}{4} - \frac{1}{4} + \frac{1}{2} - \frac{\ln 2}{2} \\ &&&= \frac{1}{2} \end{align*} We can also notice that \(1 = \mathbb{E} \l \frac{X+Y}{X+Y} \r = \mathbb{E} \l \frac{X}{X+Y} \r + \mathbb{E} \l \frac{Y}{X+Y} \r = 2 \mathbb{E} \l \frac{X}{X+Y} \r\) so it's clearly true as long as we can show that the integral converges.
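[Both expectations can be verified by a quick Monte Carlo sketch (sample size arbitrary):]

```python
import math
import random

random.seed(3)
N = 500_000
inv_sum = ratio_sum = 0.0
for _ in range(N):
    x, y = random.random(), random.random()
    inv_sum += 1 / (x + y)     # estimates E(1/(X+Y))
    ratio_sum += x / (x + y)   # estimates E(X/(X+Y))

print(inv_sum / N, 2 * math.log(2))  # both should be ~ 1.3863
print(ratio_sum / N)                 # should be ~ 0.5
```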

2014 Paper 2 Q12
D: 1600.0 B: 1484.8

The lifetime of a fly (measured in hours) is given by the continuous random variable \(T\) with probability density function \(f(t)\) and cumulative distribution function \(F(t)\). The hazard function, \(h(t)\), is defined, for \(F(t) < 1\), by \[ h(t) = \frac{f(t)}{1-F(t)}\,. \]

  1. Given that the fly lives to at least time \(t\), show that the probability of its dying within the following \(\delta t\) is approximately \(h (t) \, \delta t\) for small values of \(\delta t\).
  2. Find the hazard function in the case \(F(t) = t/a\) for \(0< t < a\). Sketch \(f(t)\) and \(h(t)\) in this case.
  3. The random variable \(T\) is distributed on the interval \(t > a\), where \(a>0\), and its hazard function is \(t^{-1}\). Determine the probability density function for \(T\).
  4. Show that \(h(t)\) is constant for \(t > b\) and zero otherwise if and only if \(f(t) =ke^{-k(t-b)}\) for \(t > b\), where \(k\) is a positive constant.
  5. The random variable \(T\) is distributed on the interval \(t > 0\) and its hazard function is given by \[ h(t) = \left(\frac{\lambda}{\theta^\lambda}\right)t^{\lambda-1}\,, \] where \(\lambda\) and \(\theta\) are positive constants. Find the probability density function for \(T\).


Solution:

  1. \(\,\) \begin{align*} && \mathbb{P}(T < t + \delta t \mid T > t) &= \frac{\mathbb{P}(t < T < t + \delta t)}{\mathbb{P}(T > t )} \\ &&&= \frac{\int_t^{t+\delta t} f(s) \d s}{1-F(t)} \\ &&&\approx \frac{f(t)\delta t}{1-F(t)} \\ &&&= h(t) \delta t \end{align*}
  2. If \(F(t) = t/a\) then \(f(t) = 1/a\) and \(h(t) = \frac{1/a}{1-t/a} = \frac{1}{a-t}\).
    TikZ diagram
  3. \(\,\) \begin{align*} && \frac{F'}{1-F} &= \frac{1}{t} \\ \Rightarrow && -\ln (1-F) &= \ln t + C\\ \Rightarrow && 1-F &= \frac{A}{t} \\ && F &= 1 - \frac{A}{t} \\ F(a) = 0: && F &= 1 - \frac{a}{t} \\ && f(t) &= \frac{a}{t^2} \end{align*}
  4. (\(\Rightarrow\)) \begin{align*} && \frac{F'}{1-F} &= k \\ \Rightarrow && -\ln(1-F) &= kt+C \\ \Rightarrow && 1-F &= Ae^{-kt} \\ F(b) = 0: && 1 &= Ae^{-kb} \\ \Rightarrow && 1-F &= e^{-k(t-b)}\\ \Rightarrow && f &= ke^{-k(t-b)} \\ \end{align*} (\(\Leftarrow\)) \(f(t) = ke^{-k(t-b)} \Rightarrow F(t) = 1-e^{-k(t-b)}\) and the result is clear.
  5. \(\,\) \begin{align*} && \frac{F'}{1-F} &= \left ( \frac{\lambda}{\theta^{\lambda}} \right) t^{\lambda-1} \\ \Rightarrow && -\ln(1-F) &= \left ( \frac{t}{\theta} \right)^{\lambda} +C\\ \Rightarrow && F &= 1-A\exp \left (- \left ( \frac{t}{\theta} \right)^{\lambda} \right) \\ F(0) = 0: && 0 &= 1-A \\ \Rightarrow && F &= 1 - \exp \left (- \left ( \frac{t}{\theta} \right)^{\lambda} \right) \\ \Rightarrow && f &= \lambda t^{\lambda -1} \theta^{-\lambda} \exp \left (- \left ( \frac{t}{\theta} \right)^{\lambda} \right) \end{align*}
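[The result is the Weibull distribution. As a quick numerical sketch (with arbitrary values \(\lambda = 2\), \(\theta = 1.5\)), the derived density does have hazard \(h\) and unit total mass:]

```python
import math

lam, theta = 2.0, 1.5  # arbitrary positive constants for the check

def f(t):  # derived pdf
    return lam * t ** (lam - 1) * theta ** (-lam) * math.exp(-((t / theta) ** lam))

def F(t):  # derived cdf
    return 1 - math.exp(-((t / theta) ** lam))

def h(t):  # given hazard
    return (lam / theta**lam) * t ** (lam - 1)

# f/(1 - F) should reproduce h exactly
for t in (0.3, 1.0, 2.5):
    assert abs(f(t) / (1 - F(t)) - h(t)) < 1e-9

# and f should integrate to 1 (midpoint rule; the tail beyond T is negligible)
steps, T = 200_000, 20.0
dt = T / steps
total = sum(f((j + 0.5) * dt) for j in range(steps)) * dt
print(total)  # ~ 1.0
```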

2014 Paper 3 Q12
D: 1700.0 B: 1500.0

The random variable \(X\) has probability density function \(f(x)\) (which you may assume is differentiable) and cumulative distribution function \(F(x)\) where \(-\infty < x < \infty \). The random variable \(Y\) is defined by \(Y= \e^X\). You may assume throughout this question that \(X\) and \(Y\) have unique modes.

  1. Find the median value \(y_m\) of \(Y\) in terms of the median value \(x_m\) of \(X\).
  2. Show that the probability density function of \(Y\) is \(f(\ln y)/y\), and deduce that the mode \(\lambda\) of \(Y\) satisfies \(\f'(\ln \lambda) = \f(\ln \lambda)\).
  3. Suppose now that \(X \sim {\rm N} (\mu,\sigma^2)\), so that \[ f(x) = \frac{1}{\sigma \sqrt{2\pi}\,} \e^{-(x-\mu)^2/(2\sigma^2)} \,. \] Explain why \[\frac{1}{\sigma \sqrt{2\pi}\,} \int_{-\infty}^{\infty}\e^{-(x-\mu-\sigma^2)^2/(2\sigma^2)} \d x = 1 \] and hence show that \( \E(Y) = \e ^{\mu+\frac12\sigma^2}\).
  4. Show that, when \(X \sim {\rm N} (\mu,\sigma^2)\), \[ \lambda < y_m < \E(Y)\,. \]


Solution:

  1. \begin{align*} && \frac12 &= \mathbb{P}(X \leq x_m) \\ \Leftrightarrow && \frac12 &= \mathbb{P}(e^X \leq e^{x_m} = y_m) \end{align*} Therefore the median is \(y_m = e^{x_m}\)
  2. \begin{align*} && \mathbb{P}(Y \leq y) &= \mathbb{P}(e^X \leq y) \\ &&&= \mathbb{P}(X \leq \ln y) \\ &&&= F(\ln y) \\ \Rightarrow && f_Y(y) &= f(\ln y)/y \\ \\ && f'_Y(y) &= \frac{f'(\ln y) - f(\ln y)}{y^2} \end{align*} Therefore since the mode satisfies \(f'_Y = 0\) we must have \(f'(\ln \lambda ) = f(\ln \lambda)\)
  3. This is the integral of the pdf of \(N(\mu + \sigma^2, \sigma^2)\) and therefore is clearly \(1\). \begin{align*} && \E[Y] &= \int_{-\infty}^{\infty} e^x \cdot \frac{1}{\sqrt{2\pi \sigma^2}} e^{-(x-\mu)^2/(2\sigma^2)} \d x \\ &&&= \frac{1}{\sqrt{2\pi \sigma^2}} \int_{-\infty}^{\infty} \exp \left ( \frac{2x \sigma^2- (x-\mu)^2}{2\sigma^2} \right ) \d x\\ &&&= \frac{1}{\sqrt{2\pi \sigma^2}} \int_{-\infty}^{\infty} \exp \left ( \frac{-(x-\mu-\sigma^2)^2+2\mu \sigma^2+\sigma^4}{2\sigma^2} \right ) \d x\\ &&&= \frac{1}{\sqrt{2\pi \sigma^2}} \int_{-\infty}^{\infty} \exp \left ( -\frac{(x-\mu-\sigma^2)^2}{2\sigma^2}+\mu +\tfrac12\sigma^2 \right ) \d x\\ &&&= \e^{\mu +\frac12\sigma^2}\cdot\frac{1}{\sqrt{2\pi \sigma^2}} \int_{-\infty}^{\infty} \exp \left (-\frac{(x-\mu-\sigma^2)^2}{2\sigma^2} \right ) \d x\\ &&&= \e^{\mu +\frac12\sigma^2} \end{align*}
  4. Notice that \(y_m = e^{x_m} = e^\mu < e^{\mu + \tfrac12 \sigma^2} = \E[Y]\), so it suffices to prove that \(\lambda < e^{\mu}\). Notice that \(f'(x) - f(x) = f(x)\left[-(x-\mu)/\sigma^2 - 1\right]\), so the mode condition \(f'(\ln \lambda) = f(\ln \lambda)\) gives \(\ln \lambda - \mu = -\sigma^2\). Therefore \(\lambda = e^{\mu - \sigma^2}\), which is clearly less than \(e^{\mu}\) as required.
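[These three quantities are the mode, median and mean of the log-normal distribution; a quick Monte Carlo sketch with arbitrary \(\mu = 0.3\), \(\sigma = 0.8\):]

```python
import math
import random

random.seed(4)
mu, sigma, N = 0.3, 0.8, 400_000

ys = sorted(math.exp(random.gauss(mu, sigma)) for _ in range(N))
mean_mc = sum(ys) / N
median_mc = ys[N // 2]
mode = math.exp(mu - sigma**2)      # lambda = e^{mu - sigma^2}
median = math.exp(mu)               # y_m = e^{mu}
mean = math.exp(mu + sigma**2 / 2)  # E(Y) = e^{mu + sigma^2/2}

print(mean_mc, mean)
print(median_mc, median)
print(mode < median_mc < mean_mc)  # the ordering lambda < y_m < E(Y)
```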

2013 Paper 3 Q13
D: 1700.0 B: 1484.0

  1. The continuous random variable \(X\) satisfies \(0\le X\le 1\), and has probability density function \(\f(x)\) and cumulative distribution function \(\F(x)\). The greatest value of \(\f(x)\) is \(M\), so that \(0\le \f(x) \le M\).
    1. Show that \(0\le \F(x) \le Mx\) for \(0\le x\le1\).
    2. For any function \(\g(x)\), show that \[ \int_0^1 2 \g(x) \F(x) \f(x) \d x = \g(1) - \int_0^1 \g'(x) \big( \F(x)\big)^2 \d x \,. \]
  2. The continuous random variable \(Y\) satisfies \(0\le Y\le 1\), and has probability density function \(k \F(y) \f(y)\), where \(\f\) and \(\F\) are as above.
    1. Determine the value of the constant \(k\).
    2. Show that \[ 1+ \frac{nM}{n+1}\mu_{n+1} - \frac{nM}{n+1} \le \E(Y^n) \le 2M\mu_{n+1}\,, \] where \(\mu_{n+1} = \E(X^{n+1})\) and \(n\ge0\).
    3. Hence show that, for \(n\ge 1\), \[ \mu _n \ge \frac{n}{(n+1)M} -\frac{n-1}{n+1} \,.\]


Solution:

    1. \(\,\) \begin{align*} && 0 &\leq f(t) &\leq M \\ \Rightarrow && \int_0^x 0 \d t &\leq \int_0^x f(t) \d t & \leq \int_0^x M \d t \\ \Rightarrow && 0 &\leq F(x) &\leq Mx \end{align*}
    2. Since \(\frac{\d}{\d x} \left( F(x)^2 \right) = 2F(x)f(x)\), integrating by parts gives \begin{align*} && \int_0^1 2g(x)F(x)f(x) \d x &= \left [ g(x) F(x)^2 \right]_0^1 - \int_0^1 g'(x) \left ( F(x)\right)^2 \d x \\ &&&= g(1) - \int_0^1 g'(x) \left ( F(x)\right)^2 \d x \end{align*} using \(F(0) = 0\) and \(F(1) = 1\).
    1. \(\,\) \begin{align*} && 1 &= \int_0^1 kF(y)f(y) \d y \\ &&&= k\left [ \frac12 F(y)^2\right]_0^1 \\ &&&= \frac{k}{2} \\ \Rightarrow && k &= 2 \end{align*}
    2. \(\,\) \begin{align*} \E[Y^n] &= \int_0^1 y^n 2F(y)f(y) \d y \\ &\leq \int_0^1 y^n 2My f(y) \d y \tag{\(F(y) \leq My\)} \\ &= 2M\int_0^1 y^{n+1} f(y) \d y \\ &= 2M \E[X^{n+1}] = 2M\mu_{n+1} \\ \\ \E[Y^n] &= \int_0^1 y^n 2F(y)f(y) \d y \\ &= 1 - \int_0^1 ny^{n-1} F(y)^2 \d y \tag{part 1.2 with \(g(y) = y^n\)} \\ &\geq 1 - \int_0^1 ny^{n-1}My F(y) \d y \\ &= 1 - M\int_0^1 ny^n F(y) \d y \\ &= 1 - M\left[\frac{n}{n+1}y^{n+1} F(y)\right]_0^1 + M\int_0^1\frac{n}{n+1} y^{n+1} f(y) \d y \\ &= 1 - \frac{nM}{n+1} + \frac{nM}{n+1} \mu_{n+1} \end{align*}
    3. Chaining the two bounds of the previous part, with \(n\) replaced by \(n-1\), gives \begin{align*} && 2M\mu_n &\geq \E[Y^{n-1}] \geq 1 + \frac{(n-1)M}{n}\mu_n - \frac{(n-1)M}{n} \\ \Rightarrow && 2M\mu_n - \frac{(n-1)M}{n}\mu_n &\geq 1 - \frac{(n-1)M}{n} \\ \Rightarrow && \mu_n \cdot \frac{(n+1)M}{n} &\geq \frac{n-(n-1)M}{n} \\ \Rightarrow && \mu_n &\geq \frac{n-(n-1)M}{(n+1)M} = \frac{n}{(n+1)M} - \frac{n-1}{n+1} \end{align*}
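[The bounds in the question can be checked against the simplest example: if \(X\) is uniform on \([0,1]\) then \(M = 1\), \(F(x) = x\), \(Y\) has density \(2y\), \(\E(Y^n) = \frac{2}{n+2}\) and \(\mu_{n+1} = \frac{1}{n+2}\). A quick sketch:]

```python
M = 1.0  # uniform X on [0,1]: f = 1, F(x) = x, so f is bounded by M = 1
checks = []
for n in range(0, 10):
    mu_next = 1 / (n + 2)  # E(X^{n+1})
    ey_n = 2 / (n + 2)     # E(Y^n) with density 2y
    lower = 1 + n * M / (n + 1) * mu_next - n * M / (n + 1)
    upper = 2 * M * mu_next
    checks.append(lower - 1e-12 <= ey_n <= upper + 1e-12)

# final part: mu_n >= n/((n+1)M) - (n-1)/(n+1); here mu_n = 1/(n+1)
for n in range(1, 10):
    checks.append(1 / (n + 1) >= n / ((n + 1) * M) - (n - 1) / (n + 1) - 1e-12)

print(all(checks))  # True: the uniform case meets every bound (with equality)
```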

2012 Paper 3 Q13
D: 1700.0 B: 1484.0

  1. The random variable \(Z\) has a Normal distribution with mean \(0\) and variance \(1\). Show that the expectation of \(Z\) given that \(a < Z < b\) is \[ \frac{\exp(- \frac12 a^2) - \exp(- \frac12 b^2) } {\sqrt{2\pi\,} \,\big(\Phi(b) - \Phi(a)\big)}, \] where \(\Phi\) denotes the cumulative distribution function for \(Z\).
  2. The random variable \(X\) has a Normal distribution with mean \(\mu\) and variance \(\sigma^2\). Show that \[ \E(X \,\vert\, X>0) = \mu + \sigma \E(Z \,\vert\,Z > -\mu/\sigma). \] Hence, or otherwise, show that the expectation, \(m\), of \(\vert X\vert \) is given by \[ m= \mu \big(1 - 2 \Phi(- \mu / \sigma)\big) + \sigma \sqrt{2 / \pi}\; \exp(- \tfrac12 \mu^2 / \sigma^2) \,. \] Obtain an expression for the variance of \(\vert X \vert\) in terms of \(\mu \), \(\sigma \) and \(m\).


Solution:

  1. \(\,\) \begin{align*} && \mathbb{E}(Z| a < Z < b) &= \mathbb{E}(Z\mathbb{1}_{(a,b)}) /\mathbb{E}(\mathbb{1}_{(a,b)}) \\ &&&= \int_a^b z \phi(z) \d z \Big / (\Phi(b) - \Phi(a)) \\ &&&= \frac{\int_a^b \frac{1}{\sqrt{2 \pi}}z e^{-\frac12 z^2} \d z}{\Phi(b) - \Phi(a)} \\ &&&= \frac{\frac1{\sqrt{2\pi}} \left [-e^{-\frac12 z^2} \right]_a^b}{\Phi(b) - \Phi(a)} \\ &&&= \frac{\frac1{\sqrt{2\pi}} \left (e^{-\frac12 a^2}-e^{-\frac12 b^2} \right)}{\Phi(b) - \Phi(a)} \\ \end{align*}
  2. \(\,\) \begin{align*} && \mathbb{E}(X |X > 0) &= \mathbb{E}(\mu + \sigma Z | \mu + \sigma Z > 0) \\ &&&= \mathbb{E}(\mu + \sigma Z | Z > -\tfrac{\mu}{\sigma}) \\ &&&= \mathbb{E}(\mu| Z > -\tfrac{\mu}{\sigma})+ \sigma \mathbb{E}(Z | Z > -\tfrac{\mu}{\sigma})\\ &&&= \mu+ \sigma \mathbb{E}(Z | Z > -\tfrac{\mu}{\sigma})\\ \end{align*} Hence \begin{align*} &&\mathbb{E}(|X|) &= \mathbb{E}(X | X > 0)\mathbb{P}(X > 0) - \mathbb{E}(X | X < 0)\mathbb{P}(X < 0) \\ &&&=\left ( \mu+ \sigma \mathbb{E}(Z | Z > -\mu /\sigma)\right)(1-\Phi(-\mu/\sigma)) - \left ( \mu+ \sigma \mathbb{E}(Z | Z < -\mu /\sigma)\right)\Phi(-\mu/\sigma) \\ &&&= \mu(1 - 2\Phi(-\mu/\sigma)) + \sigma \frac{e^{-\frac12\mu^2/\sigma^2}}{\sqrt{2\pi}(1-\Phi(-\mu/\sigma))}(1-\Phi(-\mu/\sigma)) + \sigma \frac{e^{-\frac12\mu^2/\sigma^2}}{\sqrt{2 \pi} \Phi(-\mu/\sigma)} \Phi(-\mu/\sigma) \\ &&&= \mu(1 - 2\Phi(-\mu/\sigma)) + \sigma \sqrt{\frac{2}{\pi}} \exp(-\tfrac12 \mu^2/\sigma^2) \end{align*} Finally, \begin{align*} && \textrm{Var}(|X|) &= \mathbb{E}(|X|^2) - [\mathbb{E}(|X|)]^2 \\ &&&= \mu^2 + \sigma^2 - m^2 \end{align*}
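[A Monte Carlo sketch (arbitrary \(\mu = 0.4\), \(\sigma = 1.3\)) checking both the formula for \(m\) and the variance expression:]

```python
import math
import random

random.seed(5)
mu, sigma, N = 0.4, 1.3, 400_000

def Phi(x):
    # standard normal cdf via the error function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

m = mu * (1 - 2 * Phi(-mu / sigma)) \
    + sigma * math.sqrt(2 / math.pi) * math.exp(-0.5 * mu**2 / sigma**2)

samples = [abs(random.gauss(mu, sigma)) for _ in range(N)]
mean_mc = sum(samples) / N
var_mc = sum((s - mean_mc) ** 2 for s in samples) / N

print(mean_mc, m)                       # E|X| vs the closed form
print(var_mc, mu**2 + sigma**2 - m**2)  # Var|X| = mu^2 + sigma^2 - m^2
```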

2005 Paper 2 Q14
D: 1600.0 B: 1469.5

The probability density function \(\f(x)\) of the random variable \(X\) is given by $$\f(x) = k\left[{\phi}(x) + {\lambda}\g(x)\right]$$ where \({\phi}(x)\) is the probability density function of a normal variate with mean 0 and variance 1, \(\lambda \) is a positive constant, and \(\g(x)\) is a probability density function defined by \[ \g(x)= \begin{cases} 1/\lambda & \mbox{for \(0 \le x \le {\lambda}\)}\,;\\ 0& \mbox{otherwise} . \end{cases} \] Find \(\mu\), the mean of \(X\), in terms of \(\lambda\), and prove that \(\sigma\), the standard deviation of \(X\), satisfies $$\sigma^2 = \frac{\lambda^4 +4{\lambda}^3+12{\lambda}+12} {12(1 + \lambda )^2}\;.$$ In the case \(\lambda=2\):

  1. draw a sketch of the curve \(y=\f(x)\);
  2. express the cumulative distribution function of \(X\) in terms of \(\Phi(x)\), the cumulative distribution function corresponding to \(\phi(x)\);
  3. evaluate \(\P(0 < X < \mu+2\sigma)\), given that \(\Phi (\frac 23 + \frac23 \surd7)=0.9921\).


Solution: \begin{align*} && 1 &= \int_{-\infty}^{\infty} f(x) \d x \\ &&&= k[1 + \lambda] \\ \Rightarrow && k &= \frac{1}{1+\lambda} \\ \\ && \mu &= \int_{-\infty}^\infty x f(x) \d x \\ &&&= k \int_{-\infty}^\infty x \phi(x) \d x + k \lambda \int_{-\infty}^{\infty} x g(x) \d x \\ &&&= k \cdot 0 + k \lambda \cdot \frac{\lambda}{2} \\ &&&= \frac{\lambda^2}{2(1+\lambda)} \\ \\ && \E[X^2] &= \int_{-\infty}^\infty x^2 f(x) \d x \\ &&&= k \int_{-\infty}^\infty x^2 \phi(x) \d x + k \lambda \int_{-\infty}^{\infty} x^2 g(x) \d x \\ &&&= k \cdot 1 + k \lambda \int_0^{\lambda} \frac{x^2}{\lambda} \d x \\ &&&= k + \frac{k \lambda^3}{3} \\ &&&= \frac{3+\lambda^3}{3(1+\lambda)} \\ && \var[X] &= \frac{3+\lambda^3}{3(1+\lambda)} - \frac{\lambda^4}{4(1+\lambda)^2} \\ &&& = \frac{(3+\lambda^3)4(1+\lambda) - 3\lambda^4}{12(1+\lambda)^2} \\ &&&= \frac{\lambda^4+4\lambda^3+12\lambda + 12}{12(1+\lambda)^2} \end{align*}

  1. \(\,\)
    TikZ diagram
  2. \(\,\) \begin{align*} && \mathbb{P}(X \leq x) &= \int_{-\infty}^x f(t) \d t \\ &&&= \begin{cases} \frac13 \Phi(x) & \text{if } x < 0 \\ \frac13\Phi(x) + \frac13x & \text{if } 0 \leq x \leq 2 \\ \frac13 \Phi(x) + \frac23 & \text{if } 2 < x \end{cases} \end{align*} When \(\lambda = 2\), \(\mu = \frac{4}{6} = \frac23\), \(\sigma^2 = \frac{16+32+24+12}{12 \cdot 9} = \frac{7}{9}\), so \(\mu + 2 \sigma = \frac23 + \frac{2\sqrt7}{3}>2\). Therefore \begin{align*} && \P(0 < X < \mu + 2\sigma) &= \left ( \frac13 \Phi\left (\frac{2+2\sqrt{7}}{3} \right) + \frac23 \right ) - \frac13\Phi(0) \\ &&&= \tfrac13 \cdot 0.9921 +\tfrac23 - \tfrac16 \\ &&&= 0.8307 \end{align*}
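[We can also evaluate \(\P(0 < X < \mu + 2\sigma)\) by direct numerical integration of \(f\), using the exact \(\Phi\) rather than the tabulated value \(0.9921\). A sketch:]

```python
import math

def Phi(x):
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

lam = 2.0
k = 1 / (1 + lam)
mu = lam**2 / (2 * (1 + lam))                                        # = 2/3
var = (lam**4 + 4 * lam**3 + 12 * lam + 12) / (12 * (1 + lam) ** 2)  # = 7/9
c = mu + 2 * math.sqrt(var)                                          # = (2 + 2*sqrt(7))/3

def f(x):
    phi = math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
    g = 1 / lam if 0 <= x <= lam else 0.0
    return k * (phi + lam * g)

# midpoint-rule integral of f over (0, c)
steps = 200_000
h = c / steps
p = sum(f((j + 0.5) * h) for j in range(steps)) * h

print(p, Phi(c) / 3 + 2 / 3 - 1 / 6)
```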

2004 Paper 2 Q12
D: 1600.0 B: 1516.0

Sketch the graph, for \(x \ge 0\,\), of $$ y = kx\e^{-ax^2} \;, $$ where \(a\) and \(k\) are positive constants. The random variable \(X\) has probability density function \(\f(x)\) given by \begin{equation*} \f(x)= \begin{cases} kx\e^{-ax^2} & \text{for \(0 \le x \le 1\)}\\[3pt] 0 & \text{otherwise}. \end{cases} \end{equation*} Show that \(\displaystyle k=\frac{2a}{1-\e^{-a}}\) and find the mode \(m\) in terms of \(a\,\), distinguishing between the cases \(a < \frac12\) and \(a > \frac12\,\). Find the median \(h\) in terms of \(a\), and show that \(h > m\) if \(a > -\ln\left(2\e^{-1/2} - 1\right).\) Show that \(-\ln\left(2\e^{-1/2}-1\right)> \frac12 \,\). Show also that, if \(a > -\ln\left(2\e^{-1/2} - 1\right) \,\), then $$ P(X > m \;\vert\; X < h) = {{2\e^{-1/2}-\e^{-a}-1} \over 1-\e^{-a}}\;. $$


Solution:

TikZ diagram
\begin{align*} && 1 &= \int_0^1 f(x) \d x \\ &&&= \int_0^1 kx e^{-ax^2} \d x \\ &&&= \left [-\frac{k}{2a}e^{-ax^2} \right]_0^1 \\ &&&= \frac{k(1-e^{-a})}{2a} \\ \Rightarrow && k &= \frac{2a}{1-e^{-a}} \end{align*} To find the mode, we want \(f'(x) = 0\), i.e. \begin{align*} && 0 &= f'(x) \\ &&&= -2kax^2e^{-ax^2} + k e^{-ax^2} \\ &&&= ke^{-ax^2} \left (1-2ax^2 \right)\\ \end{align*} So either \(m = \frac{1}{\sqrt{2a}}\) (if \(a > \frac12\)) or \(f(x)\) is increasing and the mode is \(m = 1\) (if \(a < \frac12\)). \begin{align*} && \frac12 &= \int_0^h f(x) \d x \\ &&&= \left [ -\frac{e^{-ax^2}}{1-e^{-a}} \right]_0^h \\ &&&= \frac{1-e^{-ah^2}}{1-e^{-a}} \\ \Rightarrow && e^{-ah^2}&= 1-\frac12(1-e^{-a}) \\ \Rightarrow && -a h^2 &= \ln \left ( \frac12(1+e^{-a}) \right) \\ \Rightarrow && h &= \sqrt{-\frac1a \ln (\tfrac12(1+e^{-a}))} \end{align*} If \(h > m\) then necessarily \(a > \frac12\) (when \(a < \frac12\) the mode is \(m = 1 > h\)), so \(m = \frac{1}{\sqrt{2a}}\) and \begin{align*} && h &> m \\ \Leftrightarrow &&\sqrt{-\frac1a \ln (\tfrac12(1+e^{-a}))} &> \frac{1}{\sqrt{2a}} \\ \Leftrightarrow && -\ln (\tfrac12(1+e^{-a})) &> \frac12 \\ \Leftrightarrow && e^{-1/2} & > \frac12(1+e^{-a}) \\ \Leftrightarrow && 2e^{-1/2}-1 &>e^{-a} \\ \Leftrightarrow && \ln(2e^{-1/2}-1) &>-a \\ \Leftrightarrow && a& > -\ln(2e^{-1/2}-1) \\ \end{align*} Note that \begin{align*} && -\ln(2e^{-1/2} - 1) &= -\ln \left (\frac{2-\sqrt{e}}{e^{1/2}} \right) \\ &&&= \frac12 -\ln(\underbrace{2 - \sqrt{e}}_{<1}) \\ &&&> \frac12 \end{align*} If \(a > -\ln(2e^{-1/2}-1)\) then \begin{align*} && \mathbb{P}(X > m | X < h) &= \frac{\mathbb{P}(m < X < h)}{\mathbb{P}(X < h)} \\ &&&= \frac{e^{-am^2}-e^{-ah^2}}{1-e^{-ah^2}} \\ &&&= \frac{e^{-a\frac{1}{2a}}-e^{\ln \left ( \frac12(1+e^{-a}) \right)}}{1-e^{\ln \left ( \frac12(1+e^{-a}) \right)}} \\ &&&= \frac{e^{-1/2}-\frac12(1+e^{-a})}{1-\frac12(1+e^{-a})} \\ &&&= \frac{2e^{-1/2}-1-e^{-a}}{1-e^{-a}} \end{align*} as required.
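[A simulation check of the conditional probability (with the arbitrary choice \(a = 2\), which exceeds \(-\ln(2\e^{-1/2}-1) \approx 1.546\)), sampling \(X\) by inverting its CDF:]

```python
import math
import random

random.seed(6)
a = 2.0  # any a > -ln(2 e^{-1/2} - 1) = 1.546... will do
m = 1 / math.sqrt(2 * a)
h = math.sqrt(-math.log(0.5 * (1 + math.exp(-a))) / a)

# invert F(x) = (1 - e^{-a x^2}) / (1 - e^{-a})
def sample():
    u = random.random()
    return math.sqrt(-math.log(1 - u * (1 - math.exp(-a))) / a)

N = 300_000
xs = [sample() for _ in range(N)]
below_h = [x for x in xs if x < h]
cond = sum(x > m for x in below_h) / len(below_h)
exact = (2 * math.exp(-0.5) - math.exp(-a) - 1) / (1 - math.exp(-a))

print(cond, exact)  # both ~ 0.0899
```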

2002 Paper 2 Q12
D: 1600.0 B: 1500.6

On \(K\) consecutive days each of \(L\) identical coins is thrown \(M\) times. For each coin, the probability of throwing a head in any one throw is \(p\) (where \(0 < p < 1\)). Show that the probability that on exactly \(k\) of these days more than \(l\) of the coins will each produce fewer than \(m\) heads can be approximated by \[ {K \choose k}q^k(1-q)^{K-k}, \] where \[ q=\Phi\left( \frac{2h-2l-1}{2\sqrt{h} }\right), \ \ \ \ \ \ h=L\Phi\left( \frac{2m-1-2Mp}{2\sqrt{ Mp(1-p)}}\right) \] and \(\Phi(\cdot)\) is the cumulative distribution function of a standard normal variate. Would you expect this approximation to be accurate in the case \(K=7\), \(k=2\), \(L=500\), \(l=4\), \(M=100\), \(m=48\) and \(p=0.6\;\)?


Solution: Let \(H_i\) be the random variable of how many heads the \(i\)th coin throws on a given day. Then \(H_i \sim B(M,p)\), and the probability that a given coin produces fewer than \(m\) heads is \(p_h = \P(H_i < m)\) Let \(C\) be the random variable the number of coins producing fewer than \(m\) heads, then \(C \sim B(L, p_h)\). The probability that more than \(l\) of the coins produce fewer than \(m\) heads is therefore \(\P(C > l)\). Finally, the probability that on exactly \(k\) days more than \(l\) of the coins will produce fewer than \(m\) heads is: \[ \binom{K}{k} \cdot \P(C > l)^k \cdot (1-\P(C > l))^{K-k} \] Let's start by assuming that all our Binomials can be approximated by a normal distribution. \(B(M,p) \approx N(Mp, Mp(1-p))\) and so: \begin{align*} p_h &= \P(H_i < m) \\ &\approx \P( \sqrt{Mp(1-p)}Z+Mp < m-\frac12) \\ &= \P \l Z < \frac{2m-2Mp-1}{2\sqrt{Mp(1-p)}} \r \\ &= \Phi\l\frac{2m-2Mp-1}{2\sqrt{Mp(1-p)}} \r \end{align*} \(B(L, p_h) \approx B \l L, \P \l Z < \frac{2m-2Mp-1}{2\sqrt{Mp(1-p)}} \r\r = B(L, \frac{h}{L}) \approx N(h, \frac{h(L-h)}{L})\) Therefore \begin{align*} \P(C > l) &= 1-\P(C \leq l) \\ &\approx 1- \P \l \sqrt{\frac{h(L-h)}{L}} Z + h \leq l+\frac12 \r \\ &= 1 - \P \l Z \leq \frac{2l-2h+1}{2\sqrt{\frac{h(L-h)}{L}}}\r \\ &= 1- \Phi\l \frac{2l-2h+1}{2\sqrt{\frac{h(L-h)}{L}}} \r \\ &= \Phi\l \frac{2h-2l-1}{2\sqrt{\frac{h(L-h)}{L}}} \r \end{align*} If we can approximate \(\sqrt{1-\frac{h}{L}}\) by \(1\) then we obtain the approximation in the question. Alternatively, \(B(L, \frac{h}{L}) \approx Po(h)\) and \(Po(h) \approx N(h,h)\) so we obtain: \begin{align*} \P(C > l) &= 1-\P(C \leq l) \\ &\approx 1 - \P(\sqrt{h} Z +h < l + \frac12) \\ &= 1 - \P \l Z < \frac{2l-2h+1}{2\sqrt{h}} \r \\ &= \Phi \l \frac{2h - 2l -1}{2\sqrt{h}}\r \end{align*} as required. [I think this is what the examiners expected]. 
Considering the case \(K=7\), \(k=2\), \(L=500\), \(l=4\), \(M=100\), \(m=48\) and \(p=0.6\), the first normal approximation depends on \(Mp\) and \(M(1-p)\) being large. They are \(60\) and \(40\) respectively, so this is likely a good approximation. The first approximation finds that \begin{align*} h &= 500 \cdot \Phi \l \frac{2 \cdot 48 - 2 \cdot 60 - 1}{2\sqrt{24}} \r \\ &= 500 \cdot \Phi \l \frac{-25}{2 \sqrt{24}} \r \\ &\approx 500 \cdot \Phi (-2.5) \\ &= 500 \cdot 0.0062 \\ &\approx 3.1 \end{align*} The second binomial approximation will be good if \(500 \cdot \frac{3.1}{500} = 3.1\) is large, but this is quite small. Therefore, we shouldn't expect this to be a good approximation. However, since \(m = 48\) is far from the mean (in a normalised sense), we might expect the percentage error to be large. [Alternatively, using what I expect is the desired approach] The approximation of \(B(L, \frac{h}{L}) \approx Po(h)\) is acceptable since \(n>50\) and \(h < 5\). The approximation of \(Po(h) \approx N(h,h)\) is not acceptable since \(h\) is small (in particular \(h < 15\)). Finally, we can compute all these values exactly using a modern calculator. \begin{array}{l|cc} & \text{correct} & \text{approx} \\ \hline p_h & 0.005760\ldots & 0.005362\ldots \\ \P(C > l) & 0.164522\ldots & 0.133319\ldots \\ \text{ans} & 0.231389\ldots & 0.182516\ldots \end{array} We can also see how the errors propagate, by doing the calculations assuming the previous steps are correct, and also including the Poisson step.
\begin{array}{lccc} & \text{correct} & \text{approx} & \text{using approx } p_h \\ \hline p_h & 0.005760\ldots & 0.005362\ldots & - \\ \P(C > l)\quad [Po(h)] & 0.164522\ldots & 0.165044\ldots & 0.134293\ldots \\ \P(C > l)\quad [N(h,h)] & 0.164522\ldots & 0.169953\ldots & 0.133319\ldots \\ \P(C > l)\quad [N(h,h(1-\frac{h}{L}))] & 0.164522\ldots & 0.169255\ldots & 0.132677\ldots \\ \text{ans} & 0.231389\ldots & 0.231389\ldots \end{array} By doing this, we discover that the largest errors actually come not from the second approximation but from the small absolute (but large relative) error in the first approximation. This is, in fact, a coincidence, as we can see by investigating the specific values being used. The first approximation looks as follows:

TikZ diagram
You might not be able to tell, but there are actually two plots on this chart. So let's zoom in on the area we are worried about:
TikZ diagram
We can see there are small differences, which can be large in percentage terms (as we found when we computed them directly).
TikZ diagram
First, we can immediately see that if we just look at the distribution of \(B(L, p_h)\) and \(B(L, p_{h_\text{approx}})\) we get quite different results, even before we do any approximations.
TikZ diagram
If we plot the probability distribution of \(B(L, p_h)\) vs \(N(Lp_h, Lp_h(1-p_h))\) we find that it is not a great approximation.
TikZ diagram
However, the CDF happens to be a very good approximation *just* for the value we care about. Very lucky, but not possible for someone sitting STEP to know at the time!
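All of the numbers in the tables above can be reproduced directly; a minimal sketch using only the Python standard library (the variable names follow the question's parameters, and the approximate chain shown is the \(N(h,h)\) row computed with the approximate \(p_h\)):

```python
from math import comb, erf, sqrt

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p), by direct summation."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def Phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

K, k, L, l, M, m, p = 7, 2, 500, 4, 100, 48, 0.6

# Exact chain (the "correct" column).
p_h = binom_cdf(m - 1, M, p)        # P(one coin shows fewer than m heads)
p_c = 1 - binom_cdf(l, L, p_h)      # P(more than l coins do so)
ans = comb(K, k) * p_c**k * (1 - p_c)**(K - k)

# Approximate chain: normal approximation for p_h, then N(h, h) for C.
p_h_approx = Phi((2*m - 2*M*p - 1) / (2 * sqrt(M*p*(1 - p))))
h = L * p_h_approx
p_c_approx = Phi((2*h - 2*l - 1) / (2 * sqrt(h)))

print(p_h, p_c, ans)            # ~0.005760, ~0.164522, ~0.231389
print(p_h_approx, p_c_approx)   # ~0.005362, ~0.133319
```

The exact values match the "correct" column of the tables, and the approximate chain reproduces the corresponding "using approx \(p_h\)" entry.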

2002 Paper 2 Q13
D: 1600.0 B: 1484.0

Let \(\F(x)\) be the cumulative distribution function of a random variable \(X\), which satisfies \(\F(a)=0\) and \(\F(b)=1\), where \(a>0\). Let \[ \G(y) = \frac{\F(y)}{2-\F(y)}\;. \] Show that \(\G(a)=0\,\), \(\G(b)=1\,\) and that \(\G'(y)\ge0\,\). Show also that \[ \frac12 \le \frac2{(2-\F(y))^2} \le 2\;. \] The random variable \(Y\) has cumulative distribution function \(\G(y)\,\). Show that \[ { \tfrac12} \,\E(X) \le \E(Y) \le 2 \E(X) \;, \] and that \[ \var(Y) \le 2\var(X) +\tfrac 74 \big(\E(X)\big)^2\;. \]


Solution: \begin{align*} && G(a) &= \frac{F(a)}{2-F(a)}\\ &&&= 0 \tag{\(F(a)= 0\)}\\ \\ && G(b) &= \frac{F(b)}{2-F(b)} \\ &&&= \frac{1}{2-1} = 1 \tag{\(F(b)=1\)}\\ \\ && G'(y) &= \frac{F'(y)(2-F(y))+F(y)F'(y)}{(2-F(y))^2} \\ &&&= \frac{2F'(y)}{(2-F(y))^2} \geq 0 \tag{\(F'(y) \geq 0\)} \end{align*} \begin{align*} && 0 \leq F(y)\leq1\\ \Leftrightarrow&& 1\leq 2-F(y) \leq 2\\ \Leftrightarrow &&1 \leq (2-F(y))^2 \leq 4\\ \Leftrightarrow && 1 \geq \frac{1}{(2-F(y))^2} \geq \frac14 \\ \Leftrightarrow && 2 \geq \frac{2}{(2-F(y))^2} \geq\frac12 \end{align*} Since \(a > 0\), only values \(y > 0\) contribute to the integrals below, so multiplying the integrand by a factor lying in \([\frac12, 2]\) scales the integral by at most those factors: \begin{align*} && \mathbb{E}(Y) &= \int_a^b y G'(y) \d y \\ &&&= \int_a^b y F'(y) \underbrace{\frac{2}{(2-F(y))^2}}_{\in [\frac12, 2]} \d y \end{align*} and hence \(\tfrac12 \E[X] \leq \E[Y] \leq 2\E[X]\). The same argument with \(y^2\) in place of \(y\) gives \(\tfrac12\E[X^2] \leq \E[Y^2] \leq 2\E[X^2]\). Since \(\E[Y] \geq \tfrac12\E[X] \geq 0\), squaring gives \(\E[Y]^2 \geq (\tfrac12\E[X])^2\), so \begin{align*} \var[Y] &= \E[Y^2]-\E[Y]^2 \\ & \leq 2 \E[X^2] - (\tfrac12\E[X])^2 \\ &= 2 \var[X] + \tfrac74(\E[X])^2 \end{align*}
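To sanity-check these bounds against a concrete distribution (my own choice of example, not part of the question): take \(X \sim U(1,2)\), so \(F(x) = x - 1\) on \([1,2]\) and \(G'(y) = \frac{2F'(y)}{(2-F(y))^2} = \frac{2}{(3-y)^2}\). A rough numerical sketch:

```python
from math import log

# X ~ U(1, 2): F(x) = x - 1, so a = 1, b = 2 and G'(y) = 2/(3 - y)^2.
# (Illustrative choice; any F with a > 0 works.)
a, b = 1.0, 2.0
EX, VarX = 1.5, 1.0 / 12.0

def integrate(f, a, b, n=100_000):
    """Midpoint Riemann sum of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

g = lambda y: 2.0 / (3.0 - y) ** 2            # G'(y)
EY = integrate(lambda y: y * g(y), a, b)       # exactly 3 - 2 ln 2
EY2 = integrate(lambda y: y * y * g(y), a, b)  # exactly 11 - 12 ln 2
VarY = EY2 - EY * EY

assert 0.5 * EX <= EY <= 2 * EX                      # first bound
assert VarY <= 2 * VarX + 1.75 * EX**2               # second bound
print(EY, 3 - 2 * log(2))  # both ~1.6137
```

Here both bounds hold comfortably; the expectation bound is reasonably tight (\(\E[Y] \approx 1.61\) against \(\tfrac12\E[X] = 0.75\) and \(2\E[X] = 3\)), while the variance bound is very loose for this example.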

1997 Paper 2 Q13
D: 1600.0 B: 1516.0

A needle of length two cm is dropped at random onto a large piece of paper ruled with parallel lines two cm apart.

  1. By considering the angle which the needle makes with the lines, find the probability that the needle crosses the nearest line given that its centre is \(x\) cm from it, where \(0 < x < 1\).
  2. Given that the centre of the needle is \(x\) cm from the nearest line and that the needle crosses that line, find the cumulative distribution function for the length of the shorter segment of the needle cut off by the line.
  3. Find the probability that the needle misses all the lines.


Solution:

  1. Suppose the needle's centre is \(x\) cm from the nearest line and that it makes an acute angle of \(\theta\) with the lines. The needle has half-length \(1\), so the perpendicular distance from its centre to each tip is \(\sin \theta\); hence it crosses the line if \(\sin \theta > x\), and otherwise it does not. Given that \(\theta \sim U(0, \frac{\pi}{2})\), we can see that \begin{align*} && \mathbb{P}(\text{needle crosses}) &= \mathbb{P}(\sin \theta > x) \\ &&&= \mathbb{P}(\theta > \sin^{-1} x) \\ &&&= 1-\frac{2\sin^{-1} x}{\pi} \end{align*}
  2. The line cuts the needle into two segments of lengths \(1 \pm \frac{x}{\sin \theta}\), so the shorter segment has length \(L = 1 - \frac{x}{\sin \theta}\); given that the needle crosses, \(\theta \sim U(\sin^{-1} x, \frac{\pi}{2})\). So, for \(0 \leq l \leq 1 - x\), \begin{align*} && F_L(l) &= \mathbb{P}(L < l) \\ &&&= \mathbb{P}\left (1 - \frac{x} {\sin \theta} < l\right) \\ &&&= \mathbb{P}\left ( \sin \theta < \frac{x}{1-l}\right) \\ &&&= \mathbb{P}\left (\theta < \sin^{-1} \frac{x}{1-l}\right) \\ &&&= \frac{ \sin^{-1} \frac{x}{1-l} - \sin^{-1} x }{\frac{\pi}{2} - \sin^{-1}x} \end{align*}
  3. The needle (with probability \(1\)) cannot hit two lines, so we need only consider the line its centre is nearest to. The distance to this line is uniform on \([0,1]\), so we want to calculate \begin{align*} && \mathbb{P}(\text{needle crosses}) &= \int_0^1 \left (1 - \frac{2\sin^{-1}x}{\pi} \right) \d x \\ &&&= 1 - \frac{2}{\pi} \int_0^1 \sin^{-1} x \d x\\ &&&= 1 - \frac{2}{\pi} \left ( \frac{\pi}{2} - 1 \right) \\ &&&= \frac{2}{\pi} \end{align*} Therefore the probability it misses is \(1 - \frac{2}{\pi}\).
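The crossing probability \(\frac{2}{\pi} \approx 0.637\) is easy to check by Monte Carlo, sampling \(x\) and \(\theta\) exactly as in the argument above (a quick sketch, not part of the original solution):

```python
import random
from math import pi, sin

random.seed(0)  # for reproducibility

N = 200_000
crossings = 0
for _ in range(N):
    x = random.uniform(0, 1)           # distance of centre to nearest line
    theta = random.uniform(0, pi / 2)  # acute angle with the lines
    if sin(theta) > x:                 # crossing condition from part 1
        crossings += 1

estimate = crossings / N
print(estimate, 2 / pi)  # estimate should be close to 0.6366...
```

With \(N = 200{,}000\) throws the standard error is about \(0.001\), so the estimate agrees with \(\frac{2}{\pi}\) to two decimal places.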

1992 Paper 3 Q15
D: 1700.0 B: 1500.0

A goat \(G\) lies in a square field \(OABC\) of side \(a\). It wanders randomly round its field, so that at any time the probability of its being in any given region is proportional to the area of this region. Write down the probability that its distance, \(R\), from \(O\) is less than \(r\) if \(0 < r\leqslant a,\) and show that if \(r\geqslant a\) the probability is \[ \left(\frac{r^{2}}{a^{2}}-1\right)^{\frac{1}{2}}+\frac{\pi r^{2}}{4a^{2}}-\frac{r^{2}}{a^{2}}\cos^{-1}\left(\frac{a}{r}\right). \] Find the median of \(R\) and probability density function of \(R\). The goat is then tethered to the corner \(O\) by a chain of length \(a\). Find the conditional probability that its distance from the fence \(OC\) is more than \(a/2\).
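The stated expression for \(\mathbb{P}(R < r)\) when \(a \leq r \leq a\sqrt{2}\) can be checked by simulation, since the goat's position is uniform on the square. A quick sketch with the illustrative values \(a = 1\), \(r = 1.2\) (my choice, not from the question):

```python
import random
from math import pi, sqrt, acos

random.seed(1)  # for reproducibility

a, r = 1.0, 1.2  # illustrative values satisfying a <= r <= a*sqrt(2)

# The expression given in the question, valid for r >= a.
formula = (sqrt(r**2 / a**2 - 1) + pi * r**2 / (4 * a**2)
           - (r**2 / a**2) * acos(a / r))

# Monte Carlo: G uniform on the square [0, a]^2 with O at the origin.
N = 200_000
hits = sum(random.uniform(0, a)**2 + random.uniform(0, a)**2 < r**2
           for _ in range(N))
estimate = hits / N
print(estimate, formula)  # both should be ~0.95
```

The empirical frequency matches the formula to within the Monte Carlo error (about \(0.001\) at this sample size).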