Problems

17 problems found

2025 Paper 3 Q11
D: 1500.0 B: 1500.0

  1. Let \(\lambda > 0\). The independent random variables \(X_1, X_2, \ldots, X_n\) all have probability density function $$f(t) = \begin{cases} \lambda e^{-\lambda t} & t \geq 0 \\ 0 & t < 0 \end{cases}$$ and cumulative distribution function \(F(x)\). The value of random variable \(Y\) is the largest of the values \(X_1, X_2, \ldots, X_n\). Show that the cumulative distribution function of \(Y\) is given, for \(y \geq 0\), by $$G(y) = (1 - e^{-\lambda y})^n$$
  2. The values \(L(\alpha)\) and \(U(\alpha)\), where \(0 < \alpha \leq \frac{1}{2}\), are such that $$P(Y < L(\alpha)) = \alpha \text{ and } P(Y > U(\alpha)) = \alpha$$ Show that $$L(\alpha) = -\frac{1}{\lambda}\ln(1 - \alpha^{1/n})$$ and write down a similar expression for \(U(\alpha)\).
  3. Use the approximation \(e^t \approx 1 + t\), for \(|t|\) small, to show that, for sufficiently large \(n\), $$\lambda L(\alpha) \approx \ln(n) - \ln\left(\ln\left(\frac{1}{\alpha}\right)\right)$$
  4. Hence show that the median of \(Y\) tends to infinity as \(n\) increases, but that the width of the interval \(U(\alpha) - L(\alpha)\) tends to a value which is independent of \(n\).
  5. You are given that, for \(|t|\) small, \(\ln(1 + t) \approx t\) and that \(e^3 \approx 20\). Show that, for sufficiently large \(n\), there is an interval of width approximately \(4\lambda^{-1}\) in which \(Y\) lies with probability \(0.9\).


Solution:

  1. Note that \(\displaystyle F(y) = \mathbb{P}(X_i < y) = \int_0^y \lambda e^{-\lambda t} \d t = 1-e^{-\lambda y}\). Notice also that \begin{align*} G(y) &= \mathbb{P}(Y < y) \\ &= \mathbb{P}(\max_i(X_i) < y) \\ &= \mathbb{P}(X_i < y \text{ for all }i) \\ &= \prod_{i=1}^n \mathbb{P}(X_i < y) \\ &= \prod_{i=1}^n (1-e^{-\lambda y})\\ &= (1-e^{-\lambda y})^n \end{align*} as required.
  2. \begin{align*} && \mathbb{P}(Y < L(\alpha)) &= \alpha \\ \Rightarrow && (1-e^{-\lambda L(\alpha)})^n &= \alpha \\ \Rightarrow && 1-e^{-\lambda L(\alpha)} &= \alpha^{\tfrac1n} \\ \Rightarrow && L(\alpha) &= -\frac{1}{\lambda}\ln \left (1-\alpha^{\tfrac1n} \right) \end{align*} Notice also: \begin{align*} && \mathbb{P}(Y > U(\alpha)) &= \alpha \\ \Rightarrow && 1 - (1-e^{-\lambda U(\alpha)})^n &= \alpha \\ \Rightarrow && U(\alpha) &= -\frac{1}{\lambda}\ln \left ( 1-(1-\alpha)^{\tfrac1n} \right) \end{align*}
  3. \begin{align*} \lambda L(\alpha) &= -\ln \left (1-\alpha^{\tfrac1n} \right) \\ &= -\ln \left (1-e^{\tfrac1n \ln \alpha} \right) \\ &\approx - \ln \left ( 1 - 1 - \frac1n \ln \alpha\right) \tag{\(e^t \approx 1 + t\)} \\ &= -\ln \left ( \frac{1}{n} \ln \frac{1}\alpha \right) \\ &= - \ln \frac{1}{n} - \ln \left ( \ln \frac{1}{\alpha} \right )\\ &= \ln n - \ln \left ( \ln \left ( \frac{1}{\alpha} \right ) \right) \end{align*} since if \(n\) is large, \(\frac{\ln \alpha}{n}\) is small.
  4. The median is the value where \(\mathbb{P}(Y < M) = \frac12\), or in other words \(L(\frac12)\), but this is \(\approx \frac{\ln n - \ln (\ln 2)}{\lambda} \to \infty\). \begin{align*} && \lambda U(\alpha) &\approx \ln n - \ln \left ( \ln \left ( \frac{1}{1-\alpha} \right ) \right) \\ \Rightarrow && \lambda(U(\alpha) - L(\alpha)) &\approx -\ln \left ( \ln \left ( \frac{1}{1-\alpha} \right ) \right)+ \ln \left ( \ln \left ( \frac{1}{\alpha} \right ) \right) \\ \Rightarrow && U(\alpha) - L(\alpha) &\to \frac{1}{\lambda} \left ( \ln \left ( \ln \left ( \frac{1}{\alpha} \right ) \right)-\ln \left ( \ln \left ( \frac{1}{1-\alpha} \right ) \right ) \right) \end{align*} which doesn't depend on \(n\).
  5. Suppose \(\alpha = \frac{1}{20}\) then \begin{align*} U(\alpha) - L(\alpha) &\approx \frac{1}{\lambda} \left (\ln \ln 20 - \ln \ln \frac{20}{19} \right) \\ &= \lambda^{-1} \left (\ln \ln 20 - \ln \ln (1 + \tfrac{1}{19}) \right) \\ &\approx \lambda^{-1} \left (\ln 3 - \ln \frac{1}{19} \right) \tag{\(\ln(1+t) \approx t\), \(\ln 20 \approx 3\)} \\ &= \lambda^{-1} \left (\ln 3 + \ln 19 \right) \\ &\approx \lambda^{-1} (1 + 3) \tag{\(\ln 3 \approx 1\), \(\ln 19 \approx \ln 20 \approx 3\)} \\ &= 4\lambda^{-1} \end{align*} so \(Y\) lies in the interval \((L(\alpha), U(\alpha))\), of width approximately \(4\lambda^{-1}\), with probability \(1 - 2\alpha = 0.9\). [Note that \(\ln \ln 20 - \ln \ln \frac{20}{19} = 4.0673\ldots\)]
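[As a further check, the quantile formulas can be verified by simulation. A rough sketch (Python; \(\lambda = 1\), \(n = 10^6\) and the sample size are arbitrary choices), sampling \(Y\) directly by inverting \(G\):]

```python
import math
import random

random.seed(0)
lam, n, trials = 1.0, 10**6, 100_000
alpha = 1 / 20

# Quantiles from the formulas derived above, computed stably via expm1:
# L = -ln(1 - alpha^(1/n))/lam, and similarly for U with alpha -> 1 - alpha
L = -math.log(-math.expm1(math.log(alpha) / n)) / lam
U = -math.log(-math.expm1(math.log(1 - alpha) / n)) / lam
width = U - L  # should be near (ln ln 20 - ln ln(20/19))/lam = 4.0673.../lam

# Sample Y by inverse CDF: G(y) = (1 - e^{-lam*y})^n  =>  y = -ln(1 - v^(1/n))/lam
def sample_max():
    v = random.random()
    return -math.log(-math.expm1(math.log(v) / n)) / lam

hits = sum(L < sample_max() < U for _ in range(trials))
print(width, hits / trials)  # width ~ 4.07, coverage ~ 0.90
```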

2024 Paper 3 Q12
D: 1500.0 B: 1500.0

  1. A point is chosen at random in the square \(0 \leqslant x \leqslant 1\), \(0 \leqslant y \leqslant 1\), so that the probability that a point lies in any region is equal to the area of that region. \(R\) is the random variable giving the distance of the point from the origin. Show that the cumulative distribution function of \(R\) is given by \[\mathrm{P}(R \leqslant r) = \sqrt{r^2 - 1} + \tfrac{1}{4}\pi r^2 - r^2 \cos^{-1}(r^{-1}),\] when \(1 \leqslant r \leqslant \sqrt{2}\). What is the cumulative distribution function when \(0 \leqslant r \leqslant 1\)?
  2. Show that \(\displaystyle\mathrm{E}(R) = \frac{2}{3}\int_1^{\sqrt{2}} \frac{r^2}{\sqrt{r^2-1}}\,\mathrm{d}r\).
  3. Show further that \(\mathrm{E}(R) = \frac{1}{3}\Bigl(\sqrt{2} + \ln\bigl(\sqrt{2}+1\bigr)\Bigr)\).
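[No written solution for this one yet, but the closed form in the final part is easy to sanity-check by Monte Carlo. A rough sketch (sample size arbitrary):]

```python
import math
import random

random.seed(1)
N = 400_000

# Monte Carlo estimate of E(R) for a uniformly random point in the unit square
total = 0.0
for _ in range(N):
    total += math.hypot(random.random(), random.random())
est = total / N

exact = (math.sqrt(2) + math.log(math.sqrt(2) + 1)) / 3  # = 0.76520...
print(est, exact)
```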

2023 Paper 2 Q12
D: 1500.0 B: 1500.0

Each of the independent random variables \(X_1, X_2, \ldots, X_n\) has the probability density function \(\mathrm{f}(x) = \frac{1}{2}\sin x\) for \(0 \leqslant x \leqslant \pi\) (and zero otherwise). Let \(Y\) be the random variable whose value is the maximum of the values of \(X_1, X_2, \ldots, X_n\).

  1. Explain why \(\mathrm{P}(Y \leqslant t) = \big[\mathrm{P}(X_1 \leqslant t)\big]^n\) and hence, or otherwise, find the probability density function of \(Y\).
Let \(m(n)\) be the median of \(Y\) and \(\mu(n)\) be the mean of \(Y\).
  2. Find an expression for \(m(n)\) in terms of \(n\). How does \(m(n)\) change as \(n\) increases?
  3. Show that \[\mu(n) = \pi - \frac{1}{2^n}\int_0^{\pi} (1-\cos x)^n\,\mathrm{d}x\,.\]
    1. Show that \(\mu(n)\) increases with \(n\).
    2. Show that \(\mu(2) < m(2)\).
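[A numerical aside, not part of the required solution: here \(\mathrm{F}(x) = \frac12(1-\cos x)\), so the median solves \(\left(\frac{1-\cos m}{2}\right)^n = \frac12\), and the claim \(\mu(2) < m(2)\) can be checked directly:]

```python
import math

n = 2
# Median: ((1 - cos m)/2)^n = 1/2  =>  cos m = 1 - 2^(1 - 1/n)
m = math.acos(1 - 2 ** (1 - 1 / n))

# Mean via the stated identity mu(n) = pi - 2^(-n) * integral of (1 - cos x)^n
steps = 100_000
h = math.pi / steps
integral = sum((1 - math.cos((j + 0.5) * h)) ** n for j in range(steps)) * h
mu = math.pi - integral / 2**n

print(mu, m)  # mu(2) = 5*pi/8 = 1.9635... is indeed below m(2) = 1.9979...
```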

2021 Paper 3 Q11
D: 1500.0 B: 1500.0

The continuous random variable \(X\) has probability density function \[ f(x) = \begin{cases} \lambda e^{-\lambda x} & \text{for } x \geqslant 0, \\ 0 & \text{otherwise,} \end{cases} \] where \(\lambda\) is a positive constant. The random variable \(Y\) is the greatest integer less than or equal to \(X\), and \(Z = X - Y\).

  1. Show that, for any non-negative integer \(n\), \[ \mathrm{P}(Y = n) = (1 - e^{-\lambda})\,e^{-n\lambda}. \]
  2. Show that \[ \mathrm{P}(Z < z) = \frac{1 - e^{-\lambda z}}{1 - e^{-\lambda}} \qquad \text{for } 0 \leqslant z \leqslant 1. \]
  3. Evaluate \(\mathrm{E}(Z)\).
  4. Obtain an expression for \[ \mathrm{P}(Y = n \text{ and } z_1 < Z < z_2), \] where \(0 \leqslant z_1 < z_2 \leqslant 1\) and \(n\) is a non-negative integer. Determine whether \(Y\) and \(Z\) are independent.


Solution:

  1. \(\,\) \begin{align*} && \mathbb{P}(Y = n) &= \mathbb{P}(X \in [n, n+1)) \\ &&&= \int_n^{n+1} \lambda e^{-\lambda x} \d x \\ &&&= \left [-e^{-\lambda x} \right]_n^{n+1} \\ &&&= e^{-\lambda n} - e^{-\lambda(n+1)} \\ &&&= e^{-\lambda n}(1- e^{-\lambda}) \end{align*}
  2. \(\,\) \begin{align*} && \mathbb{P}(Z < z) &= \sum_{n=0}^{\infty} \mathbb{P}(X \in (n, n+z)) \\ &&&= \sum_{n=0}^{\infty} \int_{n}^{n+z} \lambda e^{-\lambda x} \d x \\ &&&= \sum_{n=0}^{\infty} [-e^{-\lambda x}]_{n}^{n+z} \\ &&&= \sum_{n=0}^{\infty} (1-e^{-\lambda z})e^{-\lambda n} \\ &&&= \frac{1-e^{-\lambda z}}{1-e^{-\lambda}} \end{align*}
  3. Given the cdf of \(Z\), we see that \(f_Z(z) = \frac{\lambda e^{-\lambda z}}{1-e^{-\lambda}}\) so \begin{align*} && \E[Z] &= \int_0^1 z \frac{\lambda e^{-\lambda z}}{1-e^{-\lambda}} \d z \\ &&&= \frac{\lambda}{1-e^{-\lambda}} \int_0^1 ze^{-\lambda z} \d z \\ &&&= \frac{\lambda}{1-e^{-\lambda}} \left ( \left [-\frac{1}{\lambda} ze^{-\lambda z} \right]_0^1+\int_0^1 \frac{1}{\lambda} e^{-\lambda z} \d z \right) \\ &&&= \frac{\lambda}{1-e^{-\lambda}} \left ( -\frac{e^{-\lambda}}{\lambda} + \frac{1-e^{-\lambda}}{\lambda^2} \right) \\ &&&= \frac{1-e^{-\lambda}(1+\lambda)}{\lambda (1-e^{-\lambda})} \end{align*}
  4. \(\,\) \begin{align*} && \mathbb{P}(Y = n \text{ and }z_1 < Z < z_2)&= \mathbb{P}(X \in (n+z_1, n+z_2) ) \\ &&&= \int_{n+z_1}^{n+z_2} \lambda e^{-\lambda x} \d x \\ &&&= e^{-n\lambda}(e^{-\lambda z_1} - e^{-\lambda z_2}) \end{align*} Note that \(\mathbb{P}(z_1 < Z < z_2) = \mathbb{P}( Z < z_2) -\mathbb{P}(Z< z_1) =\frac{e^{-\lambda z_1} - e^{-\lambda z_2}}{1-e^{-\lambda}}\) Therefore \begin{align*} && \mathbb{P}(Y = n \text{ and }z_1 < Z < z_2) &= e^{-n\lambda}(e^{-\lambda z_1} - e^{-\lambda z_2}) \\ &&&= e^{-\lambda n}(1-e^{-\lambda}) \frac{e^{-\lambda z_1} - e^{-\lambda z_2}}{1-e^{-\lambda}} \\ &&&= \mathbb{P}(Y=n) \mathbb{P}(z_1 < Z < z_2) \end{align*} So they are independent, which is to be expected from the memorylessness property of the exponential distribution.
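[The closed form for \(\E(Z)\) and the independence claim can both be checked by simulation. A quick sketch with an arbitrary \(\lambda = 0.7\):]

```python
import math
import random

random.seed(2)
lam, N = 0.7, 300_000

# Closed form for E(Z) derived above
exact = (1 - math.exp(-lam) * (1 + lam)) / (lam * (1 - math.exp(-lam)))

z_sum, y0, zhalf, both = 0.0, 0, 0, 0
for _ in range(N):
    x = random.expovariate(lam)
    y, z = int(x), x - int(x)  # integer part and fractional part of X
    z_sum += z
    y0 += (y == 0)
    zhalf += (z < 0.5)
    both += (y == 0 and z < 0.5)

print(z_sum / N, exact)
# Independence: P(Y=0 and Z<1/2) should factor as P(Y=0) * P(Z<1/2)
print(both / N, (y0 / N) * (zhalf / N))
```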

2015 Paper 3 Q13
D: 1700.0 B: 1500.0

Each of the two independent random variables \(X\) and \(Y\) is uniformly distributed on the interval~\([0,1]\).

  1. By considering the lines \(x+y =\) \(\mathrm{constant}\) in the \(x\)-\(y\) plane, find the cumulative distribution function of \(X+Y\).
  2. Hence show that the probability density function \(f\) of \((X+Y)^{-1}\) is given by \[ \f(t) = \begin{cases} 2t^{-2} -t^{-3} & \text{for \( \tfrac12 \le t \le 1\)} \\ t^{-3} & \text{for \(1\le t <\infty\)}\\ 0 & \text{otherwise}. \end{cases} \] Evaluate \(\E\Big(\dfrac1{X+Y}\Big)\,\).
  3. Find the cumulative distribution function of \(Y/X\) and use this result to find the probability density function of \(\dfrac X {X+Y}\). Write down \(\E\Big( \dfrac X {X+Y}\Big)\) and verify your result by integration.


Solution:

  1. \(\mathbb{P}(X + Y \leq c) \) is the area between the \(x\)-axis, \(y\)-axis and the line \(x + y = c\). There are two non-trivial cases for this: \[\mathbb{P}(X + Y \leq c) = \begin{cases} 0 & \text{ if } c \leq 0 \\ \frac{c^2}{2} & \text{ if } 0 \leq c \leq 1 \\ 1- \frac{(2-c)^2}{2} & \text{ if } 1 \leq c \leq 2 \\ 1 & \text{ otherwise} \end{cases}\]
  2. \begin{align*} && \mathbb{P}((X + Y)^{-1} \leq t) &= 1- \mathbb{P}(X + Y \leq \frac1{t}) \\ \Rightarrow && f_{(X+Y)^{-1}}(t) &= 0 -\begin{cases} 0 & \text{ if } \frac1{t} \leq 0 \\ \frac{\d}{\d t}\frac{1}{2t^2} & \text{ if } \frac{1}{t} \leq 1 \\ \frac{\d}{\d t} \l 1- \frac{(2-\frac1t)^2}{2} \r & \text{ if } 1 \leq \frac{1}{t} \leq 2 \\ 0 & \text{ otherwise}\end{cases} \\ && &= \begin{cases} t^{-3} & \text{ if } t \geq 1 \\ (2-\frac1t)t^{-2} & \text{ if } \frac12 \leq t \leq 1\\ 0 & \text{ otherwise}\end{cases} \\ && &= \begin{cases} t^{-3} & \text{ if } t \geq 1 \\ 2t^{-2}-t^{-3} & \text{ if } \frac12 \leq t \leq 1\\ 0 & \text{ otherwise}\end{cases} \end{align*} Therefore, \begin{align*} \E \Big(\dfrac1{X+Y}\Big) &= \int_{\frac12}^{\infty} t f_{(X+Y)^{-1}}(t) \, \d t \\ &= \int_{\frac12}^{1} t f_{(X+Y)^{-1}}(t) \, \d t + \int_{1}^{\infty} t f_{(X+Y)^{-1}}(t) \d t\\ &= \int_{\frac12}^{1} \l 2t^{-1} - t^{-2} \r \, \d t + \int_{1}^{\infty} t^{-2} \d t\\ &= \left [ 2 \ln (t) + t^{-1} \right]_{\frac12}^{1} + \left [ -t^{-1} \right ]_{1}^{\infty} \\ &= 1 + 2 \ln 2 -2 + 1 \\ &= 2 \ln 2 \end{align*}
  3. \begin{align*} &&\mathbb{P} \l \frac{Y}{X} \leq c \r &= \mathbb{P}( Y \leq c X) \\ &&&= \begin{cases} 0 & \text{if } c \leq 0 \\ \frac{c}{2} & \text{if } 0 \leq c \leq 1 \\ 1-\frac{1}{2c} & \text{if } 1 \leq c \end{cases} \\ \\ \Rightarrow && \mathbb{P} \l \frac{X}{X+Y} \leq t\r &= \mathbb{P} \l \frac{1}{1+\frac{Y}{X}} \leq t\r \\ &&&= \mathbb{P} \l \frac{1}{t} \leq 1+\frac{Y}{X}\r \\ &&&= \mathbb{P} \l \frac{1}{t} - 1\leq \frac{Y}{X}\r \\ &&&= 1- \mathbb{P} \l \frac{Y}{X} \leq \frac{1}{t} - 1\r \\ &&&= 1 - \begin{cases} 0 & \text{if } t \geq 1 \\ \frac{1}{2t} - \frac{1}{2} & \text{if } \frac12 \leq t \leq 1 \\ 1-\frac{t}{2-2t} & \text{if } 0 \leq t \leq \frac12 \end{cases} \\ && f_{\frac{X}{X+Y}}(t) &= \begin{cases} \frac{1}{2(1-t)^2} & \text{if } 0 \leq t \leq \frac12 \\ \frac{1}{2t^2} & \text{if } \frac12 \leq t \leq 1 \\ 0 & \text{otherwise} \end{cases} \\ \Rightarrow && \mathbb{E} \l \frac{X}{X+Y} \r &= \int_0^1 t f(t) \d t \\ &&&= \int_0^{\frac12} \frac{t}{2(1-t)^2} \d t + \int_{\frac12}^1 \frac{1}{2t} \d t \\ &&&= \left [ \frac{1}{2(1-t)} + \frac12 \ln(1-t) \right]_0^{\frac12} + \left [ \frac12 \ln t \right]_{\frac12}^1 \\ &&&= \frac{1-\ln 2}{2} + \frac{\ln 2}{2} = \frac{1}{2} \\ \\ && \mathbb{E} \l \frac{X}{X+Y} \r &= \int_0^1 \int_0^1 \frac{x}{x+y} \d y\d x \\ &&&= \int_0^1 \l x \ln (x+1) - x \ln x \r \d x \\ &&&= \left [\frac{x^2}2 \ln(x+1) - \frac{x^2}{2} \ln(x) \right]_0^1 -\int_0^1 \l \frac{x^2}{2(x+1)} - \frac{x}{2} \r \d x \\ &&&= \frac{\ln 2}{2} + \frac{1}{4} - \int_0^1 \frac{x^2-1+1}{2(x+1)}\d x \\ &&&= \frac{\ln 2}{2} + \frac{1}{4} - \int_0^1 \frac{x -1}{2} + \frac{1}{2(x+1)}\d x \\ &&&= \frac{\ln 2}{2} + \frac{1}{4} - \frac{1}{4} + \frac{1}{2} - \frac{\ln 2}{2} \\ &&&= \frac{1}{2} \end{align*} We can also notice that \(1 = \mathbb{E} \l \frac{X+Y}{X+Y} \r = \mathbb{E} \l \frac{X}{X+Y} \r + \mathbb{E} \l \frac{Y}{X+Y} \r = 2 \mathbb{E} \l \frac{X}{X+Y} \r\) so it's clearly true as long as we can show that the integral converges.
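[Both expectations can be verified by a quick Monte Carlo sketch (sample size arbitrary):]

```python
import math
import random

random.seed(3)
N = 500_000
inv_sum = ratio_sum = 0.0
for _ in range(N):
    x, y = random.random(), random.random()
    inv_sum += 1 / (x + y)     # estimates E(1/(X+Y))
    ratio_sum += x / (x + y)   # estimates E(X/(X+Y))

print(inv_sum / N, 2 * math.log(2))  # both should be ~ 1.3863
print(ratio_sum / N)                 # should be ~ 0.5
```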

2014 Paper 2 Q12
D: 1600.0 B: 1484.8

The lifetime of a fly (measured in hours) is given by the continuous random variable \(T\) with probability density function \(f(t)\) and cumulative distribution function \(F(t)\). The hazard function, \(h(t)\), is defined, for \(F(t) < 1\), by \[ h(t) = \frac{f(t)}{1-F(t)}\,. \]

  1. Given that the fly lives to at least time \(t\), show that the probability of its dying within the following \(\delta t\) is approximately \(h (t) \, \delta t\) for small values of \(\delta t\).
  2. Find the hazard function in the case \(F(t) = t/a\) for \(0< t < a\). Sketch \(f(t)\) and \(h(t)\) in this case.
  3. The random variable \(T\) is distributed on the interval \(t > a\), where \(a>0\), and its hazard function is \(t^{-1}\). Determine the probability density function for \(T\).
  4. Show that \(h(t)\) is constant for \(t > b\) and zero otherwise if and only if \(f(t) =ke^{-k(t-b)}\) for \(t > b\), where \(k\) is a positive constant.
  5. The random variable \(T\) is distributed on the interval \(t > 0\) and its hazard function is given by \[ h(t) = \left(\frac{\lambda}{\theta^\lambda}\right)t^{\lambda-1}\,, \] where \(\lambda\) and \(\theta\) are positive constants. Find the probability density function for \(T\).


Solution:

  1. \(\,\) \begin{align*} && \mathbb{P}(T < t + \delta t \mid T > t) &= \frac{\mathbb{P}(t < T < t + \delta t)}{\mathbb{P}(T > t )} \\ &&&= \frac{\int_t^{t+\delta t} f(s) \d s}{1-F(t)} \\ &&&\approx \frac{f(t)\delta t}{1-F(t)} \\ &&&= h(t) \delta t \end{align*}
  2. If \(F(t) = t/a\) then \(f(t) = 1/a\) and \(h(t) = \frac{1/a}{1-t/a} = \frac{1}{a-t}\).
    TikZ diagram
  3. \(\,\) \begin{align*} && \frac{F'}{1-F} &= \frac{1}{t} \\ \Rightarrow && -\ln (1-F) &= \ln t + C\\ \Rightarrow && 1-F &= \frac{A}{t} \\ && F &= 1 - \frac{A}{t} \\ F(a) = 0: && F &= 1 - \frac{a}{t} \\ && f(t) &= \frac{a}{t^2} \end{align*}
  4. (\(\Rightarrow\)) \begin{align*} && \frac{F'}{1-F} &= k \\ \Rightarrow && -\ln(1-F) &= kt+C \\ \Rightarrow && 1-F &= Ae^{-kt} \\ F(b) = 0: && 1 &= Ae^{-kb} \\ \Rightarrow && 1-F &= e^{-k(t-b)}\\ \Rightarrow && f &= ke^{-k(t-b)} \\ \end{align*} (\(\Leftarrow\)) \(f(t) = ke^{-k(t-b)} \Rightarrow F(t) = 1-e^{-k(t-b)}\) and the result is clear.
  5. \(\,\) \begin{align*} && \frac{F'}{1-F} &= \left ( \frac{\lambda}{\theta^{\lambda}} \right) t^{\lambda-1} \\ \Rightarrow && -\ln(1-F) &= \left ( \frac{t}{\theta} \right)^{\lambda} +C\\ \Rightarrow && F &= 1-A\exp \left (- \left ( \frac{t}{\theta} \right)^{\lambda} \right) \\ F(0) = 0: && 0 &= 1-A \\ \Rightarrow && F &= 1 - \exp \left (- \left ( \frac{t}{\theta} \right)^{\lambda} \right) \\ \Rightarrow && f &= \lambda t^{\lambda -1} \theta^{-\lambda} \exp \left (- \left ( \frac{t}{\theta} \right)^{\lambda} \right) \end{align*}
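[The result is the Weibull distribution. As a quick numerical sketch (with arbitrary values \(\lambda = 2\), \(\theta = 1.5\)), the derived density does have hazard \(h\) and unit total mass:]

```python
import math

lam, theta = 2.0, 1.5  # arbitrary positive constants for the check

def f(t):  # derived pdf
    return lam * t ** (lam - 1) * theta ** (-lam) * math.exp(-((t / theta) ** lam))

def F(t):  # derived cdf
    return 1 - math.exp(-((t / theta) ** lam))

def h(t):  # given hazard
    return (lam / theta**lam) * t ** (lam - 1)

# f/(1 - F) should reproduce h exactly
for t in (0.3, 1.0, 2.5):
    assert abs(f(t) / (1 - F(t)) - h(t)) < 1e-9

# and f should integrate to 1 (midpoint rule; the tail beyond T is negligible)
steps, T = 200_000, 20.0
dt = T / steps
total = sum(f((j + 0.5) * dt) for j in range(steps)) * dt
print(total)  # ~ 1.0
```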

2014 Paper 3 Q12
D: 1700.0 B: 1500.0

The random variable \(X\) has probability density function \(f(x)\) (which you may assume is differentiable) and cumulative distribution function \(F(x)\) where \(-\infty < x < \infty \). The random variable \(Y\) is defined by \(Y= \e^X\). You may assume throughout this question that \(X\) and \(Y\) have unique modes.

  1. Find the median value \(y_m\) of \(Y\) in terms of the median value \(x_m\) of \(X\).
  2. Show that the probability density function of \(Y\) is \(f(\ln y)/y\), and deduce that the mode \(\lambda\) of \(Y\) satisfies \(\f'(\ln \lambda) = \f(\ln \lambda)\).
  3. Suppose now that \(X \sim {\rm N} (\mu,\sigma^2)\), so that \[ f(x) = \frac{1}{\sigma \sqrt{2\pi}\,} \e^{-(x-\mu)^2/(2\sigma^2)} \,. \] Explain why \[\frac{1}{\sigma \sqrt{2\pi}\,} \int_{-\infty}^{\infty}\e^{-(x-\mu-\sigma^2)^2/(2\sigma^2)} \d x = 1 \] and hence show that \( \E(Y) = \e ^{\mu+\frac12\sigma^2}\).
  4. Show that, when \(X \sim {\rm N} (\mu,\sigma^2)\), \[ \lambda < y_m < \E(Y)\,. \]


Solution:

  1. \begin{align*} && \frac12 &= \mathbb{P}(X \leq x_m) \\ \Leftrightarrow && \frac12 &= \mathbb{P}(e^X \leq e^{x_m} = y_m) \end{align*} Therefore the median is \(y_m = e^{x_m}\)
  2. \begin{align*} && \mathbb{P}(Y \leq y) &= \mathbb{P}(e^X \leq y) \\ &&&= \mathbb{P}(X \leq \ln y) \\ &&&= F(\ln y) \\ \Rightarrow && f_Y(y) &= f(\ln y)/y \\ \\ && f'_Y(y) &= \frac{f'(\ln y) - f(\ln y)}{y^2} \end{align*} Therefore since the mode satisfies \(f'_Y = 0\) we must have \(f'(\ln \lambda ) = f(\ln \lambda)\)
  3. This is the integral of the pdf of \(N(\mu + \sigma^2, \sigma^2)\) and therefore is clearly \(1\). \begin{align*} && \E[Y] &= \int_{-\infty}^{\infty} e^x \cdot \frac{1}{\sqrt{2\pi \sigma^2}} e^{-(x-\mu)^2/(2\sigma^2)} \d x \\ &&&= \frac{1}{\sqrt{2\pi \sigma^2}} \int_{-\infty}^{\infty} \exp \left ( \frac{2x \sigma^2- (x-\mu)^2}{2\sigma^2} \right ) \d x\\ &&&= \frac{1}{\sqrt{2\pi \sigma^2}} \int_{-\infty}^{\infty} \exp \left ( \frac{-(x-\mu-\sigma^2)^2+2\mu \sigma^2+\sigma^4}{2\sigma^2} \right ) \d x\\ &&&= \frac{1}{\sqrt{2\pi \sigma^2}} \int_{-\infty}^{\infty} \exp \left ( -\frac{(x-\mu-\sigma^2)^2}{2\sigma^2}+\mu +\tfrac12\sigma^2 \right ) \d x\\ &&&= \e^{\mu +\frac12\sigma^2}\cdot\frac{1}{\sqrt{2\pi \sigma^2}} \int_{-\infty}^{\infty} \exp \left (-\frac{(x-\mu-\sigma^2)^2}{2\sigma^2} \right ) \d x\\ &&&= \e^{\mu +\frac12\sigma^2} \end{align*}
  4. Notice that \(y_m = e^{x_m} = e^\mu < e^{\mu + \tfrac12 \sigma^2} = \E[Y]\), so it suffices to prove that \(\lambda < e^{\mu}\). Notice that \(f'(x) - f(x) = f(x)\left[-(x-\mu)/\sigma^2 - 1\right]\), so the mode condition \(f'(\ln \lambda) = f(\ln \lambda)\) gives \(\ln \lambda - \mu = -\sigma^2\). Therefore \(\lambda = e^{\mu - \sigma^2}\), which is clearly less than \(e^{\mu}\) as required.
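[These three quantities are the mode, median and mean of the log-normal distribution; a quick Monte Carlo sketch with arbitrary \(\mu = 0.3\), \(\sigma = 0.8\):]

```python
import math
import random

random.seed(4)
mu, sigma, N = 0.3, 0.8, 400_000

ys = sorted(math.exp(random.gauss(mu, sigma)) for _ in range(N))
mean_mc = sum(ys) / N
median_mc = ys[N // 2]
mode = math.exp(mu - sigma**2)      # lambda = e^{mu - sigma^2}
median = math.exp(mu)               # y_m = e^{mu}
mean = math.exp(mu + sigma**2 / 2)  # E(Y) = e^{mu + sigma^2/2}

print(mean_mc, mean)
print(median_mc, median)
print(mode < median_mc < mean_mc)  # the ordering lambda < y_m < E(Y)
```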

2013 Paper 3 Q13
D: 1700.0 B: 1484.0

  1. The continuous random variable \(X\) satisfies \(0\le X\le 1\), and has probability density function \(\f(x)\) and cumulative distribution function \(\F(x)\). The greatest value of \(\f(x)\) is \(M\), so that \(0\le \f(x) \le M\).
    1. Show that \(0\le \F(x) \le Mx\) for \(0\le x\le1\).
    2. For any function \(\g(x)\), show that \[ \int_0^1 2 \g(x) \F(x) \f(x) \d x = \g(1) - \int_0^1 \g'(x) \big( \F(x)\big)^2 \d x \,. \]
  2. The continuous random variable \(Y\) satisfies \(0\le Y\le 1\), and has probability density function \(k \F(y) \f(y)\), where \(\f\) and \(\F\) are as above.
    1. Determine the value of the constant \(k\).
    2. Show that \[ 1+ \frac{nM}{n+1}\mu_{n+1} - \frac{nM}{n+1} \le \E(Y^n) \le 2M\mu_{n+1}\,, \] where \(\mu_{n+1} = \E(X^{n+1})\) and \(n\ge0\).
    3. Hence show that, for \(n\ge 1\), \[ \mu _n \ge \frac{n}{(n+1)M} -\frac{n-1}{n+1} \,.\]


Solution:

    1. \(\,\) \begin{align*} && 0 &\leq f(t) &\leq M \\ \Rightarrow && \int_0^x 0 \d t &\leq \int_0^x f(t) \d t & \leq \int_0^x M \d t \\ \Rightarrow && 0 &\leq F(x) &\leq Mx \end{align*}
    2. Since \(\frac{\d}{\d x} \left( F(x)^2 \right) = 2F(x)f(x)\), integrating by parts gives \begin{align*} && \int_0^1 2g(x)F(x)f(x) \d x &= \left [ g(x) F(x)^2 \right]_0^1 - \int_0^1 g'(x) \left ( F(x)\right)^2 \d x \\ &&&= g(1) - \int_0^1 g'(x) \left ( F(x)\right)^2 \d x \end{align*} using \(F(0) = 0\) and \(F(1) = 1\).
    1. \(\,\) \begin{align*} && 1 &= \int_0^1 kF(y)f(y) \d y \\ &&&= k\left [ \frac12 F(y)^2\right]_0^1 \\ &&&= \frac{k}{2} \\ \Rightarrow && k &= 2 \end{align*}
    2. \(\,\) \begin{align*} \E[Y^n] &= \int_0^1 y^n 2F(y)f(y) \d y \\ &\leq \int_0^1 y^n 2My f(y) \d y \tag{\(F(y) \leq My\)} \\ &= 2M\int_0^1 y^{n+1} f(y) \d y \\ &= 2M \E[X^{n+1}] = 2M\mu_{n+1} \\ \\ \E[Y^n] &= \int_0^1 y^n 2F(y)f(y) \d y \\ &= 1 - \int_0^1 ny^{n-1} F(y)^2 \d y \tag{part 1.2 with \(g(y) = y^n\)} \\ &\geq 1 - \int_0^1 ny^{n-1}My F(y) \d y \\ &= 1 - M\int_0^1 ny^n F(y) \d y \\ &= 1 - M\left[\frac{n}{n+1}y^{n+1} F(y)\right]_0^1 + M\int_0^1\frac{n}{n+1} y^{n+1} f(y) \d y \\ &= 1 - \frac{nM}{n+1} + \frac{nM}{n+1} \mu_{n+1} \end{align*}
    3. Chaining the two bounds of the previous part, with \(n\) replaced by \(n-1\), gives \begin{align*} && 2M\mu_n &\geq \E[Y^{n-1}] \geq 1 + \frac{(n-1)M}{n}\mu_n - \frac{(n-1)M}{n} \\ \Rightarrow && 2M\mu_n - \frac{(n-1)M}{n}\mu_n &\geq 1 - \frac{(n-1)M}{n} \\ \Rightarrow && \mu_n \cdot \frac{(n+1)M}{n} &\geq \frac{n-(n-1)M}{n} \\ \Rightarrow && \mu_n &\geq \frac{n-(n-1)M}{(n+1)M} = \frac{n}{(n+1)M} - \frac{n-1}{n+1} \end{align*}
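[The bounds in the question can be checked against the simplest example: if \(X\) is uniform on \([0,1]\) then \(M = 1\), \(F(x) = x\), \(Y\) has density \(2y\), \(\E(Y^n) = \frac{2}{n+2}\) and \(\mu_{n+1} = \frac{1}{n+2}\). A quick sketch:]

```python
M = 1.0  # uniform X on [0,1]: f = 1, F(x) = x, so f is bounded by M = 1
checks = []
for n in range(0, 10):
    mu_next = 1 / (n + 2)  # E(X^{n+1})
    ey_n = 2 / (n + 2)     # E(Y^n) with density 2y
    lower = 1 + n * M / (n + 1) * mu_next - n * M / (n + 1)
    upper = 2 * M * mu_next
    checks.append(lower - 1e-12 <= ey_n <= upper + 1e-12)

# final part: mu_n >= n/((n+1)M) - (n-1)/(n+1); here mu_n = 1/(n+1)
for n in range(1, 10):
    checks.append(1 / (n + 1) >= n / ((n + 1) * M) - (n - 1) / (n + 1) - 1e-12)

print(all(checks))  # True: the uniform case meets every bound (with equality)
```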

2012 Paper 3 Q13
D: 1700.0 B: 1484.0

  1. The random variable \(Z\) has a Normal distribution with mean \(0\) and variance \(1\). Show that the expectation of \(Z\) given that \(a < Z < b\) is \[ \frac{\exp(- \frac12 a^2) - \exp(- \frac12 b^2) } {\sqrt{2\pi\,} \,\big(\Phi(b) - \Phi(a)\big)}, \] where \(\Phi\) denotes the cumulative distribution function for \(Z\).
  2. The random variable \(X\) has a Normal distribution with mean \(\mu\) and variance \(\sigma^2\). Show that \[ \E(X \,\vert\, X>0) = \mu + \sigma \E(Z \,\vert\,Z > -\mu/\sigma). \] Hence, or otherwise, show that the expectation, \(m\), of \(\vert X\vert \) is given by \[ m= \mu \big(1 - 2 \Phi(- \mu / \sigma)\big) + \sigma \sqrt{2 / \pi}\; \exp(- \tfrac12 \mu^2 / \sigma^2) \,. \] Obtain an expression for the variance of \(\vert X \vert\) in terms of \(\mu \), \(\sigma \) and \(m\).


Solution:

  1. \(\,\) \begin{align*} && \mathbb{E}(Z| a < Z < b) &= \mathbb{E}(Z\mathbb{1}_{(a,b)}) /\mathbb{E}(\mathbb{1}_{(a,b)}) \\ &&&= \int_a^b z \phi(z) \d z \Big / (\Phi(b) - \Phi(a)) \\ &&&= \frac{\int_a^b \frac{1}{\sqrt{2 \pi}}z e^{-\frac12 z^2} \d z}{\Phi(b) - \Phi(a)} \\ &&&= \frac{\frac1{\sqrt{2\pi}} \left [-e^{-\frac12 z^2} \right]_a^b}{\Phi(b) - \Phi(a)} \\ &&&= \frac{\frac1{\sqrt{2\pi}} \left (e^{-\frac12 a^2}-e^{-\frac12 b^2} \right)}{\Phi(b) - \Phi(a)} \\ \end{align*}
  2. \(\,\) \begin{align*} && \mathbb{E}(X |X > 0) &= \mathbb{E}(\mu + \sigma Z | \mu + \sigma Z > 0) \\ &&&= \mathbb{E}(\mu + \sigma Z | Z > -\tfrac{\mu}{\sigma}) \\ &&&= \mathbb{E}(\mu| Z > -\tfrac{\mu}{\sigma})+ \sigma \mathbb{E}(Z | Z > -\tfrac{\mu}{\sigma})\\ &&&= \mu+ \sigma \mathbb{E}(Z | Z > -\tfrac{\mu}{\sigma})\\ \end{align*} Hence \begin{align*} &&\mathbb{E}(|X|) &= \mathbb{E}(X | X > 0)\mathbb{P}(X > 0) - \mathbb{E}(X | X < 0)\mathbb{P}(X < 0) \\ &&&=\left ( \mu+ \sigma \mathbb{E}(Z | Z > -\mu /\sigma)\right)(1-\Phi(-\mu/\sigma)) - \left ( \mu+ \sigma \mathbb{E}(Z | Z < -\mu /\sigma)\right)\Phi(-\mu/\sigma) \\ &&&= \mu(1 - 2\Phi(-\mu/\sigma)) + \sigma \frac{e^{-\frac12\mu^2/\sigma^2}}{\sqrt{2\pi}(1-\Phi(-\mu/\sigma))}(1-\Phi(-\mu/\sigma)) + \sigma \frac{e^{-\frac12\mu^2/\sigma^2}}{\sqrt{2 \pi} \Phi(-\mu/\sigma)} \Phi(-\mu/\sigma) \\ &&&= \mu(1 - 2\Phi(-\mu/\sigma)) + \sigma \sqrt{\frac{2}{\pi}} \exp(-\tfrac12 \mu^2/\sigma^2) \end{align*} Finally, \begin{align*} && \textrm{Var}(|X|) &= \mathbb{E}(|X|^2) - [\mathbb{E}(|X|)]^2 \\ &&&= \mu^2 + \sigma^2 - m^2 \end{align*}
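[A Monte Carlo sketch (arbitrary \(\mu = 0.4\), \(\sigma = 1.3\)) checking both the formula for \(m\) and the variance expression:]

```python
import math
import random

random.seed(5)
mu, sigma, N = 0.4, 1.3, 400_000

def Phi(x):
    # standard normal cdf via the error function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

m = mu * (1 - 2 * Phi(-mu / sigma)) \
    + sigma * math.sqrt(2 / math.pi) * math.exp(-0.5 * mu**2 / sigma**2)

samples = [abs(random.gauss(mu, sigma)) for _ in range(N)]
mean_mc = sum(samples) / N
var_mc = sum((s - mean_mc) ** 2 for s in samples) / N

print(mean_mc, m)                       # E|X| vs the closed form
print(var_mc, mu**2 + sigma**2 - m**2)  # Var|X| = mu^2 + sigma^2 - m^2
```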

2005 Paper 2 Q14
D: 1600.0 B: 1469.5

The probability density function \(\f(x)\) of the random variable \(X\) is given by $$\f(x) = k\left[{\phi}(x) + {\lambda}\g(x)\right]$$ where \({\phi}(x)\) is the probability density function of a normal variate with mean 0 and variance 1, \(\lambda \) is a positive constant, and \(\g(x)\) is a probability density function defined by \[ \g(x)= \begin{cases} 1/\lambda & \mbox{for \(0 \le x \le {\lambda}\)}\,;\\ 0& \mbox{otherwise} . \end{cases} \] Find \(\mu\), the mean of \(X\), in terms of \(\lambda\), and prove that \(\sigma\), the standard deviation of \(X\), satisfies $$\sigma^2 = \frac{\lambda^4 +4{\lambda}^3+12{\lambda}+12} {12(1 + \lambda )^2}\;.$$ In the case \(\lambda=2\):

  1. draw a sketch of the curve \(y=\f(x)\);
  2. express the cumulative distribution function of \(X\) in terms of \(\Phi(x)\), the cumulative distribution function corresponding to \(\phi(x)\);
  3. evaluate \(\P(0 < X < \mu+2\sigma)\), given that \(\Phi (\frac 23 + \frac23 \surd7)=0.9921\).


Solution: \begin{align*} && 1 &= \int_{-\infty}^{\infty} f(x) \d x \\ &&&= k[1 + \lambda] \\ \Rightarrow && k &= \frac{1}{1+\lambda} \\ \\ && \mu &= \int_{-\infty}^\infty x f(x) \d x \\ &&&= k \int_{-\infty}^\infty x \phi(x) \d x + k \lambda \int_{-\infty}^{\infty} x g(x) \d x \\ &&&= k \cdot 0 + k \lambda \cdot \frac{\lambda}{2} \\ &&&= \frac{\lambda^2}{2(1+\lambda)} \\ \\ && \E[X^2] &= \int_{-\infty}^\infty x^2 f(x) \d x \\ &&&= k \int_{-\infty}^\infty x^2 \phi(x) \d x + k \lambda \int_{-\infty}^{\infty} x^2 g(x) \d x \\ &&&= k \cdot 1 + k \lambda \int_0^{\lambda} \frac{x^2}{\lambda} \d x \\ &&&= k + \frac{k \lambda^3}{3} \\ &&&= \frac{3+\lambda^3}{3(1+\lambda)} \\ && \var[X] &= \frac{3+\lambda^3}{3(1+\lambda)} - \frac{\lambda^4}{4(1+\lambda)^2} \\ &&& = \frac{(3+\lambda^3)4(1+\lambda) - 3\lambda^4}{12(1+\lambda)^2} \\ &&&= \frac{\lambda^4+4\lambda^3+12\lambda + 12}{12(1+\lambda)^2} \end{align*}

  1. \(\,\)
    TikZ diagram
  2. \(\,\) \begin{align*} && \mathbb{P}(X \leq x) &= \int_{-\infty}^x f(t) \d t \\ &&&= \begin{cases} \frac13 \Phi(x) & \text{if } x < 0 \\ \frac13\Phi(x) + \frac13x & \text{if } 0 \leq x \leq 2 \\ \frac13 \Phi(x) + \frac23 & \text{if } 2 < x \end{cases} \end{align*} When \(\lambda = 2\), \(\mu = \frac{4}{6} = \frac23\), \(\sigma^2 = \frac{16+32+24+12}{12 \cdot 9} = \frac{7}{9}\), so \(\mu + 2 \sigma = \frac23 + \frac{2\sqrt7}{3}>2\). Therefore \begin{align*} && \P(0 < X < \mu + 2\sigma) &= \left ( \frac13 \Phi\left (\frac{2+2\sqrt{7}}{3} \right) + \frac23 \right ) - \frac13\Phi(0) \\ &&&= \tfrac13 \cdot 0.9921 +\tfrac23 - \tfrac16 \\ &&&= 0.8307 \end{align*}
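[We can also evaluate \(\P(0 < X < \mu + 2\sigma)\) by direct numerical integration of \(f\), using the exact \(\Phi\) rather than the tabulated value \(0.9921\). A sketch:]

```python
import math

def Phi(x):
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

lam = 2.0
k = 1 / (1 + lam)
mu = lam**2 / (2 * (1 + lam))                                        # = 2/3
var = (lam**4 + 4 * lam**3 + 12 * lam + 12) / (12 * (1 + lam) ** 2)  # = 7/9
c = mu + 2 * math.sqrt(var)                                          # = (2 + 2*sqrt(7))/3

def f(x):
    phi = math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
    g = 1 / lam if 0 <= x <= lam else 0.0
    return k * (phi + lam * g)

# midpoint-rule integral of f over (0, c)
steps = 200_000
h = c / steps
p = sum(f((j + 0.5) * h) for j in range(steps)) * h

print(p, Phi(c) / 3 + 2 / 3 - 1 / 6)
```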

2004 Paper 2 Q12
D: 1600.0 B: 1516.0

Sketch the graph, for \(x \ge 0\,\), of $$ y = kx\e^{-ax^2} \;, $$ where \(a\) and \(k\) are positive constants. The random variable \(X\) has probability density function \(\f(x)\) given by \begin{equation*} \f(x)= \begin{cases} kx\e^{-ax^2} & \text{for \(0 \le x \le 1\)}\\[3pt] 0 & \text{otherwise}. \end{cases} \end{equation*} Show that \(\displaystyle k=\frac{2a}{1-\e^{-a}}\) and find the mode \(m\) in terms of \(a\,\), distinguishing between the cases \(a < \frac12\) and \(a > \frac12\,\). Find the median \(h\) in terms of \(a\), and show that \(h > m\) if \(a > -\ln\left(2\e^{-1/2} - 1\right).\) Show that \(-\ln\left(2\e^{-1/2}-1\right)> \frac12 \,\). Show also that, if \(a > -\ln\left(2\e^{-1/2} - 1\right) \,\), then $$ P(X > m \;\vert\; X < h) = {{2\e^{-1/2}-\e^{-a}-1} \over 1-\e^{-a}}\;. $$


Solution:

TikZ diagram
\begin{align*} && 1 &= \int_0^1 f(x) \d x \\ &&&= \int_0^1 kx e^{-ax^2} \d x \\ &&&= \left [-\frac{k}{2a}e^{-ax^2} \right]_0^1 \\ &&&= \frac{k(1-e^{-a})}{2a} \\ \Rightarrow && k &= \frac{2a}{1-e^{-a}} \end{align*} To find the mode, we want \(f'(x) = 0\), i.e. \begin{align*} && 0 &= f'(x) \\ &&&= -2kax^2e^{-ax^2} + k e^{-ax^2} \\ &&&= ke^{-ax^2} \left (1-2ax^2 \right)\\ \end{align*} So either \(m = \frac{1}{\sqrt{2a}}\) (if \(a > \frac12\)) or \(f(x)\) is increasing and the mode is \(m = 1\) (if \(a < \frac12\)). \begin{align*} && \frac12 &= \int_0^h f(x) \d x \\ &&&= \left [ -\frac{e^{-ax^2}}{1-e^{-a}} \right]_0^h \\ &&&= \frac{1-e^{-ah^2}}{1-e^{-a}} \\ \Rightarrow && e^{-ah^2}&= 1-\frac12(1-e^{-a}) \\ \Rightarrow && -a h^2 &= \ln \left ( \frac12(1+e^{-a}) \right) \\ \Rightarrow && h &= \sqrt{-\frac1a \ln (\tfrac12(1+e^{-a}))} \end{align*} If \(h > m\) then necessarily \(a > \frac12\) (when \(a < \frac12\) the mode is \(m = 1 > h\)), so \(m = \frac{1}{\sqrt{2a}}\) and \begin{align*} && h &> m \\ \Leftrightarrow &&\sqrt{-\frac1a \ln (\tfrac12(1+e^{-a}))} &> \frac{1}{\sqrt{2a}} \\ \Leftrightarrow && -\ln (\tfrac12(1+e^{-a})) &> \frac12 \\ \Leftrightarrow && e^{-1/2} & > \frac12(1+e^{-a}) \\ \Leftrightarrow && 2e^{-1/2}-1 &>e^{-a} \\ \Leftrightarrow && \ln(2e^{-1/2}-1) &>-a \\ \Leftrightarrow && a& > -\ln(2e^{-1/2}-1) \\ \end{align*} Note that \begin{align*} && -\ln(2e^{-1/2} - 1) &= -\ln \left (\frac{2-\sqrt{e}}{e^{1/2}} \right) \\ &&&= \frac12 -\ln(\underbrace{2 - \sqrt{e}}_{<1}) \\ &&&> \frac12 \end{align*} If \(a > -\ln(2e^{-1/2}-1)\) then \begin{align*} && \mathbb{P}(X > m | X < h) &= \frac{\mathbb{P}(m < X < h)}{\mathbb{P}(X < h)} \\ &&&= \frac{e^{-am^2}-e^{-ah^2}}{1-e^{-ah^2}} \\ &&&= \frac{e^{-a\frac{1}{2a}}-e^{\ln \left ( \frac12(1+e^{-a}) \right)}}{1-e^{\ln \left ( \frac12(1+e^{-a}) \right)}} \\ &&&= \frac{e^{-1/2}-\frac12(1+e^{-a})}{1-\frac12(1+e^{-a})} \\ &&&= \frac{2e^{-1/2}-1-e^{-a}}{1-e^{-a}} \end{align*} as required.
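[A simulation check of the conditional probability (with the arbitrary choice \(a = 2\), which exceeds \(-\ln(2\e^{-1/2}-1) \approx 1.546\)), sampling \(X\) by inverting its CDF:]

```python
import math
import random

random.seed(6)
a = 2.0  # any a > -ln(2 e^{-1/2} - 1) = 1.546... will do
m = 1 / math.sqrt(2 * a)
h = math.sqrt(-math.log(0.5 * (1 + math.exp(-a))) / a)

# invert F(x) = (1 - e^{-a x^2}) / (1 - e^{-a})
def sample():
    u = random.random()
    return math.sqrt(-math.log(1 - u * (1 - math.exp(-a))) / a)

N = 300_000
xs = [sample() for _ in range(N)]
below_h = [x for x in xs if x < h]
cond = sum(x > m for x in below_h) / len(below_h)
exact = (2 * math.exp(-0.5) - math.exp(-a) - 1) / (1 - math.exp(-a))

print(cond, exact)  # both ~ 0.0899
```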

2002 Paper 2 Q12
D: 1600.0 B: 1500.6

On \(K\) consecutive days each of \(L\) identical coins is thrown \(M\) times. For each coin, the probability of throwing a head in any one throw is \(p\) (where \(0 < p < 1\)). Show that the probability that on exactly \(k\) of these days more than \(l\) of the coins will each produce fewer than \(m\) heads can be approximated by \[ {K \choose k}q^k(1-q)^{K-k}, \] where \[ q=\Phi\left( \frac{2h-2l-1}{2\sqrt{h} }\right), \ \ \ \ \ \ h=L\Phi\left( \frac{2m-1-2Mp}{2\sqrt{ Mp(1-p)}}\right) \] and \(\Phi(\cdot)\) is the cumulative distribution function of a standard normal variate. Would you expect this approximation to be accurate in the case \(K=7\), \(k=2\), \(L=500\), \(l=4\), \(M=100\), \(m=48\) and \(p=0.6\;\)?


Solution: Let \(H_i\) be the random variable of how many heads the \(i\)th coin throws on a given day. Then \(H_i \sim B(M,p)\), and the probability that a given coin produces fewer than \(m\) heads is \(p_h = \P(H_i < m)\) Let \(C\) be the random variable the number of coins producing fewer than \(m\) heads, then \(C \sim B(L, p_h)\). The probability that more than \(l\) of the coins produce fewer than \(m\) heads is therefore \(\P(C > l)\). Finally, the probability that on exactly \(k\) days more than \(l\) of the coins will produce fewer than \(m\) heads is: \[ \binom{K}{k} \cdot \P(C > l)^k \cdot (1-\P(C > l))^{K-k} \] Let's start by assuming that all our Binomials can be approximated by a normal distribution. \(B(M,p) \approx N(Mp, Mp(1-p))\) and so: \begin{align*} p_h &= \P(H_i < m) \\ &\approx \P( \sqrt{Mp(1-p)}Z+Mp < m-\frac12) \\ &= \P \l Z < \frac{2m-2Mp-1}{2\sqrt{Mp(1-p)}} \r \\ &= \Phi\l\frac{2m-2Mp-1}{2\sqrt{Mp(1-p)}} \r \end{align*} \(B(L, p_h) \approx B \l L, \P \l Z < \frac{2m-2Mp-1}{2\sqrt{Mp(1-p)}} \r\r = B(L, \frac{h}{L}) \approx N(h, \frac{h(L-h)}{L})\) Therefore \begin{align*} \P(C > l) &= 1-\P(C \leq l) \\ &\approx 1- \P \l \sqrt{\frac{h(L-h)}{L}} Z + h \leq l+\frac12 \r \\ &= 1 - \P \l Z \leq \frac{2l-2h+1}{2\sqrt{\frac{h(L-h)}{L}}}\r \\ &= 1- \Phi\l \frac{2l-2h+1}{2\sqrt{\frac{h(L-h)}{L}}} \r \\ &= \Phi\l \frac{2h-2l-1}{2\sqrt{\frac{h(L-h)}{L}}} \r \end{align*} If we can approximate \(\sqrt{1-\frac{h}{L}}\) by \(1\) then we obtain the approximation in the question. Alternatively, \(B(L, \frac{h}{L}) \approx Po(h)\) and \(Po(h) \approx N(h,h)\) so we obtain: \begin{align*} \P(C > l) &= 1-\P(C \leq l) \\ &\approx 1 - \P(\sqrt{h} Z +h < l + \frac12) \\ &= 1 - \P \l Z < \frac{2l-2h+1}{2\sqrt{h}} \r \\ &= \Phi \l \frac{2h - 2l -1}{2\sqrt{h}}\r \end{align*} as required. [I think this is what the examiners expected]. 
Considering the case \(K=7\), \(k=2\), \(L=500\), \(l=4\), \(M=100\), \(m=48\) and \(p=0.6\), the first normal approximation depends on \(Mp\) and \(M(1-p)\) being large. They are \(60\) and \(40\) respectively, so this is likely a good approximation. The first approximation finds that \begin{align*} h &= 500 \cdot \Phi \l \frac{2 \cdot 48 - 2 \cdot 60 - 1}{2\sqrt{24}} \r \\ &= 500 \cdot \Phi \l \frac{-25}{2 \sqrt{24}} \r \\ &\approx 500 \cdot \Phi (-2.5) \\ &= 500 \cdot 0.0062 \\ &\approx 3.1 \end{align*} The second binomial approximation will be good if \(500 \cdot \frac{3.1}{500} = 3.1\) is large, but this is quite small. Therefore, we shouldn't expect this to be a good approximation. However, since \(m = 48\) is far from the mean (in a normalised sense), we might expect the percentage error to be large. [Alternatively, using what I expect is the desired approach] The approximation of \(B(L, \frac{h}{L}) \approx Po(h)\) is acceptable since \(n>50\) and \(h < 5\). The approximation of \(Po(h) \approx N(h,h)\) is not acceptable since \(h\) is small (in particular \(h < 15\)). Finally, we can compute all these values exactly using a modern calculator. \begin{array}{l|cc} & \text{correct} & \text{approx} \\ \hline p_h & 0.005760\ldots & 0.005362\ldots \\ \P(C > l) & 0.164522\ldots & 0.133319\ldots \\ \text{ans} & 0.231389\ldots & 0.182516\ldots \end{array} We can also see how the errors propagate, by doing the calculations assuming the previous steps are correct, and also including the Poisson step.
\begin{array}{lccc} & \text{correct} & \text{approx} & \text{using approx } p_h \\ \hline p_h & 0.005760\ldots & 0.005362\ldots & - \\ \P(C > l)\quad [Po(h)] & 0.164522\ldots & 0.165044\ldots & 0.134293\ldots \\ \P(C > l)\quad [N(h,h)] & 0.164522\ldots & 0.169953\ldots & 0.133319\ldots \\ \P(C > l)\quad [N(h,h(1-\frac{h}{L}))] & 0.164522\ldots & 0.169255\ldots & 0.132677\ldots \\ \text{ans} & 0.231389\ldots & 0.231389\ldots \end{array} By doing this, we discover that the largest errors actually come not from the second approximation but from the small absolute (but large relative) error in the first approximation. This is, in fact, a coincidence, as we can see by investigating the specific values being used. The first approximation looks as follows:

TikZ diagram
You might not be able to tell, but there are actually two plots on this chart. So let's zoom in on the area we are worried about:
TikZ diagram
We can see there are small differences, which can be large in percentage terms (as we found when we computed them directly).
TikZ diagram
First, we can immediately see that if we just look at the distribution of \(B(L, p_h)\) and \(B(L, p_{h_\text{approx}})\) we get quite different results, even before we do any approximations.
TikZ diagram
If we plot the probability distribution of \(B(L, p_h)\) vs \(N(Lp_h, Lp_h(1-p_h))\) we find that it is not a great approximation.
TikZ diagram
However, the CDF happens to be a very good approximation *just* for the value we care about. Very lucky, but not possible for someone sitting STEP to know at the time!
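All of the numbers in the tables above can be reproduced directly; a minimal sketch using only the Python standard library (the variable names follow the question's parameters, and the approximate chain shown is the \(N(h,h)\) row computed with the approximate \(p_h\)):

```python
from math import comb, erf, sqrt

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p), by direct summation."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def Phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

K, k, L, l, M, m, p = 7, 2, 500, 4, 100, 48, 0.6

# Exact chain (the "correct" column).
p_h = binom_cdf(m - 1, M, p)        # P(one coin shows fewer than m heads)
p_c = 1 - binom_cdf(l, L, p_h)      # P(more than l coins do so)
ans = comb(K, k) * p_c**k * (1 - p_c)**(K - k)

# Approximate chain: normal approximation for p_h, then N(h, h) for C.
p_h_approx = Phi((2*m - 2*M*p - 1) / (2 * sqrt(M*p*(1 - p))))
h = L * p_h_approx
p_c_approx = Phi((2*h - 2*l - 1) / (2 * sqrt(h)))

print(p_h, p_c, ans)            # ~0.005760, ~0.164522, ~0.231389
print(p_h_approx, p_c_approx)   # ~0.005362, ~0.133319
```

The exact values match the "correct" column of the tables, and the approximate chain reproduces the corresponding "using approx \(p_h\)" entry.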

2002 Paper 2 Q13
D: 1600.0 B: 1484.0

Let \(\F(x)\) be the cumulative distribution function of a random variable \(X\), which satisfies \(\F(a)=0\) and \(\F(b)=1\), where \(a>0\). Let \[ \G(y) = \frac{\F(y)}{2-\F(y)}\;. \] Show that \(\G(a)=0\,\), \(\G(b)=1\,\) and that \(\G'(y)\ge0\,\). Show also that \[ \frac12 \le \frac2{(2-\F(y))^2} \le 2\;. \] The random variable \(Y\) has cumulative distribution function \(\G(y)\,\). Show that \[ { \tfrac12} \,\E(X) \le \E(Y) \le 2 \E(X) \;, \] and that \[ \var(Y) \le 2\var(X) +\tfrac 74 \big(\E(X)\big)^2\;. \]


Solution: \begin{align*} && G(a) &= \frac{F(a)}{2-F(a)}\\ &&&= 0 \tag{\(F(a)= 0\)}\\ \\ && G(b) &= \frac{F(b)}{2-F(b)} \\ &&&= \frac{1}{2-1} = 1 \tag{\(F(b)=1\)}\\ \\ && G'(y) &= \frac{F'(y)(2-F(y))+F(y)F'(y)}{(2-F(y))^2} \\ &&&= \frac{2F'(y)}{(2-F(y))^2} \geq 0 \tag{\(F'(y) \geq 0\)} \end{align*} \begin{align*} && 0 \leq F(y)\leq1\\ \Leftrightarrow&& 1\leq 2-F(y) \leq 2\\ \Leftrightarrow &&1 \leq (2-F(y))^2 \leq 4\\ \Leftrightarrow && 1 \geq \frac{1}{(2-F(y))^2} \geq \frac14 \\ \Leftrightarrow && 2 \geq \frac{2}{(2-F(y))^2} \geq\frac12 \end{align*} Since \(a > 0\), only values \(y > 0\) contribute to the integrals below, so multiplying the integrand by a factor lying in \([\frac12, 2]\) scales the integral by at most those factors: \begin{align*} && \mathbb{E}(Y) &= \int_a^b y G'(y) \d y \\ &&&= \int_a^b y F'(y) \underbrace{\frac{2}{(2-F(y))^2}}_{\in [\frac12, 2]} \d y \end{align*} and hence \(\tfrac12 \E[X] \leq \E[Y] \leq 2\E[X]\). The same argument with \(y^2\) in place of \(y\) gives \(\tfrac12\E[X^2] \leq \E[Y^2] \leq 2\E[X^2]\). Since \(\E[Y] \geq \tfrac12\E[X] \geq 0\), squaring gives \(\E[Y]^2 \geq (\tfrac12\E[X])^2\), so \begin{align*} \var[Y] &= \E[Y^2]-\E[Y]^2 \\ & \leq 2 \E[X^2] - (\tfrac12\E[X])^2 \\ &= 2 \var[X] + \tfrac74(\E[X])^2 \end{align*}
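To sanity-check these bounds against a concrete distribution (my own choice of example, not part of the question): take \(X \sim U(1,2)\), so \(F(x) = x - 1\) on \([1,2]\) and \(G'(y) = \frac{2F'(y)}{(2-F(y))^2} = \frac{2}{(3-y)^2}\). A rough numerical sketch:

```python
from math import log

# X ~ U(1, 2): F(x) = x - 1, so a = 1, b = 2 and G'(y) = 2/(3 - y)^2.
# (Illustrative choice; any F with a > 0 works.)
a, b = 1.0, 2.0
EX, VarX = 1.5, 1.0 / 12.0

def integrate(f, a, b, n=100_000):
    """Midpoint Riemann sum of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

g = lambda y: 2.0 / (3.0 - y) ** 2            # G'(y)
EY = integrate(lambda y: y * g(y), a, b)       # exactly 3 - 2 ln 2
EY2 = integrate(lambda y: y * y * g(y), a, b)  # exactly 11 - 12 ln 2
VarY = EY2 - EY * EY

assert 0.5 * EX <= EY <= 2 * EX                      # first bound
assert VarY <= 2 * VarX + 1.75 * EX**2               # second bound
print(EY, 3 - 2 * log(2))  # both ~1.6137
```

Here both bounds hold comfortably; the expectation bound is reasonably tight (\(\E[Y] \approx 1.61\) against \(\tfrac12\E[X] = 0.75\) and \(2\E[X] = 3\)), while the variance bound is very loose for this example.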

1997 Paper 2 Q13
D: 1600.0 B: 1516.0

A needle of length two cm is dropped at random onto a large piece of paper ruled with parallel lines two cm apart.

  1. By considering the angle which the needle makes with the lines, find the probability that the needle crosses the nearest line given that its centre is \(x\) cm from it, where \(0 < x < 1\).
  2. Given that the centre of the needle is \(x\) cm from the nearest line and that the needle crosses that line, find the cumulative distribution function for the length of the shorter segment of the needle cut off by the line.
  3. Find the probability that the needle misses all the lines.


Solution:

  1. Suppose the needle's centre is \(x\) cm from the nearest line and that it makes an acute angle of \(\theta\) with the lines. The needle has half-length \(1\), so the perpendicular distance from its centre to each tip is \(\sin \theta\); hence it crosses the line if \(\sin \theta > x\), and otherwise it does not. Given that \(\theta \sim U(0, \frac{\pi}{2})\), we can see that \begin{align*} && \mathbb{P}(\text{needle crosses}) &= \mathbb{P}(\sin \theta > x) \\ &&&= \mathbb{P}(\theta > \sin^{-1} x) \\ &&&= 1-\frac{2\sin^{-1} x}{\pi} \end{align*}
  2. The line cuts the needle into two segments of lengths \(1 \pm \frac{x}{\sin \theta}\), so the shorter segment has length \(L = 1 - \frac{x}{\sin \theta}\); given that the needle crosses, \(\theta \sim U(\sin^{-1} x, \frac{\pi}{2})\). So, for \(0 \leq l \leq 1 - x\), \begin{align*} && F_L(l) &= \mathbb{P}(L < l) \\ &&&= \mathbb{P}\left (1 - \frac{x} {\sin \theta} < l\right) \\ &&&= \mathbb{P}\left ( \sin \theta < \frac{x}{1-l}\right) \\ &&&= \mathbb{P}\left (\theta < \sin^{-1} \frac{x}{1-l}\right) \\ &&&= \frac{ \sin^{-1} \frac{x}{1-l} - \sin^{-1} x }{\frac{\pi}{2} - \sin^{-1}x} \end{align*}
  3. The needle (with probability \(1\)) cannot hit two lines, so we need only consider the line its centre is nearest to. The distance to this line is uniform on \([0,1]\), so we want to calculate \begin{align*} && \mathbb{P}(\text{needle crosses}) &= \int_0^1 \left (1 - \frac{2\sin^{-1}x}{\pi} \right) \d x \\ &&&= 1 - \frac{2}{\pi} \int_0^1 \sin^{-1} x \d x\\ &&&= 1 - \frac{2}{\pi} \left ( \frac{\pi}{2} - 1 \right) \\ &&&= \frac{2}{\pi} \end{align*} Therefore the probability it misses is \(1 - \frac{2}{\pi}\).
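The crossing probability \(\frac{2}{\pi} \approx 0.637\) is easy to check by Monte Carlo, sampling \(x\) and \(\theta\) exactly as in the argument above (a quick sketch, not part of the original solution):

```python
import random
from math import pi, sin

random.seed(0)  # for reproducibility

N = 200_000
crossings = 0
for _ in range(N):
    x = random.uniform(0, 1)           # distance of centre to nearest line
    theta = random.uniform(0, pi / 2)  # acute angle with the lines
    if sin(theta) > x:                 # crossing condition from part 1
        crossings += 1

estimate = crossings / N
print(estimate, 2 / pi)  # estimate should be close to 0.6366...
```

With \(N = 200{,}000\) throws the standard error is about \(0.001\), so the estimate agrees with \(\frac{2}{\pi}\) to two decimal places.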

1992 Paper 3 Q15
D: 1700.0 B: 1500.0

A goat \(G\) lies in a square field \(OABC\) of side \(a\). It wanders randomly round its field, so that at any time the probability of its being in any given region is proportional to the area of this region. Write down the probability that its distance, \(R\), from \(O\) is less than \(r\) if \(0 < r\leqslant a,\) and show that if \(r\geqslant a\) the probability is \[ \left(\frac{r^{2}}{a^{2}}-1\right)^{\frac{1}{2}}+\frac{\pi r^{2}}{4a^{2}}-\frac{r^{2}}{a^{2}}\cos^{-1}\left(\frac{a}{r}\right). \] Find the median of \(R\) and probability density function of \(R\). The goat is then tethered to the corner \(O\) by a chain of length \(a\). Find the conditional probability that its distance from the fence \(OC\) is more than \(a/2\).
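The stated expression for \(\mathbb{P}(R < r)\) when \(a \leq r \leq a\sqrt{2}\) can be checked by simulation, since the goat's position is uniform on the square. A quick sketch with the illustrative values \(a = 1\), \(r = 1.2\) (my choice, not from the question):

```python
import random
from math import pi, sqrt, acos

random.seed(1)  # for reproducibility

a, r = 1.0, 1.2  # illustrative values satisfying a <= r <= a*sqrt(2)

# The expression given in the question, valid for r >= a.
formula = (sqrt(r**2 / a**2 - 1) + pi * r**2 / (4 * a**2)
           - (r**2 / a**2) * acos(a / r))

# Monte Carlo: G uniform on the square [0, a]^2 with O at the origin.
N = 200_000
hits = sum(random.uniform(0, a)**2 + random.uniform(0, a)**2 < r**2
           for _ in range(N))
estimate = hits / N
print(estimate, formula)  # both should be ~0.95
```

The empirical frequency matches the formula to within the Monte Carlo error (about \(0.001\) at this sample size).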