Bivariate data

2018 Paper 3 Q12

A random process generates, independently, \(n\) numbers each of which is drawn from a uniform (rectangular) distribution on the interval 0 to 1. The random variable \(Y_k\) is defined to be the \(k\)th smallest number (so there are \(k-1\) smaller numbers).

  1. Show that, for \(0\le y\le1\,\), \[ {\rm P}\big(Y_k\le y\big) =\sum^{n}_{m=k}\binom{n}{m}y^{m}\left(1-y\right)^{n-m} . \tag{\(*\)} \]
  2. Show that \[ m\binom n m = n \binom {n-1}{m-1} \] and obtain a similar expression for \(\displaystyle (n-m) \, \binom n m\,\). Starting from \((*)\), show that the probability density function of \(Y_k\) is \[ n\binom{ n-1}{k-1} y^{k-1}\left(1-y\right)^{ n-k} \,.\] Deduce an expression for \(\displaystyle \int_0^1 y^{k-1}(1-y)^{n-k} \, \d y \,\).
  3. Find \(\E(Y_k) \) in terms of \(n\) and \(k\).

Solution
  1. Each of the \(n\) numbers is less than \(y\) independently with probability \(y\), so the number of values less than \(y\) is binomially distributed: \begin{align*} && \mathbb{P}(Y_k \leq y) &= \sum_{m=k}^n\mathbb{P}(\text{exactly }m \text{ values less than }y) \\ &&&= \sum_{m=k}^n \binom{n}{m} y^m(1-y)^{n-m} \end{align*}
  2. Count, from \(n\) people, the ways to choose a committee of \(m\) with a designated chair. This can be done in two ways. First: choose the committee in \(\binom{n}{m}\) ways and then the chair from its members in \(m\) ways, giving \(m \binom{n}{m}\). Alternatively, choose the chair in \(n\) ways and the remaining \(m-1\) committee members in \(\binom{n-1}{m-1}\) ways. Therefore \(m \binom{n}{m} = n \binom{n-1}{m-1}\). Similarly, \begin{align*} (n-m) \binom{n}{m} &= (n-m) \binom{n}{n-m} \\ &= n \binom{n-1}{n-m-1} \\ &= n \binom{n-1}{m} \end{align*} \begin{align*} f_{Y_k}(y) &= \frac{\d }{\d y} \l \sum^{n}_{m=k}\binom{n}{m}y^{m}\left(1-y\right)^{n-m} \r \\ &= \sum^{n}_{m=k} \l \binom{n}{m}my^{m-1}\left(1-y\right)^{n-m} -\binom{n}{m}(n-m)y^{m}\left(1-y\right)^{n-m-1} \r \\ &= \sum^{n}_{m=k} \l n \binom{n-1}{m-1}y^{m-1}\left(1-y\right)^{n-m} -n \binom{n-1}{m} y^{m}\left(1-y\right)^{n-m-1} \r \\ &= n\sum^{n}_{m=k} \binom{n-1}{m-1}y^{m-1}\left(1-y\right)^{n-m} -n\sum^{n+1}_{m=k+1} \binom{n-1}{m-1} y^{m-1}\left(1-y\right)^{n-m} \\ &= n \binom{n-1}{k-1} y^{k-1}(1-y)^{n-k} \end{align*} (In the second sum the index has been shifted up by one; its \(m=n+1\) term vanishes because \(\binom{n-1}{n}=0\), so the two sums cancel term by term, leaving only the \(m=k\) term of the first.) \begin{align*} &&1 &= \int_0^1 f_{Y_k}(y) \d y \\ &&&= \int_0^1 n \binom{n-1}{k-1} y^{k-1}(1-y)^{n-k} \d y \\ &&&= n \binom{n-1}{k-1} \int_0^1 y^{k-1}(1-y)^{n-k} \d y \\ \Rightarrow && \frac{1}{n \binom{n-1}{k-1}} &= \int_0^1 y^{k-1}(1-y)^{n-k} \d y \\ \end{align*}
  3. \begin{align*} && \mathbb{E}(Y_k) &= \int_0^1 y f_{Y_k}(y) \d y \\ &&&= \int_0^1 n \binom{n-1}{k-1} y^{k}(1-y)^{n-k} \d y \\ &&&= n \binom{n-1}{k-1}\int_0^1 y^{k}(1-y)^{n-k} \d y \\ &&&= n \binom{n-1}{k-1}\int_0^1 y^{k+1-1}(1-y)^{n+1-(k+1)} \d y \\ &&&= n \binom{n-1}{k-1} \frac{1}{(n+1) \binom{n}{k}}\\ &&&= \frac{n}{n+1} \cdot \frac{k}{n} \\ &&&= \frac{k}{n+1} \end{align*} using the deduced integral with \(n+1\) and \(k+1\) in place of \(n\) and \(k\), together with \(\binom{n}{k} = \frac{n}{k}\binom{n-1}{k-1}\).
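The closed form \(\E(Y_k) = \frac{k}{n+1}\) is easy to sanity-check by simulation. A minimal Monte Carlo sketch (the function name and parameters are ours, not part of the question):

```python
# Illustrative check, not part of the solution: estimate E(Y_k) for uniform
# order statistics by simulation and compare with k/(n+1).
import random

def mean_kth_smallest(n, k, trials=20000, seed=0):
    """Estimate E(Y_k) for n independent U(0,1) draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        sample = sorted(rng.random() for _ in range(n))
        total += sample[k - 1]   # the k-th smallest of the n draws
    return total / trials
```

For example, `mean_kth_smallest(5, 2)` should land close to \(\frac{2}{6} = \frac13\).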
2017 Paper 3 Q12

The discrete random variables \(X\) and \(Y\) can each take the values \(1\), \(\ldots\,\), \(n\) (where \(n\ge2\)). Their joint probability distribution is given by \[ \P(X=x, \ Y=y) = k(x+y) \,, \] where \(k\) is a constant.

  1. Show that \[ \P(X=x) = \dfrac{n+1+2x}{2n(n+1)}\,. \] Hence determine whether \(X\) and \(Y\) are independent.
  2. Show that the covariance of \(X\) and \(Y\) is negative.

Solution
  1. \(\,\) \begin{align*} && \mathbb{P}(X = x) &= \sum_{y=1}^n \mathbb{P}(X=x,Y=y) \\ &&&= \sum_{y=1}^n k(x+y) \\ &&&= nkx + k\frac{n(n+1)}2 \\ \\ && 1 &= \sum_{x=1}^n \mathbb{P}(X=x) \\ &&&= nk\frac{n(n+1)}{2} + kn\frac{n(n+1)}2 \\ &&&= kn^2(n+1) \\ \Rightarrow && k &= \frac{1}{n^2(n+1)} \\ \Rightarrow && \mathbb{P}(X = x) &= \frac{nx}{n^2(n+1)} + \frac{n(n+1)}{2n^2(n+1)} \\ &&&= \frac{n+1+2x}{2n(n+1)} \\ \\ && \mathbb{P}(X=x)\mathbb{P}(Y=y) &= \frac{(n+1)^2+2(n+1)(x+y)+4xy}{4n^2(n+1)^2} \\ &&&\neq \frac{x+y}{n^2(n+1)} \end{align*} Therefore \(X\) and \(Y\) are not independent.
  2. \(\,\) \begin{align*} && \E[X] &= \sum_{x=1}^n x \mathbb{P}(X=x) \\ &&&= \sum_{x=1}^n x \frac{n+1+2x}{2n(n+1)} \\ &&&= \frac{1}{2n(n+1)} \left ( (n+1) \sum x + 2\sum x^2\right)\\ &&&= \frac{1}{2n(n+1)} \left ( \frac{n(n+1)^2}{2} + \frac{n(n+1)(2n+1)}{3} \right) \\ &&&= \frac{1}{2} \left ( \frac{n+1}{2} + \frac{2n+1}{3} \right)\\ &&&= \frac{7n+5}{12} \\ \\ && \textrm{Cov}(X,Y) &= \mathbb{E}\left[XY\right] - \E[X] \E[Y] \\ &&&= \sum_{x=1}^n \sum_{y=1}^n xy \frac{x+y}{n^2(n+1)} - \E[X]^2 \\ &&&= \frac{1}{n^2(n+1)} \sum_x \sum_y (x^2 y+xy^2) - \E[X]^2 \\ &&&= \frac{2}{n^2(n+1)} \left (\sum x^2 \right )\left (\sum x\right ) - \E[X]^2 \\ &&&=\frac{(n+1)(2n+1)}{6} - \left ( \frac{7n+5}{12}\right)^2 \\ &&&= \frac1{144} \left (24(2n^2+3n+1) - (49n^2+70n+25) \right)\\ &&&= -\frac{(n-1)^2}{144} \\ &&& < 0 \end{align*} since \(n \geq 2\). (Here \(\E[Y] = \E[X]\) and \(\sum_x\sum_y x^2y = \sum_x\sum_y xy^2 = \left(\sum x^2\right)\left(\sum x\right)\) by the symmetry of the distribution in \(x\) and \(y\).)
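As an exact-arithmetic cross-check of part 2 (illustrative code, not part of the solution), the covariance can be evaluated directly from the joint pmf; it simplifies to \(-(n-1)^2/144\), which is negative for every \(n \ge 2\).

```python
# Exact check: Cov(X, Y) for the joint pmf P(X=x, Y=y) = k(x+y) on {1,...,n}^2
# should equal -(n-1)^2/144.
from fractions import Fraction

def covariance(n):
    k = Fraction(1, n * n * (n + 1))
    # E(X); by symmetry E(Y) is the same
    ex = sum(x * k * (x + y)
             for x in range(1, n + 1) for y in range(1, n + 1))
    exy = sum(x * y * k * (x + y)
              for x in range(1, n + 1) for y in range(1, n + 1))
    return exy - ex * ex

checks = [covariance(n) == -Fraction((n - 1) ** 2, 144) for n in range(2, 8)]
```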
2016 Paper 3 Q13

Given a random variable \(X\) with mean \(\mu\) and standard deviation \(\sigma\), we define the kurtosis, \(\kappa\), of \(X\) by \[ \kappa = \frac{ \E\big((X-\mu)^4\big)}{\sigma^4} -3 \,. \] Show that the random variable \(X-a\), where \(a\) is a constant, has the same kurtosis as \(X\).

  1. Show by integration that a random variable which is Normally distributed with mean 0 has kurtosis 0.
  2. Let \(Y_1, Y_2, \ldots, Y_n\) be \(n\) independent, identically distributed, random variables with mean 0, and let \(T = \sum\limits_{r=1}^n Y_r\). Show that \[ \E(T^4) = \sum_{r=1}^n \E(Y_r^4) + 6 \sum_{r=1}^{n-1} \sum_{s=r+1}^{n} \E(Y^2_s) \E(Y^2_r) \,. \]
  3. Let \(X_1\), \(X_2\), \(\ldots\)\,, \(X_n\) be \(n\) independent, identically distributed, random variables each with kurtosis \(\kappa\). Show that the kurtosis of their sum is \(\dfrac\kappa n\,\).

Solution
\begin{align*} &&\kappa_{X-a} &= \frac{\mathbb{E}\left(\left(X-a-(\mu-a)\right)^4\right)}{\sigma_{X-a}^4}-3 \\ &&&= \frac{\mathbb{E}\left(\left(X-\mu\right)^4\right)}{\sigma_X^4}-3\\ &&&= \kappa_X \end{align*}
  1. \(\,\) \begin{align*} && \kappa &= \frac{\mathbb{E}((X-\mu)^4)}{\sigma^4} - 3 \\ &&&= \frac{\mathbb{E}((\mu+\sigma Z-\mu)^4)}{\sigma^4} - 3 \\ &&&= \frac{\mathbb{E}((\sigma Z)^4)}{\sigma^4} - 3 \\ &&&= \mathbb{E}(Z^4)-3\\ &&&= \int_{-\infty}^{\infty} x^4\frac{1}{\sqrt{2\pi}} \exp \left ( - \frac12x^2 \right)\d x -3 \\ &&&= \left [\frac{1}{\sqrt{2\pi}}x^{3} \cdot \left ( -\exp \left ( - \frac12x^2 \right)\right) \right]_{-\infty}^{\infty} + \frac{1}{\sqrt{2\pi}} \int_{-\infty}^\infty 3x^2 \exp \left ( - \frac12x^2 \right) \d x - 3 \\ &&&= 0 + 3 \textrm{Var}(Z) - 3 =0 \end{align*}
  2. \(\,\) \begin{align*} && \mathbb{E}(T^4) &= \mathbb{E} \left [\left ( \sum\limits_{r=1}^n Y_r\right)^4\right] \\ &&&= \mathbb{E} \left [ \sum_{r=1}^n Y_r^4+4\sum_{i\neq j} Y_iY_j^3+6\sum_{i< j} Y_i^2Y_j^2+12\sum_{i\neq j \neq k} Y_iY_jY_k^2 +24\sum_{i\neq j\neq k \neq l} Y_iY_jY_kY_l\right] \\ &&&= \sum_{r=1}^n \mathbb{E} \left [ Y_r^4 \right]+4\sum_{i\neq j} \mathbb{E} \left [ Y_i\right]\mathbb{E}\left[Y_j^3\right]+6\sum_{i< j} \mathbb{E} \left [ Y_i^2\right]\mathbb{E}\left[Y_j^2\right]+12\sum_{i\neq j \neq k} \mathbb{E} \left [ Y_i\right]\mathbb{E}\left[Y_j\right]\mathbb{E}\left[Y_k^2\right] +24\sum_{i\neq j\neq k \neq l} \mathbb{E} \left [ Y_i\right]\mathbb{E}\left[Y_j\right]\mathbb{E}\left[Y_k\right]\mathbb{E}\left[Y_l\right] \\ &&&= \sum_{r=1}^n \mathbb{E} \left [ Y_r^4 \right]+6\sum_{i< j} \mathbb{E} \left [ Y_i^2\right]\mathbb{E}\left[Y_j^2\right] \end{align*} Here the sums run over distinct indices (the quadratic-pair sum over unordered pairs \(i<j\)); independence lets each expectation factorise, and every term containing a bare factor \(\mathbb{E}\left[Y_i\right] = 0\) vanishes.
  3. Without loss of generality, we may assume the \(X_i\) all have mean zero (subtracting a constant changes neither the kurtosis, as shown above, nor the variance). Therefore we can consider the situation as in the previous part with \(T\) and the \(Y_i\). Note that \(\mathbb{E}(Y_i^4) = \sigma^4(\kappa + 3)\) and \(\textrm{Var}(T) = n \sigma^2\). \begin{align*} && \kappa_T &= \frac{\mathbb{E}(T^4)}{(\textrm{Var}(T))^2} - 3 \\ &&&= \frac{\sum_{r=1}^n \mathbb{E} \left [ Y_r^4 \right]+6\sum_{i< j} \mathbb{E} \left [ Y_i^2\right]\mathbb{E}\left[Y_j^2\right]}{n^2\sigma^4}-3 \\ &&&= \frac{n\sigma^4(\kappa+3)+6\binom{n}{2}\sigma^4}{n^2\sigma^4} -3\\ &&&= \frac{\kappa}{n} + \frac{3n + \frac{6n(n-1)}{2}}{n^2} - 3 \\ &&&= \frac{\kappa}{n} + \frac{3n^2}{n^2}-3 \\ &&&= \frac{\kappa}{n} \end{align*}
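Part 3 can be checked with a concrete distribution. If \(Y\) takes the values \(\pm1\) with probability \(\frac12\), then \(\mu = 0\), \(\sigma^2 = 1\) and \(\kappa = \E(Y^4) - 3 = -2\), so a sum of \(n\) independent copies should have kurtosis \(-2/n\). A small exact-enumeration sketch (the \(\pm1\) set-up is our own illustration, not from the question):

```python
# Exact kurtosis of T = Y_1 + ... + Y_n, with Y_i = ±1 equally likely;
# expect -2/n by the result just proved.
from fractions import Fraction
from itertools import product

def kurtosis_of_sum(n):
    p = Fraction(1, 2 ** n)          # each sign vector equally likely
    m2 = m4 = Fraction(0)
    for signs in product((-1, 1), repeat=n):
        t = sum(signs)
        m2 += p * t ** 2             # E(T^2) = Var(T), since E(T) = 0
        m4 += p * t ** 4
    return m4 / m2 ** 2 - 3

results = [kurtosis_of_sum(n) for n in (1, 2, 3, 4)]
```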
2015 Paper 3 Q13

Each of the two independent random variables \(X\) and \(Y\) is uniformly distributed on the interval \([0,1]\).

  1. By considering the lines \(x+y =\) \(\mathrm{constant}\) in the \(x\)-\(y\) plane, find the cumulative distribution function of \(X+Y\).
  2. Hence show that the probability density function \(f\) of \((X+Y)^{-1}\) is given by \[ f(t) = \begin{cases} 2t^{-2} -t^{-3} & \text{for \( \tfrac12 \le t \le 1\)} \\ t^{-3} & \text{for \(1\le t <\infty\)}\\ 0 & \text{otherwise}. \end{cases} \] Evaluate \(\E\Big(\dfrac1{X+Y}\Big)\,\).
  3. Find the cumulative distribution function of \(Y/X\) and use this result to find the probability density function of \(\dfrac X {X+Y}\). Write down \(\E\Big( \dfrac X {X+Y}\Big)\) and verify your result by integration.

Solution
  1. For \(0 \leq c \leq 1\), \(\mathbb{P}(X + Y \leq c) \) is the area of the triangle cut off the unit square by the line \(x + y = c\); for \(1 \leq c \leq 2\) it is the complement of the corresponding triangle above the line: \[\mathbb{P}(X + Y \leq c) = \begin{cases} 0 & \text{ if } c \leq 0 \\ \frac{c^2}{2} & \text{ if } 0 \leq c \leq 1 \\ 1- \frac{(2-c)^2}{2} & \text{ if } 1 \leq c \leq 2 \\ 1 & \text{ otherwise} \end{cases}\]
  2. \begin{align*} && \mathbb{P}((X + Y)^{-1} \leq t) &= 1- \mathbb{P}(X + Y \leq \frac1{t}) \\ \Rightarrow && f_{(X+Y)^{-1}}(t) &= 0 -\begin{cases} 0 & \text{ if } \frac1{t} \leq 0 \\ \frac{\d}{\d t}\frac{1}{2t^2} & \text{ if } \frac{1}{t} \leq 1 \\ \frac{\d}{\d t} \l 1- \frac{(2-\frac1t)^2}{2} \r & \text{ if } 1 \leq \frac{1}{t} \leq 2 \\ 0 & \text{ otherwise}\end{cases} \\ && &= \begin{cases} t^{-3} & \text{ if } t \geq 1 \\ (2-\frac1t)t^{-2} & \text{ if } \frac12 \leq t \leq 1\\ 0 & \text{ otherwise}\end{cases} \\ && &= \begin{cases} t^{-3} & \text{ if } t \geq 1 \\ 2t^{-2}-t^{-3} & \text{ if } \frac12 \leq t \leq 1\\ 0 & \text{ otherwise}\end{cases} \end{align*} Therefore, \begin{align*} \E \Big(\dfrac1{X+Y}\Big) &= \int_{\frac12}^{\infty} t f_{(X+Y)^{-1}}(t) \, \d t \\ &= \int_{\frac12}^{1} t f_{(X+Y)^{-1}}(t) \, \d t + \int_{1}^{\infty} t f_{(X+Y)^{-1}}(t) \d t\\ &= \int_{\frac12}^{1} \l 2t^{-1} - t^{-2} \r \, \d t + \int_{1}^{\infty} t^{-2} \d t\\ &= \left [ 2 \ln (t) + t^{-1} \right]_{\frac12}^{1} + \left [ -t^{-1} \right ]_{1}^{\infty} \\ &= 1 + 2 \ln 2 -2 + 1 \\ &= 2 \ln 2 \end{align*}
  3. \begin{align*} &&\mathbb{P} \l \frac{Y}{X} \leq c \r &= \mathbb{P}( Y \leq c X) \\ &&&= \begin{cases} 0 & \text{if } c \leq 0 \\ \frac{c}{2} & \text{if } 0 \leq c \leq 1 \\ 1-\frac{1}{2c} & \text{if } 1 \leq c \end{cases} \\ \\ \Rightarrow && \mathbb{P} \l \frac{X}{X+Y} \leq t\r &= \mathbb{P} \l \frac{1}{1+\frac{Y}{X}} \leq t\r \\ &&&= \mathbb{P} \l \frac{1}{t} \leq 1+\frac{Y}{X}\r \\ &&&= \mathbb{P} \l \frac{1}{t} - 1\leq \frac{Y}{X}\r \\ &&&= 1- \mathbb{P} \l \frac{Y}{X} \leq \frac{1}{t} - 1\r \\ &&&= 1 - \begin{cases} 0 & \text{if } \frac1{t} - 1 \leq 0 \text{, i.e. } t \geq 1 \\ \frac{1}{2t} - \frac{1}{2} & \text{if } 0 \leq \frac1{t} - 1 \leq 1 \text{, i.e. } \frac12 \leq t \leq 1 \\ 1-\frac{t}{2-2t} & \text{if } 1 \leq \frac1{t} - 1 \text{, i.e. } 0 < t \leq \frac12 \end{cases} \\ && f_{\frac{X}{X+Y}}(t) &= \begin{cases} \frac{1}{2(1-t)^2} & \text{if } 0 \leq t \leq \frac12 \\ \frac{1}{2t^2} & \text{if } \frac12 \leq t \leq 1 \\ 0 & \text{otherwise} \end{cases} \\ \Rightarrow && \mathbb{E} \l \frac{X}{X+Y} \r &= \int_0^1 t f(t) \d t \\ &&&= \int_0^{\frac12} \frac{t}{2(1-t)^2} \d t + \int_{\frac12}^1 \frac{1}{2t} \d t \\ &&&= \left [ \frac{1}{2(1-t)} + \frac12 \ln(1-t) \right]_0^{\frac12} + \left[ \frac12 \ln t \right]_{\frac12}^1 \\ &&& = \frac{1-\ln 2}{2} + \frac{\ln 2}{2} = \frac{1}{2} \\ \\ && \mathbb{E} \l \frac{X}{X+Y} \r &= \int_0^1 \int_0^1 \frac{x}{x+y} \d y\d x \\ &&&= \int_0^1 \l x \ln (x+1) - x \ln x \r \d x \\ &&&= \left [\frac{x^2}2 \ln(x+1) - \frac{x^2}{2} \ln(x) \right]_0^1 -\int_0^1 \l \frac{x^2}{2(x+1)} - \frac{x}{2} \r \d x \\ &&&= \frac{\ln 2}{2} + \frac{1}{4} - \int_0^1 \frac{x^2-1+1}{2(x+1)}\d x \\ &&&= \frac{\ln 2}{2} + \frac{1}{4} - \int_0^1 \l \frac{x -1}{2} + \frac{1}{2(x+1)} \r \d x \\ &&&= \frac{\ln 2}{2} + \frac{1}{4} + \frac{1}{4} - \frac{\ln 2}{2} \\ &&&= \frac{1}{2} \end{align*} Alternatively, \(1 = \mathbb{E} \l \frac{X+Y}{X+Y} \r = \mathbb{E} \l \frac{X}{X+Y} \r + \mathbb{E} \l \frac{Y}{X+Y} \r = 2 \mathbb{E} \l \frac{X}{X+Y} \r\), since \(X\) and \(Y\) are identically distributed and the expectations exist.
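Both expectations can be checked numerically against the derived densities. A rough midpoint-rule sketch (the grid size and truncation point are arbitrary choices of ours) should reproduce \(\E\big(\frac1{X+Y}\big) = 2\ln 2 \approx 1.386\) and \(\E\big(\frac{X}{X+Y}\big) = \frac12\):

```python
# Illustrative numerical check of the two densities derived above.
import math

def midpoint(f, a, b, steps=200000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

# density of 1/(X+Y): 2t^-2 - t^-3 on [1/2, 1], t^-3 on [1, infinity);
# truncating the upper limit at t = 2000 discards only ~1/2000 of the mean
f_inv = lambda t: 2 * t ** -2 - t ** -3 if t <= 1 else t ** -3
e_inv = midpoint(lambda t: t * f_inv(t), 0.5, 2000.0)

# density of X/(X+Y): 1/(2(1-t)^2) on [0, 1/2], 1/(2t^2) on [1/2, 1]
f_ratio = lambda t: 1 / (2 * (1 - t) ** 2) if t <= 0.5 else 1 / (2 * t ** 2)
e_ratio = midpoint(lambda t: t * f_ratio(t), 0.0, 1.0)
```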
2013 Paper 3 Q12

A list consists only of letters \(A\) and \(B\) arranged in a row. In the list, there are \(a\) letter \(A\)s and \(b\) letter \(B\)s, where \(a\ge2\) and \(b\ge2\), and \(a+b=n\). Each possible ordering of the letters is equally probable. The random variable \(X_1\) is defined by \[ X_1 = \begin{cases} 1 & \text{if the first letter in the row is \(A\)};\\ 0 & \text{otherwise.} \end{cases} \] The random variables \(X_k\) (\(2 \le k \le n\)) are defined by \[ X_k = \begin{cases} 1 & \text{if the \((k-1)\)th letter is \(B\) and the \(k\)th is \(A\)};\\ 0 & \text{otherwise.} \end{cases} \] The random variable \(S\) is defined by \(S = \sum\limits_ {i=1}^n X_i\,\).

  1. Find expressions for \(\E(X_i)\), distinguishing between the cases \(i=1\) and \(i\ne1\), and show that \(\E(S)= \dfrac{a(b+1)}n\,\).
  2. Show that:
    1. for \(j\ge3\), \(\E(X_1X_j) = \dfrac{a(a-1)b}{n(n-1)(n-2)}\,\);
    2. \[ \sum\limits_{i=2}^{n-2} \bigg( \sum\limits_{j=i+2}^n \E(X_iX_j)\bigg) = \dfrac{a(a-1)b(b-1)}{2n(n-1)}\,\]
    3. \(\var(S) = \dfrac {a(a-1)b(b+1)}{n^2(n-1)}\,\).

Solution
  1. Notice that \(\E[X_1] = \mathbb{P}(\text{first letter is }A) = \frac{a}{n}\). For \(i > 1\), \(X_i = 1\) requires the \((i-1)\)th letter to be \(B\) and the \(i\)th to be \(A\), which has probability \(\frac{b}{n} \cdot \frac{a}{n-1}\). So \begin{align*} && \E[S] &= \E[X_1] + \sum_{i=2}^n \E[X_i] \\ &&&= \frac{a}{n} + (n-1) \frac{ab}{n(n-1)} \\ &&&= \frac{a(b+1)}{n} \end{align*}
    1. The probability that \(X_1X_j = 1\) is \(\frac{a}{n} \cdot \frac{b}{n-1} \cdot \frac{a-1}{n-2} = \frac{a(a-1)b}{n(n-1)(n-2)}\): the three positions involved are distinct, and the order in which we fill them does not matter, so the first letter is an \(A\) with probability \(\frac{a}{n}\); given this, the \((j-1)\)th letter is a \(B\) with probability \(\frac{b}{n-1}\), and then the \(j\)th is an \(A\) with probability \(\frac{a-1}{n-2}\). Therefore \(\E[X_1X_j] = \frac{a(a-1)b}{n(n-1)(n-2)}\).
    2. \(\E[X_iX_j]\) when the pairs don't overlap is \(\frac{a}{n} \frac{b}{n-1} \frac{a-1}{n-2} \frac{b-1}{n-3}\), and so \begin{align*} && \sum\limits_{i=2}^{n-2} \bigg( \sum\limits_{j=i+2}^n \E(X_iX_j)\bigg) &= \sum\limits_{i=2}^{n-2} \bigg( \sum\limits_{j=i+2}^n \frac{a(a-1)b(b-1)}{n(n-1)(n-2)(n-3)}\bigg) \\ &&&= \frac{a(a-1)b(b-1)}{n(n-1)(n-2)(n-3)}\sum\limits_{i=2}^{n-2} \bigg( \sum\limits_{j=i+2}^n 1\bigg) \\ &&&= \frac{a(a-1)b(b-1)}{n(n-1)(n-2)(n-3)}\sum\limits_{i=2}^{n-2} (n-(i+1)) \\ &&&= \frac{a(a-1)b(b-1)}{n(n-1)(n-2)(n-3)} \left ((n-1)(n-3)-\frac{(n-2)(n-1)}{2}+1 \right) \\ &&&= \frac{a(a-1)b(b-1)}{n(n-1)(n-2)(n-3)} \left ( \frac{2n^2-8n+6-n^2+3n-2+2}{2}\right) \\ &&&= \frac{a(a-1)b(b-1)}{n(n-1)(n-2)(n-3)} \left ( \frac{n^2-5n+6}{2}\right) \\ &&&= \frac{a(a-1)b(b-1)}{2n(n-1)} \end{align*}
    3. We also need to consider the other cross terms. \(X_iX_{i+1}=0\). (Since \(X_i = 1\) means the \(i\)th letter is \(A\) and \(X_{i+1} = 1\) means the \(i\)th letter is \(B\)). It's the same story for \(X_1X_2\), and so all the cross terms are accounted for. Therefore \begin{align*} && \E[S^2] &= \E \left [\sum X_i^2 + 2\sum_{i \neq j} X_i X_j \right] \\ &&&= \frac{a(b+1)}{n} +2(n-2)\frac{a(a-1)b}{n(n-1)(n-2)}+ 2 \frac{a(a-1)b(b-1)}{2n(n-1)} \\ &&&= \frac{a(b+1)}{n} +\frac{2a(a-1)b}{n(n-1)} + \frac{a(a-1)b(b-1)}{n(n-1)} \\ &&&= \frac{a(b+1)}{n} +\frac{a(a-1)b(b+1)}{n(n-1)} \\ && \var[S] &= \E[S^2] - \left ( \E[S] \right)^2 \\ &&&= \frac{a(b+1)}{n} + \frac{a(a-1)b(b+1)}{n(n-1)} - \frac{a^2(b+1)^2}{n^2} \\ &&&= \frac{a(b+1) \left (n(n-1) + (a-1)b n -a(b+1)(n-1) \right)}{n^2(n-1)} \\ &&&= \frac{a(b+1) \left ( (n-a)(n-b-1) \right)}{n^2(n-1)} \\ &&&= \frac{a(b+1) \left ( b(a-1) \right)}{n^2(n-1)} \\ \end{align*}
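The formulas for \(\E(S)\) and \(\var(S)\) can be verified by brute force for small \(a\) and \(b\); a short exact-enumeration sketch (illustrative only, not part of the solution):

```python
# Enumerate every distinct, equally likely arrangement of a A's and b B's,
# compute S (the indicator sum defined in the question) for each, and take
# exact moments.
from fractions import Fraction
from itertools import permutations

def moments(a, b):
    """Return (E(S), Var(S)) exactly, by enumeration."""
    n = a + b
    arrangements = set(permutations("A" * a + "B" * b))  # dedupe repeats
    p = Fraction(1, len(arrangements))
    mean = mean_sq = Fraction(0)
    for arr in arrangements:
        # X_1 plus the indicators of a B immediately followed by an A
        s = (arr[0] == "A") + sum(arr[k - 1] == "B" and arr[k] == "A"
                                  for k in range(1, n))
        mean += p * s
        mean_sq += p * s * s
    return mean, mean_sq - mean ** 2
```

For \(a=2,\ b=2\) this gives mean \(\frac32\) and variance \(\frac14\), matching \(\frac{a(b+1)}{n}\) and \(\frac{a(a-1)b(b+1)}{n^2(n-1)}\).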
2010 Paper 3 Q13

In this question, \({\rm Corr}(U,V)\) denotes the product moment correlation coefficient between the random variables \(U\) and \(V\), defined by \[ \mathrm{Corr}(U,V) \equiv \frac{\mathrm{Cov}(U,V)}{\sqrt{\var(U)\var(V)}}\,. \] The independent random variables \(Z_1\), \(Z_2\) and \(Z_3\) each have expectation 0 and variance 1. What is the value of \(\mathrm{Corr} (Z_1,Z_2)\)? Let \(Y_1 = Z_1\) and let \[ Y_2 = \rho _{12} Z_1 + (1 - {\rho_{12}^2})^{ \frac12} Z_ 2\,, \] where \(\rho_{12}\) is a given constant with \(-1<\rho_{12}<1\). Find \(\E(Y_2)\), \(\var(Y_2)\) and \(\mathrm{Corr}(Y_1, Y_2)\). Now let \(Y_3 = aZ_1 + bZ_2 + cZ_3\), where \(a\), \(b\) and \(c\) are real constants and \(c\ge0\). Given that \(\E(Y_3) = 0\), \(\var(Y_3) = 1\), \( \mathrm{Corr}(Y_1, Y_3) =\rho_{13} \) and \( \mathrm{Corr}(Y_2, Y_3)= \rho_{23}\), express \(a\), \(b\) and \(c\) in terms of \(\rho_{23}\), \(\rho_{13}\) and \(\rho_{12}\). Given constants \(\mu_i\) and \(\sigma_i\), for \(i=1\), \(2\) and \(3\), give expressions in terms of the \(Y_i\) for random variables \(X_i\) such that \(\E(X_i) = \mu_i\), \(\var(X_i) = \sigma_ i^2\) and \(\mathrm{Corr}(X_i,X_j) = \rho_{ij}\).

Solution
\begin{align*} \mathrm{Corr} (Z_1,Z_2) &= \frac{\mathrm{Cov}(Z_1,Z_2)}{\sqrt{\var(Z_1)\var(Z_2)}} \\ &= \frac{\mathbb{E}(Z_1 Z_2)}{\sqrt{1 \cdot 1}} \\ &= \frac{\mathbb{E}(Z_1)\mathbb{E}(Z_2)}{\sqrt{1 \cdot 1}} \\ &= \frac{0}{1} \\ &= 0 \end{align*} \begin{align*} && \mathbb{E}(Y_2) &= \mathbb{E}(\rho_{12} Z_1 + (1 - {\rho_{12}^2})^{ \frac12} Z_ 2) \\ &&&= \mathbb{E}(\rho_{12} Z_1) + \mathbb{E}( (1 - {\rho_{12}^2})^{ \frac12} Z_ 2) \\ &&&= \rho_{12}\mathbb{E}( Z_1) + (1 - {\rho_{12}^2})^{ \frac12}\mathbb{E}( Z_ 2) \\ &&&= 0\\ \\ && \textrm{Var}(Y_2) &= \textrm{Var}(\rho _{12} Z_1 + (1 - {\rho_{12}^2})^{ \frac12} Z_ 2) \\ &&&= \textrm{Var}(\rho_{12} Z_1)+2\,\textrm{Cov}(\rho_{12} Z_1,(1 - {\rho_{12}^2})^{ \frac12} Z_ 2 ) + \textrm{Var}((1 - {\rho_{12}^2})^{ \frac12} Z_ 2) \\ &&&= \rho_{12}^2\textrm{Var}( Z_1)+2\rho_{12} (1 - {\rho_{12}^2})^{ \frac12} \textrm{Cov}(Z_1, Z_ 2 ) + (1 - {\rho_{12}^2})\textrm{Var}(Z_ 2) \\ &&&= \rho_{12}^2 + (1-\rho_{12}^2) = 1 \\ \\ && \textrm{Cov}(Y_1, Y_2) &= \mathbb{E}((Y_1-0)(Y_2-0)) \\ &&&= \mathbb{E}(Z_1 \cdot (\rho _{12} Z_1 + (1 - {\rho_{12}^2})^{ \frac12} Z_ 2)) \\ &&&= \rho_{12} \mathbb{E}(Z_1^2) + (1-\rho_{12}^2)^{\frac12}\mathbb{E}(Z_1 Z_2) \\ &&&= \rho_{12} \\ \Rightarrow && \textrm{Corr}(Y_1, Y_2) &= \frac{\textrm{Cov}(Y_1, Y_2)}{\sqrt{\textrm{Var}(Y_1)\textrm{Var}(Y_2)}} \\ &&&= \frac{\rho_{12}}{1 \cdot 1} = \rho_{12} \end{align*} Now let \(Y_3 =aZ_1 +bZ_2+cZ_3\). Then \(\mathbb{E}(Y_3) = 0\) automatically, since each \(Z_i\) has mean \(0\), and \(\textrm{Var}(Y_3) = a^2+b^2+c^2 = 1\). We require \(\textrm{Corr}(Y_1, Y_3) = \rho_{13}\) and \(\textrm{Corr}(Y_2, Y_3) = \rho_{23}\).
Since each \(Y_i\) has variance \(1\), correlations equal covariances here. \begin{align*} && \textrm{Corr}(Y_1,Y_3) &= \textrm{Cov}(Y_1, Y_3) \\ &&&= \textrm{Cov}(Z_1, aZ_1 +bZ_2+cZ_3) \\ &&&= a \\ \Rightarrow && a &= \rho_{13} \\ \\ && \textrm{Corr}(Y_2,Y_3) &= \textrm{Cov}(Y_2, Y_3) \\ &&&= \textrm{Cov}(\rho_{12}Z_1+(1-\rho_{12}^2)^\frac12Z_2, \rho_{13}Z_1 +bZ_2+cZ_3) \\ &&&= \rho_{12}\rho_{13}+(1-\rho_{12}^2)^\frac12b \\ \Rightarrow && \rho_{23} &= \rho_{12}\rho_{13}+(1-\rho_{12}^2)^\frac12b \\ \Rightarrow && b &= \frac{\rho_{23}-\rho_{12}\rho_{13}}{(1-\rho_{12}^2)^\frac12} \\ && c &= \sqrt{1-\rho_{13}^2-\frac{(\rho_{23}-\rho_{12}\rho_{13})^2}{1-\rho_{12}^2}} \end{align*} Finally, let \(X_i = \mu_i + \sigma_i Y_i\); then \(\E(X_i) = \mu_i\), \(\var(X_i) = \sigma_i^2\) and \(\mathrm{Corr}(X_i, X_j) = \mathrm{Corr}(Y_i, Y_j) = \rho_{ij}\), since correlation is unchanged by shifting and positive scaling.
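The construction can be illustrated by simulation: build \(Y_1, Y_2, Y_3\) from independent standard normals using the formulas above and check that the sample correlations land near the targets. A sketch (the particular \(\rho\) values and sample size are arbitrary choices of ours):

```python
# Monte Carlo illustration of the correlated-normals construction.
import math
import random

rho12, rho13, rho23 = 0.5, 0.3, 0.4       # arbitrary admissible targets
a = rho13
b = (rho23 - rho12 * rho13) / math.sqrt(1 - rho12 ** 2)
c = math.sqrt(1 - a * a - b * b)

rng = random.Random(1)
N = 100000
y1, y2, y3 = [], [], []
for _ in range(N):
    z1, z2, z3 = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
    y1.append(z1)
    y2.append(rho12 * z1 + math.sqrt(1 - rho12 ** 2) * z2)
    y3.append(a * z1 + b * z2 + c * z3)

def corr(u, v):
    """Sample correlation coefficient."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((x - mu) * (y - mv) for x, y in zip(u, v)) / n
    return cov / math.sqrt(sum((x - mu) ** 2 for x in u) / n
                           * sum((y - mv) ** 2 for y in v) / n)
```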
2007 Paper 3 Q12

I choose a number from the integers \(1, 2, \ldots, (2n-1)\) and the outcome is the random variable \(N\). Calculate \( \E(N)\) and \(\E(N^2)\). I then repeat a certain experiment \(N\) times, the outcome of the \(i\)th experiment being the random variable \(X_i\) (\(1\le i \le N\)). For each \(i\), the random variable \(X_i\) has mean \(\mu\) and variance \(\sigma^2\), and \(X_i\) is independent of \(X_j\) for \(i\ne j\) and also independent of \(N\). The random variable \(Y\) is defined by \(Y= \sum\limits_{i=1}^NX_i\). Show that \(\E(Y)=n\mu\) and that \(\mathrm{Cov}(Y,N) = \frac13n(n-1)\mu\). Find \(\var(Y) \) in terms of \(n\), \(\sigma^2\) and \(\mu\).

Solution
\begin{align*} && \E[N] &= \sum_{i=1}^{2n-1} \frac{i}{2n-1} \\ &&&= \frac{2n(2n-1)}{2(2n-1)} = n\\ && \E[N^2] &= \sum_{i=1}^{2n-1} \frac{i^2}{2n-1} \\ &&&= \frac{(2n-1)(2n)(4n-1)}{6(2n-1)} \\ &&&= \frac{n(4n-1)}{3} \\ && \var[N] &= \frac{n(4n-1)}{3} - n^2 \\ &&&= \frac{n^2-n}{3} \end{align*} \begin{align*} && \E[Y] &= \E \left [ \E \left [ \sum_{i=1}^N X_i \,\middle|\, N\right] \right]\\ &&&= \E \left[ N\mu \right] = n\mu \\ \\ && \mathrm{Cov}(Y,N) &= \mathbb{E}[YN] - \E[Y]\E[N] \\ &&&= \E \left [ \E \left [N \sum_{i=1}^N X_i \,\middle|\, N\right] \right] - n^2 \mu \\ &&&= \E[N^2\mu] - n^2 \mu \\ &&&= \left ( \frac{n(4n-1)}{3} - n^2 \right) \mu \\ &&&= \frac{n^2-n}{3}\mu \\ \\ && \E[Y^2] &= \E \left [ \E \left [ \left ( \sum_{i=1}^N X_i \right) ^2 \,\middle|\, N \right ] \right] \\ &&&= \E \left [ \E \left [ \sum_{i=1}^N X_i ^2 + 2\sum_{i<j} X_iX_j \,\middle|\, N \right ] \right] \\ &&&= \E \left [ N\,\E[X_1^2] + (N^2-N)\,\E[X_1]\E[X_2] \right] \\ &&&= \E \left [ N(\sigma^2 + \mu^2) + (N^2-N)\mu^2\right] \\ &&&= n(\sigma^2+\mu^2) + \left ( \frac{n(4n-1)}{3}-n \right)\mu^2 \\ &&&= n\sigma^2 + \frac{4n^2-n}{3} \mu^2 \\ \Rightarrow && \var[Y] &= n\sigma^2 + \frac{4n^2-n}{3} \mu^2 - n^2\mu^2 \\ &&&= n\sigma^2 + \frac{n^2-n}{3} \mu^2 \end{align*}
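The final answer agrees with the standard identity \(\var(Y) = \E(N)\sigma^2 + \var(N)\mu^2\) for a random sum (law of total variance). A small exact-arithmetic sketch using that identity (the test values of \(\mu\) and \(\sigma^2\) are arbitrary):

```python
# Cross-check Var(Y) = E(N) sigma^2 + Var(N) mu^2 against the closed form,
# with N uniform on {1, ..., 2n-1}.
from fractions import Fraction

def var_Y(n, mu, sigma2):
    vals = range(1, 2 * n)
    EN = Fraction(sum(vals), 2 * n - 1)
    EN2 = Fraction(sum(i * i for i in vals), 2 * n - 1)
    return EN * sigma2 + (EN2 - EN ** 2) * mu ** 2

mu, sigma2 = Fraction(3), Fraction(2)      # arbitrary test values
expected = [n * sigma2 + Fraction(n * n - n, 3) * mu ** 2 for n in range(2, 7)]
computed = [var_Y(n, mu, sigma2) for n in range(2, 7)]
```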
2006 Paper 3 Q14

For any random variables \(X_1\) and \(X_2\), state the relationship between \(\E(aX_1+bX_2)\) and \(\E(X_1)\) and \(\E(X_2)\), where \(a\) and \(b\) are constants. If \(X_1\) and \(X_2\) are independent, state the relationship between \(\E(X_1X_2)\) and \(\E(X_1)\) and \(\E(X_2)\). An industrial process produces rectangular plates. The length and the breadth of the plates are modelled by independent random variables \(X_1\) and \(X_2\) with non-zero means \(\mu_1\) and \(\mu_2\) and non-zero standard deviations \(\sigma_1\) and \(\sigma_2\), respectively. Using the results in the paragraph above, and without quoting a formula for \(\var(aX_1+bX_2)\), find the means and standard deviations of the perimeter \(P\) and area \(A\) of the plates. Show that \(P\) and \(A\) are not independent. The random variable \(Z\) is defined by \(Z=P-\alpha A\), where \(\alpha \) is a constant. Show that \(Z\) and \(A\) are not independent if \[ \alpha \ne \dfrac{2(\mu_1^{\vphantom2} \sigma_2^2 +\mu_2^{\vphantom2}\sigma_1^2)} { \mu_1^2 \sigma_2^2 +\mu_2^2\sigma_1^2 + \sigma_1^2\sigma_2^2 } \;. \] Given that \(X_1\) and \(X_2\) can each take values 1 and 3 only, and that they each take these values with probability \(\frac 12\), show that \(Z\) and \(A\) are not independent for any value of \(\alpha\).

Solution
\(\E(aX_1+bX_2) = a \E(X_1) + b\E(X_2)\) for any \(X_1, X_2\); \(\E(X_1X_2)=\E(X_1)\E(X_2)\) if \(X_1, X_2\) are independent. \begin{align*} && \E(P) &= \E(2(X_1+X_2)) = 2(\E[X_1]+\E[X_2]) \\ &&&= 2(\mu_1 + \mu_2) \\ && \var(P) &= \E[\left ( 2(X_1+X_2) \right)^2] - \E[2(X_1+X_2)]^2 \\ &&&= 4\E[X_1^2+2X_1X_2+X_2^2] -4(\mu_1 + \mu_2)^2 \\ &&&= 4(\mu_1^2 + \sigma_1^2 + 2\mu_1\mu_2 + \mu_2^2 + \sigma_2^2) - 4(\mu_1 + \mu_2)^2 \\ &&&= 4(\sigma_1^2+\sigma_2^2) \\ && \textrm{SD}(P) &= 2 \sqrt{\sigma_1^2+\sigma_2^2}\\ \\ && \E(A) &= \E[X_1X_2] = \E[X_1]\E[X_2] \\ &&&= \mu_1\mu_2 \\ && \var(A) &= \E[(X_1X_2)^2] - (\mu_1\mu_2)^2 \\ &&&= (\mu_1^2+\sigma_1^2)(\mu_2^2+\sigma_2^2) - (\mu_1\mu_2)^2\\ &&&= \mu_1^2 \sigma_2^2 + \mu_2^2 \sigma_1^2 + \sigma_1^2 \sigma_2^2\\ && \textrm{SD}(A) &= \sqrt{\mu_1^2 \sigma_2^2 + \mu_2^2 \sigma_1^2 + \sigma_1^2 \sigma_2^2} \end{align*} \begin{align*} \E[PA] &= \E[2(X_1+X_2)X_1X_2] \\ &= 2\E[X_1^2X_2] + 2\E[X_1X_2^2]\\ &= 2(\mu_1^2 + \sigma_1^2)\mu_2 + 2\mu_1 (\mu_2^2+\sigma_2^2)\\ &= \E[P]\E[A] + 2(\sigma_1^2\mu_2 + \sigma_2^2\mu_1) \\ &\neq \E[P]\E[A] \end{align*} since \(\E[P]\E[A] = 2(\mu_1+\mu_2)\mu_1\mu_2\) and \(\sigma_1^2\mu_2 + \sigma_2^2\mu_1 \neq 0\) (lengths and breadths are positive). Therefore \(P\) and \(A\) are not independent. \begin{align*} && \E[Z] &= \E[P] - \alpha \E[A] \\ &&&= 2(\mu_1+\mu_2) - \alpha \mu_1 \mu_2 \\ \\ && \E[ZA] &= \E[PA - \alpha A^2] \\ &&&= 2(\mu_1^2 + \sigma_1^2)\mu_2 + 2\mu_1 (\mu_2^2+\sigma_2^2) - \alpha \E[A^2] \\ &&&= 2(\mu_1^2 + \sigma_1^2)\mu_2 + 2\mu_1 (\mu_2^2+\sigma_2^2) - \alpha \E[X_1^2]\E[X_2^2] \\ &&&= 2(\mu_1^2 + \sigma_1^2)\mu_2 + 2\mu_1 (\mu_2^2+\sigma_2^2) - \alpha (\mu_1^2+\sigma_1^2)(\mu_2^2+\sigma_2^2) \end{align*} If \(Z\) and \(A\) were independent we would need \(\E[Z]\E[A] = \E[ZA]\): \begin{align*} && (2(\mu_1+\mu_2) - \alpha \mu_1 \mu_2) \mu_1\mu_2 &= 2(\mu_1^2 + \sigma_1^2)\mu_2 + 2\mu_1 (\mu_2^2+\sigma_2^2) - \alpha (\mu_1^2+\sigma_1^2)(\mu_2^2+\sigma_2^2) \\ \Rightarrow && 2(\mu_1^2\mu_2+\mu_1\mu_2^2) - \alpha \mu_1^2\mu_2^2 &= 2(\mu_1^2\mu_2+\mu_1\mu_2^2) + 2\sigma_1^2\mu_2 + 2\sigma_2^2\mu_1 - \alpha (\mu_1^2+\sigma_1^2)(\mu_2^2+\sigma_2^2) \\ \Rightarrow && \alpha \left((\mu_1^2+\sigma_1^2)(\mu_2^2+\sigma_2^2) - \mu_1^2\mu_2^2\right) &= 2(\sigma_1^2\mu_2 + \sigma_2^2\mu_1) \\ \Rightarrow && \alpha &= \frac{ 2(\sigma_1^2\mu_2 + \sigma_2^2\mu_1) }{\mu_1^2 \sigma_2^2 + \mu_2^2 \sigma_1^2 + \sigma_1^2 \sigma_2^2} \end{align*} Therefore, if \(\alpha\) is not equal to this value then \(\E[ZA] \neq \E[Z]\E[A]\), so \(Z\) and \(A\) are not independent. \begin{array}{c|c|c|c|c|c} & X_1 & X_2 & A & P & Z \\ \hline 0.25 & 1 & 1 & 1 & 4 & 4-\alpha \\ 0.25 & 1 & 3 & 3 & 8 & 8-3\alpha \\ 0.25 & 3 & 1 & 3 & 8 & 8-3\alpha \\ 0.25 & 3 & 3 & 9 & 12 & 12-9\alpha \\ \end{array} Since \(A = 1\) forces \(Z = 4-\alpha\), we have \(\mathbb{P}(A = 1, Z = 4-\alpha) = \mathbb{P}(A = 1) = \frac14\); independence would then require \(\mathbb{P}(Z = 4-\alpha) = 1\), i.e. \(4-\alpha = 8-3\alpha = 12-9\alpha\). But the first equality requires \(\alpha = 2\) and the second \(\alpha = \frac23\), so no value of \(\alpha\) works: \(Z\) and \(A\) are not independent for any \(\alpha\).
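The last part can also be confirmed by enumerating the four equally likely plates; a sketch (illustrative only; here \(\mu_1 = \mu_2 = 2\) and \(\sigma_1^2 = \sigma_2^2 = 1\), so the critical value from the formula above is \(\alpha = \frac89\)):

```python
# For X1, X2 uniform on {1, 3}, check that Z = P - alpha*A and A fail the
# independence test P(A=a, Z=z) = P(A=a)P(Z=z) for several alpha values,
# including the critical alpha = 8/9.
from fractions import Fraction
from itertools import product

def dependent(alpha):
    joint = {}
    for x1, x2 in product((1, 3), repeat=2):   # four equally likely plates
        A = x1 * x2
        Z = 2 * (x1 + x2) - alpha * A
        joint[(A, Z)] = joint.get((A, Z), Fraction(0)) + Fraction(1, 4)
    pa, pz = {}, {}
    for (a, z), p in joint.items():
        pa[a] = pa.get(a, Fraction(0)) + p
        pz[z] = pz.get(z, Fraction(0)) + p
    # dependent iff some cell breaks the product rule
    return any(joint.get((a, z), Fraction(0)) != pa[a] * pz[z]
               for a in pa for z in pz)

alphas = [Fraction(8, 9), Fraction(2), Fraction(2, 3), Fraction(0)]
```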
2005 Paper 3 Q12

Five independent timers time a runner as she runs four laps of a track. Four of the timers measure the individual lap times, the results of the measurements being the random variables \(T_1\) to \(T_4\), each of which has variance \(\sigma^2\) and expectation equal to the true time for the lap. The fifth timer measures the total time for the race, the result of the measurement being the random variable \(T\) which has variance \(\sigma^2\) and expectation equal to the true race time (which is equal to the sum of the four true lap times). Find a random variable \(X\) of the form \(aT+b(T_1+T_2+T_3+T_4)\), where \(a\) and \(b\) are constants independent of the true lap times, with the two properties:

  1. whatever the true lap times, the expectation of \(X\) is equal to the true race time;
  2. the variance of \(X\) is as small as possible.
Find also a random variable \(Y\) of the form \(cT+d(T_1+T_2+T_3+T_4)\), where \(c\) and \(d\) are constants independent of the true lap times, with the property that, whatever the true lap times, the expectation of \(Y^2\) is equal to \(\sigma^2\). In one particular race, \(T\) takes the value 220 seconds and \((T_1 + T_2 + T_3 + T_4)\) takes the value \(220.5\) seconds. Use the random variables \(X\) and \(Y\) to estimate an interval in which the true race time lies.

Solution
Let the expected total time for the race be \(\mu\). Let \(X = aT + b(T_1 + T_2+T_3+T_4)\); then \(\E[X] = a\E[T] + b\E[T_1+\cdots+T_4] = a \mu + b \mu = (a+b)\mu\). So \(a+b=1\). \begin{align*} && \var[X] &= a^2\var[T] + b^2(\var[T_1] + \var[T_2] + \var[T_3] + \var[T_4]) \\ &&&= a^2\sigma^2 + 4b^2 \sigma^2 \\ &&& = \sigma^2 (a^2 + 4(1-a)^2 ) \\ &&&= \sigma^2 (5a^2 - 8a + 4) \\ &&&= \sigma^2 \left ( 5 \left ( a - \frac45 \right)^2 - \frac{16}{5}+4 \right)\\ &&&= \sigma^2 \left ( 5 \left ( a - \frac45 \right)^2 + \frac{4}{5}\right) \end{align*} Therefore the variance is minimised when \(a = \frac45, b = \frac15\). Let \(Y = cT + d(T_1 + T_2+T_3+T_4)\); then \begin{align*} && \E[Y^2] &= \E \left [c^2T^2 + 2cd T(T_1+T_2+T_3+T_4) + d^2(T_1+T_2+T_3+T_4)^2 \right] \\ &&&= c^2 (\mu^2 + \sigma^2) + 2cd \mu^2 + d^2 (\var[T_1 + \cdots + T_4] + \mu^2) \\ &&&= c^2(\mu^2+\sigma^2) + 2cd \mu^2 + d^2(4\sigma^2 + \mu^2) \\ &&&= (c^2 + 2cd + d^2) \mu^2 + (c^2+4d^2) \sigma^2 \\ &&&= (c+d)^2 \mu^2 + (c^2+4d^2) \sigma^2 \\ \\ \Rightarrow && d &= -c \\ && 1 &= c^2 + 4d^2 \\ \Rightarrow && c &= \pm \frac{1}{\sqrt5} \\ && d &= \mp \frac{1}{\sqrt5} \end{align*} Given our results, our best estimate for \(\mu\) is \(\frac45 \cdot 220 + \frac15 \cdot 220.5 = 220.1\). Our estimate of \(\sigma^2\) is \(Y^2 = \left( \frac{1}{\sqrt{5}}(220 - 220.5) \right)^2 = \frac{1}{20}\). Note that \(\var[X] = \frac45\sigma^2 \approx \frac{1}{25}\), so the standard error of \(X\) is about \(\frac15\); taking two standard errors either side gives the interval \((220.1 - 0.4, 220.1 + 0.4) = (219.7, 220.5)\).
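A quick numeric sketch of the minimisation and the point estimate (illustrative only; the grid resolution is arbitrary):

```python
# Scan a grid of a values to confirm a = 4/5 minimises a^2 + 4(1-a)^2
# (the sigma^2 factor is constant and can be dropped), then form the
# weighted estimate of the race time.
best_a = min((a / 1000 for a in range(1001)),
             key=lambda a: a * a + 4 * (1 - a) ** 2)

estimate = 0.8 * 220 + 0.2 * 220.5   # weighted combination of the two clocks
```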
2004 Paper 3 Q12

A team of \(m\) players, numbered from \(1\) to \(m\), puts on a set of a \(m\) shirts, similarly numbered from \(1\) to \(m\). The players change in a hurry, so that the shirts are assigned to them randomly, one to each player. Let \(C_i\) be the random variable that takes the value \(1\) if player \(i\) is wearing shirt \(i\), and 0 otherwise. Show that \(\mathrm{E}\left(C_1\right)={1 \over m}\) and find \(\var \left(C_1\right)\) and \(\mathrm{Cov}\left(C_1 \, , \; C_2 \right) \,\). Let \(\, N = C_1 + C_2 + \cdots + C_m \,\) be the random variable whose value is the number of players who are wearing the correct shirt. Show that \(\mathrm{E}\left(N\right)= \var \left(N\right) = 1 \,\). Explain why a Normal approximation to \(N\) is not likely to be appropriate for any \(m\), but that a Poisson approximation might be reasonable. In the case \(m = 4\), find, by listing equally likely possibilities or otherwise, the probability that no player is wearing the correct shirt and verify that an appropriate Poisson approximation to \(N\) gives this probability with a relative error of about \(2\%\). [Use \(\e \approx 2\frac{72}{100} \,\).]

Solution
There are \(m!\) different ways of assigning the shirts, and in \((m-1)!\) of them player \(1\) gets their own shirt, i.e. \(\mathbb{E}(C_1) = \mathbb{P}(\text{player }1\text{ gets own shirt}) = \frac{(m-1)!}{m!} = \frac{1}{m}\). Since \(C_1^2 = C_1\), \(\var(C_1) = \mathbb{E}(C_1^2) - [\mathbb{E}(C_1)]^2 = \frac{1}{m} - \frac{1}{m^2} = \frac{m-1}{m^2}\). For two players, there are \((m-2)!\) assignments in which both get their own shirts, therefore \(\textrm{Cov}(C_1,C_2) = \mathbb{E}(C_1C_2) - \mathbb{E}(C_1)\mathbb{E}(C_2) = \frac{(m-2)!}{m!} - \frac{1}{m^2} = \frac{1}{m(m-1)} - \frac{1}{m^2} = \frac{1}{m^2(m-1)}\). \begin{align*} \mathbb{E}(N) &= \mathbb{E}(C_1 + C_2 + \cdots + C_m) \\ &= \mathbb{E}(C_1) + \mathbb{E}(C_2) + \cdots + \mathbb{E}(C_m) \\ &= \frac{1}{m} + \frac{1}{m} +\cdots+ \frac1m \\ &= 1 \\ \\ \var(N) &= \sum_{r=1}^m \var(C_r) + 2\sum_{r=1}^{m-1} \sum_{s=r+1}^{m} \textrm{Cov}(C_r,C_s) \\ &= m \frac{m-1}{m^2} + 2 \frac{m(m-1)}{2}\frac{1}{m^2(m-1)} \\ &=\frac{m-1}{m} + \frac{1}{m} \\ &= 1 \end{align*} A Normal approximation would have to be \(N(1,1)\), which gives substantial probability to impossible negative values (for instance, \(-1\) or fewer correct shirts would be as likely as \(3\) or more), so it is not appropriate for any \(m\). A Poisson approximation is more reasonable: a Poisson distribution has mean equal to variance, matching \(\mathbb{E}(N) = \var(N) = 1\), and for large \(m\) the covariance between the indicators is very small, so \(N\) behaves like a count of rare, nearly independent events. For \(m = 4\), the arrangements with no player wearing the correct shirt are \[ BADC,\ BCDA,\ BDAC,\ CADB,\ CDAB,\ CDBA,\ DABC,\ DCAB,\ DCBA, \] i.e. \(9\) of the \(24\) equally likely assignments, so the probability is \(\frac{9}{24}\). \(\textrm{Po}(1)\) gives this probability as \(e^{-1}\), a relative error of \begin{align*} \frac{e^{-1}-\frac{9}{24}}{\frac9{24}} &\approx \frac{\frac{100}{272} - \frac{9}{24}}{\frac9{24}} \\ &= -\frac{1}{51} \\ &\approx -2\% \end{align*}
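All of the \(m = 4\) claims can be checked by listing the \(24\) assignments; a short enumeration sketch (illustrative only):

```python
# Count fixed points over all 24 equally likely shirt assignments:
# E(N) = Var(N) = 1, the no-match probability is 9/24, and Po(1) gives
# that probability with a relative error of about -2%.
import math
from fractions import Fraction
from itertools import permutations

counts = [sum(i == p[i] for i in range(4)) for p in permutations(range(4))]
EN = Fraction(sum(counts), len(counts))
VarN = Fraction(sum(c * c for c in counts), len(counts)) - EN ** 2
p_none = Fraction(counts.count(0), len(counts))           # 9/24
rel_err = (math.exp(-1) - float(p_none)) / float(p_none)  # about -0.02
```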
2002 Paper 3 Q14

Prove that, for any two discrete random variables \(X\) and \(Y\), \[ \mathrm{Var} \left(X + Y \right) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2 \, \mathrm{Cov}(X,Y), \] where \(\mathrm{Var}(X)\) is the variance of \(X\) and \(\mathrm{Cov}(X,Y)\) is the covariance of \(X\) and \(Y\). When a Grandmaster plays a sequence of \(m\) games of chess, she is, independently, equally likely to win, lose or draw each game. If the values of the random variables \(W\), \(L\) and \(D\) are the numbers of her wins, losses and draws respectively, justify briefly the following claims:

  1. \(W + L + D\) has variance \(0\,\);
  2. \(W + L\) has a binomial distribution.
Find the value of \(\displaystyle {\mathrm{Cov}(W,L) \over \sqrt{\mathrm{Var}(W) \mathrm{Var}(L)}}\;\).

Show Solution
\begin{align*} && \var[X+Y] &= \E\left [(X+Y-\E[X+Y])^2 \right] \\ &&&= \E \left [ (X - \E[X] + Y - \E[Y])^2 \right] \\ &&&= \E \left [(X - \E[X])^2 + (Y-\E[Y])^2 + 2(X-\E[X])(Y-\E[Y]) \right] \\ &&&= \E \left [(X - \E[X])^2 \right]+\E \left [(Y-\E[Y])^2 \right]+\E \left [2(X-\E[X])(Y-\E[Y]) \right] \\ &&&= \var[X] + \var[Y] + 2 \mathrm{Cov}(X,Y) \end{align*}
  1. \(W+L+D = m\) where \(m\) is the number of games, which has variance \(0\). Therefore \(W+L+D\) has variance \(0\).
  2. The probability of a decisive game is \(\frac23\) and \(W+L\) is the number of decisive games. Each game is independent so this meets the criteria for a binomial distribution.
Notice \(W+L \sim B(m, \tfrac23)\) and \(W, L, D \sim B(m, \tfrac13)\); in particular \(\var[W+L] = m \tfrac23 \tfrac13 = \tfrac29m\) and \(\var[W] = \var[L] = \var[D] = m \tfrac13 \tfrac23 = \tfrac29m\). \begin{align*} && \var[W+L] &= \var[W] + \var[L] + 2\mathrm{Cov}(W,L) \\ \Rightarrow && \tfrac29m &= \tfrac29m + \tfrac29m + 2\mathrm{Cov}(W,L) \\ \Rightarrow && \mathrm{Cov}(W,L) &= -\tfrac19m \\ \Rightarrow && \frac{\mathrm{Cov}(W,L) }{\sqrt{\var[W]\var[L]}} &= \frac{-\tfrac19m}{\tfrac29m} = -\frac12 \end{align*}
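The answer \(-\tfrac12\) can be confirmed by exact enumeration of all \(3^m\) game sequences, using `fractions` for exact arithmetic (a sketch; the function name is mine):

```python
from fractions import Fraction
from itertools import product

def win_loss_moments(m):
    """Exact Cov(W, L), Var(W), Var(L) over all 3^m equally likely sequences
    of m games, each won/lost/drawn with probability 1/3."""
    p = Fraction(1, 3) ** m  # probability of any particular sequence
    EW = EL = EWL = EW2 = EL2 = Fraction(0)
    for seq in product("WLD", repeat=m):
        w, l = seq.count("W"), seq.count("L")
        EW += p * w
        EL += p * l
        EWL += p * w * l
        EW2 += p * w * w
        EL2 += p * l * l
    return EWL - EW * EL, EW2 - EW ** 2, EL2 - EL ** 2

cov, varW, varL = win_loss_moments(4)
# cov = -4/9 and varW = varL = 8/9, so cov / sqrt(varW * varL) = -1/2
```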
2000 Paper 3 Q14
D: 1700.0 B: 1500.0

The random variable \(X\) takes only the values \(x_1\) and \(x_2\) (where \( x_1 \not= x_2 \)), and the random variable \(Y\) takes only the values \(y_1\) and \(y_2\) (where \(y_1 \not= y_2\)). Their joint distribution is given by $$ \P ( X = x_1 , Y = y_1 ) = a \ ; \ \ \P ( X = x_1 , Y = y_2 ) = q - a \ ; \ \ \P ( X = x_2 , Y = y_1 ) = p - a \ . $$ Show that if \(\E(X Y) = \E(X)\E(Y)\) then $$ (a - p q ) ( x_1 - x_2 ) ( y_1 - y_2 ) = 0 . $$ Hence show that two random variables each taking only two distinct values are independent if \(\E(X Y) = \E(X) \E(Y)\). Give a joint distribution for two random variables \(A\) and \(B\), each taking the three values \(- 1\), \(0\) and \(1\) with probability \({1 \over 3}\), which have \(\E(A B) = \E( A)\E (B)\), but which are not independent.

Show Solution
\begin{align*} \mathbb{P}(X = x_1) &= a + (q - a) = q \\ \mathbb{P}(X = x_2) &= 1 - q \\ \mathbb{P}(Y = y_1) & = a + (p - a) = p \\ \mathbb{P}(Y = y_2) & = 1 - p \end{align*} \begin{align*} \mathbb{E}(X)\mathbb{E}(Y) &= \l qx_1 + (1-q)x_2 \r \l p y_1 + (1-p)y_2\r \\ &= qpx_1y_1 + q(1-p)x_1y_2 + (1-q)px_2y_1 + (1-q)(1-p)x_2y_2 \\ \mathbb{E}(XY) &= ax_1y_1 + (q-a)x_1y_2 + (p-a)x_2y_1 + (1 + a - p - q)x_2y_2 \end{align*} Therefore \(\mathbb{E}(XY) - \mathbb{E}(X)\mathbb{E}(Y)\) is a polynomial of degree 2 in the \(x_i, y_i\). If \(x_1 = x_2\) then we have: \begin{align*} \mathbb{E}(X)\mathbb{E}(Y) &=x_1 \l p y_1 + (1-p)y_2\r \\ \mathbb{E}(XY) &= x_1(ay_1 + (q-a)y_2 + (p-a)y_1 + (1 + a - p - q)y_2) \\ &= x_1 (py_1 + (1-p)y_2) \end{align*} so the difference vanishes and \((x_1 - x_2)\) is a factor; by symmetry \((y_1 - y_2)\) is also a factor. It remains to check the coefficient of \(x_1y_1\), which is \(a - pq\), to complete the factorisation: \[\mathbb{E}(XY) - \mathbb{E}(X)\mathbb{E}(Y) = (a - pq)(x_1 - x_2)(y_1 - y_2),\] so \(\mathbb{E}(XY) = \mathbb{E}(X)\mathbb{E}(Y)\) gives \((a - pq)(x_1 - x_2)(y_1 - y_2) = 0\). For any two random variables each taking two distinct values, we can find \(a, p, q\) satisfying the relations above. The variables \(X\) and \(Y\) are independent if \(\mathbb{P}(X = x_i, Y = y_j) = \mathbb{P}(X = x_i)\mathbb{P}(Y = y_j)\) for all \(i, j\). Since \(x_1 \neq x_2\) and \(y_1 \neq y_2\), \(\E(XY) = \E(X)\E(Y) \Rightarrow a = pq\). But if \(a = pq\), we have \(\mathbb{P}(X = x_1, Y = y_1) = pq = \mathbb{P}(X = x_1)\mathbb{P}(Y = y_1)\), and the other relations drop out similarly, e.g. \(\mathbb{P}(X = x_1, Y = y_2) = q - pq = q(1-p)\); hence \(X\) and \(Y\) are independent. Consider \begin{align*} \mathbb{P}(A = -1, B = 1) &= \frac{1}{6} \\ \mathbb{P}(A = -1, B = -1) &= \frac{1}{6} \\ \mathbb{P}(A = 0, B = 0) &= \frac{1}{3} \\ \mathbb{P}(A = 1, B = -1) &= \frac{1}{6} \\ \mathbb{P}(A = 1, B = 1) &= \frac{1}{6} \end{align*} with all other pairs having probability \(0\). Each of \(A\) and \(B\) takes each of \(-1, 0, 1\) with probability \(\frac13\), and \(\mathbb{E}(AB) = \frac16(1 - 1 - 1 + 1) = 0 = \mathbb{E}(A)\mathbb{E}(B)\), but \(\mathbb{P}(A = 0, B = 0) = \frac13 \neq \frac19 = \mathbb{P}(A = 0)\mathbb{P}(B = 0)\), so \(A\) and \(B\) are not independent.
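A quick exact check of a joint distribution of this shape, with the four corners \((\pm1, \pm1)\) given probability \(\tfrac16\) each and \((0,0)\) probability \(\tfrac13\) (a sketch; variable names are mine):

```python
from fractions import Fraction

# Joint distribution: corners (+-1, +-1) get 1/6 each, (0, 0) gets 1/3,
# every other pair gets probability 0.
sixth, third = Fraction(1, 6), Fraction(1, 3)
joint = {(-1, -1): sixth, (-1, 1): sixth, (1, -1): sixth, (1, 1): sixth, (0, 0): third}

# Marginals of A and B
pA = {a: sum(p for (x, _), p in joint.items() if x == a) for a in (-1, 0, 1)}
pB = {b: sum(p for (_, y), p in joint.items() if y == b) for b in (-1, 0, 1)}

EA = sum(a * p for a, p in pA.items())
EB = sum(b * p for b, p in pB.items())
EAB = sum(a * b * p for (a, b), p in joint.items())
# E(AB) = E(A)E(B) = 0, yet P(A=0, B=0) = 1/3 != 1/9 = P(A=0)P(B=0)
```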
1997 Paper 3 Q14
D: 1700.0 B: 1516.0

An industrial process produces rectangular plates of mean length \(\mu_{1}\) and mean breadth \(\mu_{2}\). The length and breadth vary independently with non-zero standard deviations \(\sigma_{1}\) and \(\sigma_{2}\) respectively. Find the means and standard deviations of the perimeter and of the area of the plates. Show that the perimeter and area are not independent.

Show Solution
Let \(L\) and \(B\) be the length and breadth; they are independent with means \(\mu_1, \mu_2\) and variances \(\sigma_1^2, \sigma_2^2\) (no particular distribution is needed, only these moments), so \begin{align*} && \mathbb{E}(\text{perimeter}) &= \E(2(L+B)) \\ &&&= 2\E[L]+2\E[B] \\ &&&= 2(\mu_1+\mu_2) \\ &&\var[\text{perimeter}] &= \E\left [ (2(L+B))^2 \right] - \left ( \E[2(L+B)] \right)^2 \\ &&&= 4\E[L^2+2LB+B^2] - 4(\mu_1+\mu_2)^2 \\ &&&= 4(\sigma_1^2+\mu_1^2+2\mu_1\mu_2+\sigma_2^2+\mu_2^2) - 4(\mu_1+\mu_2)^2\\ &&&= 4(\sigma_1^2+\sigma_2^2) \\ &&\text{sd}[\text{perimeter}] &= 2\sqrt{\sigma_1^2+\sigma_2^2} \\ \\ && \E[\text{area}] &= \E[LB] \\ &&&= \E[L]\E[B] \\ &&&= \mu_1\mu_2 \\ && \var[\text{area}] &= \E[(LB)^2] - \left (\E[LB] \right)^2 \\ &&&= \E[L^2]\E[B^2]-\mu_1^2\mu_2^2 \\ &&&= (\mu_1^2+\sigma_1^2)(\mu_2^2+\sigma_2^2) -\mu_1^2\mu_2^2 \\ &&&= \sigma_1^2\mu_2^2 + \sigma_2^2\mu_1^2 + \sigma_1^2\sigma_2^2\\ && \text{sd}(\text{area}) &= \sqrt{\sigma_1^2\mu_2^2 + \sigma_2^2\mu_1^2 + \sigma_1^2\sigma_2^2} \\ \\ && \E[\text{perimeter} \cdot \text{area}] &= \E[2(L+B)LB] \\ &&&= 2\E[L^2]\E[B] + 2\E[L]\E[B^2] \\ &&&= 2(\sigma_1^2+\mu_1^2)\mu_2 + 2(\sigma_2^2+\mu_2^2)\mu_1 \\ && \E[\text{perimeter}] \E[\text{area}] &= 2(\mu_1+\mu_2) \cdot \mu_1\mu_2 \end{align*} The difference between the last two expressions is \(2(\sigma_1^2\mu_2 + \sigma_2^2\mu_1) \neq 0\), since the standard deviations are non-zero and the mean dimensions are positive. Therefore \(\E[\text{perimeter} \cdot \text{area}] \neq \E[\text{perimeter}]\,\E[\text{area}]\), so the perimeter and area cannot be independent. [See also STEP 2006 Paper 3 Q14]
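The moment formulas above hold for any independent \(L, B\) with the stated means and variances, so they can be verified exactly on a two-point distribution taking the values \(\mu_i \pm \sigma_i\) with probability \(\tfrac12\) each (a sketch with arbitrary example numbers; all names are mine):

```python
from fractions import Fraction

mu1, s1 = Fraction(5), Fraction(2)  # example mean/sd of length (arbitrary)
mu2, s2 = Fraction(3), Fraction(1)  # example mean/sd of breadth (arbitrary)

# Two-point distributions with the required means and variances
L_vals = [mu1 - s1, mu1 + s1]
B_vals = [mu2 - s2, mu2 + s2]
quarter = Fraction(1, 4)  # independence: each (L, B) pair has probability 1/4

def E(f):
    """Exact expectation of f(L, B) under the product distribution."""
    return sum(quarter * f(l, b) for l in L_vals for b in B_vals)

var_perim = E(lambda l, b: (2 * (l + b)) ** 2) - E(lambda l, b: 2 * (l + b)) ** 2
var_area = E(lambda l, b: (l * b) ** 2) - E(lambda l, b: l * b) ** 2
cov_pa = (E(lambda l, b: 2 * (l + b) * l * b)
          - E(lambda l, b: 2 * (l + b)) * E(lambda l, b: l * b))
```

Here `var_perim` matches \(4(\sigma_1^2+\sigma_2^2)\), `var_area` matches \(\sigma_1^2\mu_2^2 + \sigma_2^2\mu_1^2 + \sigma_1^2\sigma_2^2\), and `cov_pa` equals \(2(\sigma_1^2\mu_2 + \sigma_2^2\mu_1)\), which is non-zero.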
1995 Paper 3 Q12
D: 1700.0 B: 1484.0

The random variables \(X\) and \(Y\) are independently normally distributed with means 0 and variances 1. Show that the joint probability density function for \((X,Y)\) is \[ \mathrm{f}(x,y)=\frac{1}{2\pi}\mathrm{e}^{-\frac{1}{2}(x^{2}+y^{2})}\qquad-\infty < x < \infty,-\infty < y < \infty. \] If \((x,y)\) are the coordinates, referred to rectangular axes, of a point in the plane, explain what is meant by saying that this density is radially symmetrical. The random variables \(U\) and \(V\) have a joint probability density function which is radially symmetrical (in the above sense). By considering the straight line with equation \(U=kV,\) or otherwise, show that \[ \mathrm{P}\left(\frac{U}{V} < k\right)=2\mathrm{P}(U < kV,V > 0). \] Hence, or otherwise, show that the probability density function of \(U/V\) is \[ \mathrm{g}(k)=\frac{1}{\pi(1+k^{2})}\qquad-\infty < k < \infty. \]
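No solution is given above; as a plausibility check (not a proof), a seeded Monte Carlo sketch compares the empirical distribution of \(U/V\) for independent standard normals with the CDF \(\tfrac12 + \tfrac1\pi\arctan k\) implied by \(\mathrm{g}\):

```python
import math
import random

# Fixed seed for reproducibility; this is a sanity check, not a derivation.
random.seed(0)
n = 200_000
ks = (-2.0, -1.0, 0.0, 1.0, 2.0)
counts = {k: 0 for k in ks}
for _ in range(n):
    r = random.gauss(0, 1) / random.gauss(0, 1)  # ratio of independent N(0,1)s
    for k in ks:
        if r < k:
            counts[k] += 1
empirical = {k: counts[k] / n for k in ks}
# Compare against the Cauchy CDF 1/2 + arctan(k)/pi obtained by integrating g
```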

1991 Paper 3 Q16
D: 1700.0 B: 1504.3

The random variables \(X\) and \(Y\) take integer values \(x\) and \(y\) respectively which are restricted by \(x\geqslant1,\) \(y\geqslant1\) and \(2x+y\leqslant2a\) where \(a\) is an integer greater than 1. The joint probability is given by \[ \mathrm{P}(X=x,Y=y)=c(2x+y), \] where \(c\) is a positive constant, within this region and zero elsewhere. Obtain, in terms of \(x,c\) and \(a,\) the marginal probability \(\mathrm{P}(X=x)\) and show that \[ c=\frac{6}{a(a-1)(8a+5)}. \] Show that when \(y\) is an even number the marginal probability \(\mathrm{P}(Y=y)\) is \[ \frac{3(2a-y)(2a+2+y)}{2a(a-1)(8a+5)} \] and find the corresponding expression when \(y\) is odd. Evaluate \(\mathrm{E}(Y)\) in terms of \(a\).
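No solution is given above, but the stated constants can be verified by direct enumeration of the support (a sketch with exact arithmetic; `check_formulas` is a name of my choosing):

```python
from fractions import Fraction

def check_formulas(a):
    """For integer a > 1, enumerate the support x >= 1, y >= 1, 2x + y <= 2a
    and verify the stated value of c and the even-y marginal exactly."""
    pts = [(x, y) for x in range(1, a) for y in range(1, 2 * (a - x) + 1)]
    c = Fraction(6, a * (a - 1) * (8 * a + 5))
    ok = sum(c * (2 * x + y) for x, y in pts) == 1  # c normalises P
    for y in range(2, 2 * a - 1, 2):                # even values of y
        marginal = sum(c * (2 * x + yy) for x, yy in pts if yy == y)
        claimed = Fraction(3 * (2 * a - y) * (2 * a + 2 + y),
                           2 * a * (a - 1) * (8 * a + 5))
        ok = ok and marginal == claimed
    return ok
```

For example, with \(a = 2\) the support is \((1,1), (1,2)\), the total weight is \(7c\), and \(c = \tfrac17\) agrees with \(\frac{6}{2 \cdot 1 \cdot 21}\).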

1990 Paper 3 Q16
D: 1700.0 B: 1484.0

  1. A rod of unit length is cut into pieces of length \(X\) and \(1-X\); the latter is then cut in half. The random variable \(X\) is uniformly distributed over \([0,1].\) For some values of \(X\) a triangle can be formed from the three pieces of the rod. Show that the conditional probability that, if a triangle can be formed, it will be obtuse-angled is \(3-2\sqrt{2}.\)
  2. The bivariate distribution of the random variables \(X\) and \(Y\) is uniform over the triangle with vertices \((1,0),(1,1)\) and \((0,1).\) A pair of values \(x,y\) is chosen at random from this distribution and a (perhaps degenerate) triangle \(ABC\) is constructed with \(BC=x\) and \(CA=y\) and \(AB=2-x-y.\) Show that the construction is always possible and that \(\angle ABC\) is obtuse if and only if \[ y>\frac{x^{2}-2x+2}{2-x}. \] Deduce that the probability that \(\angle ABC\) is obtuse is \(3-4\ln2.\)

Show Solution
  2. TikZ diagram
    The construction is possible if \(x + y > 2-x-y \Rightarrow x+y > 1\) (which holds as the support triangle lies on or above the line \(x+y=1\)), and \(x + (2-x-y) > y \Rightarrow 1 > y\) (true as the triangle lies below the line \(y = 1\)) and \(y + (2-x-y) > x \Rightarrow 1 > x\) (true as the triangle lies left of the line \(x = 1\)); on the boundary the triangle is degenerate, which is permitted. By the cosine rule (noting that the denominator \(2x(2-x-y)\) is positive for a non-degenerate triangle): \begin{align*} && y^2 &= x^2 + (2-x-y)^2 - 2 x (2-x-y) \cos \angle ABC \\ \Rightarrow && \cos \angle ABC &= \frac{x^2+(2-x-y)^2 - y^2}{2x(2-x-y)} \\ &&&= \frac{4+2x^2-4x-4y+2xy}{2x(2-x-y)} \\ \underbrace{\Rightarrow}_{\cos \angle ABC < 0} && 0 &> 4+2x^2-4x-4y+2xy \\ \Rightarrow && 0 &> 2x^2-4x+4 + 2(x-2)y \\ \underbrace{\Rightarrow}_{2-x \,>\, 0} && y &> \frac{x^2-2x+2}{2-x} \\ &&&= -x + \frac{2}{2-x} \end{align*}
    TikZ diagram
    The region where \(\angle ABC\) is obtuse is the part of the support triangle above this curve, so its area is: \begin{align*} A &= 1 - \int_0^1 \left ( -x + \frac{2}{2-x} \right)\d x \\ &= 1 - \left [-\frac12 x^2 - 2 \ln(2-x) \right]_0^1 \\ &= 1 + \frac12 -2 \ln 2 \\ &= \frac32 - 2 \ln 2 \end{align*} Dividing by the area \(\frac12\) of the support triangle gives the probability: \(\frac{\frac32 - 2 \ln 2}{1/2} = 3 - 4 \ln 2\)
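For completeness, a sketch of part (i), whose working does not appear above: the three pieces have lengths \(X\), \(\tfrac{1-X}{2}\) and \(\tfrac{1-X}{2}\).

```latex
A triangle is possible iff
\[
X < \tfrac{1-X}{2} + \tfrac{1-X}{2} = 1 - X \iff X < \tfrac12
\]
(the other two triangle inequalities hold automatically), so a triangle forms with
probability $\tfrac12$. The triangle is isosceles, so only the angle opposite the
side of length $X$ can be obtuse, which happens precisely when
\[
X^2 > \left(\tfrac{1-X}{2}\right)^2 + \left(\tfrac{1-X}{2}\right)^2
     = \tfrac{(1-X)^2}{2}
\iff \sqrt{2}\,X > 1 - X
\iff X > \sqrt{2} - 1 .
\]
Since $X$ is uniform on $[0,1]$,
\[
\mathrm{P}(\text{obtuse} \mid \text{triangle})
= \frac{\tfrac12 - (\sqrt{2} - 1)}{\tfrac12}
= 3 - 2\sqrt{2}.
\]
```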