Problems


12 problems found

2017 Paper 3 Q12
D: 1700.0 B: 1500.2

The discrete random variables \(X\) and \(Y\) can each take the values \(1\), \(\ldots\,\), \(n\) (where \(n\ge2\)). Their joint probability distribution is given by \[ \P(X=x, \ Y=y) = k(x+y) \,, \] where \(k\) is a constant.

  1. Show that \[ \P(X=x) = \dfrac{n+1+2x}{2n(n+1)}\,. \] Hence determine whether \(X\) and \(Y\) are independent.
  2. Show that the covariance of \(X\) and \(Y\) is negative.


Solution:

  1. \(\,\) \begin{align*} && \mathbb{P}(X = x) &= \sum_{y=1}^n \mathbb{P}(X=x,Y=y) \\ &&&= \sum_{y=1}^n k(x+y) \\ &&&= nkx + k\frac{n(n+1)}2 \\ \\ && 1 &= \sum_{x=1}^n \mathbb{P}(X=x) \\ &&&= nk\frac{n(n+1)}{2} + kn\frac{n(n+1)}2 \\ &&&= kn^2(n+1) \\ \Rightarrow && k &= \frac{1}{n^2(n+1)} \\ \Rightarrow && \mathbb{P}(X = x) &= \frac{nx}{n^2(n+1)} + \frac{n(n+1)}{2n^2(n+1)} \\ &&&= \frac{n+1+2x}{2n(n+1)} \\ \\ && \mathbb{P}(X=x)\mathbb{P}(Y=y) &= \frac{(n+1)^2+2(n+1)(x+y)+4xy}{4n^2(n+1)^2} \\ &&&\neq \frac{x+y}{n^2(n+1)} \end{align*} Therefore \(X\) and \(Y\) are not independent.
  2. \(\,\) \begin{align*} && \E[X] &= \sum_{x=1}^n x \mathbb{P}(X=x) \\ &&&= \sum_{x=1}^n x \frac{n+1+2x}{2n(n+1)} \\ &&&= \frac{1}{2n(n+1)} \left ( (n+1) \sum x + 2\sum x^2\right)\\ &&&= \frac{1}{2n(n+1)} \left ( \frac{n(n+1)^2}{2} + \frac{n(n+1)(2n+1)}{3} \right) \\ &&&= \frac{1}{2} \left ( \frac{n+1}{2} + \frac{2n+1}{3} \right)\\ &&&= \frac{7n+5}{12} \\ \\ && \textrm{Cov}(X,Y) &= \mathbb{E}\left[XY\right] - \E[X] \E[Y] \\ &&&= \sum_{x=1}^n \sum_{y=1}^n xy \frac{x+y}{n^2(n+1)} - \E[X]^2 \\ &&&= \frac{1}{n^2(n+1)} \sum \sum (x^2 y+xy^2) - \E[X]^2 \\ &&&= \frac{2}{n^2(n+1)} \left (\sum y \right )\left (\sum x^2\right ) - \E[X]^2 \\ &&&=\frac{(n+1)(2n+1)}{6} - \left ( \frac{7n+5}{12}\right)^2 \\ &&&= \frac1{144} \left (24(2n^2+3n+1) - (49n^2+70n+25) \right)\\ &&&= -\frac{(n-1)^2}{144} \\ &&& < 0 \end{align*} using \(\E[Y] = \E[X]\) and \(\sum\sum(x^2y+xy^2) = 2\left(\sum x^2\right)\left(\sum y\right)\) by symmetry, and noting that \((n-1)^2 > 0\) since \(n \ge 2\).
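As a quick sanity check on the algebra (not part of the required argument), the moments can be computed exactly by enumeration with rational arithmetic; a Python sketch, assuming the value of \(k\) derived above:

```python
from fractions import Fraction

def moments(n):
    """Exact E[X], E[XY] and Cov(X, Y) for P(X=x, Y=y) = k(x+y), x, y in 1..n."""
    k = Fraction(1, n * n * (n + 1))                # value of k derived above
    grid = [(x, y) for x in range(1, n + 1) for y in range(1, n + 1)]
    assert sum(k * (x + y) for x, y in grid) == 1   # probabilities sum to 1
    ex = sum(x * k * (x + y) for x, y in grid)
    exy = sum(x * y * k * (x + y) for x, y in grid)
    return ex, exy, exy - ex * ex                   # E[Y] = E[X] by symmetry

for n in range(2, 9):
    ex, exy, cov = moments(n)
    assert ex == Fraction(7 * n + 5, 12)
    assert cov < 0
```

Exact fractions avoid any floating-point doubt about the sign of the covariance.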

2015 Paper 3 Q13
D: 1700.0 B: 1500.0

Each of the two independent random variables \(X\) and \(Y\) is uniformly distributed on the interval~\([0,1]\).

  1. By considering the lines \(x+y =\) \(\mathrm{constant}\) in the \(x\)-\(y\) plane, find the cumulative distribution function of \(X+Y\).
  2. Hence show that the probability density function \(f\) of \((X+Y)^{-1}\) is given by \[ f(t) = \begin{cases} 2t^{-2} -t^{-3} & \text{for \( \tfrac12 \le t \le 1\)} \\ t^{-3} & \text{for \(1\le t <\infty\)}\\ 0 & \text{otherwise}. \end{cases} \] Evaluate \(\E\Big(\dfrac1{X+Y}\Big)\,\).
  3. Find the cumulative distribution function of \(Y/X\) and use this result to find the probability density function of \(\dfrac X {X+Y}\). Write down \(\E\Big( \dfrac X {X+Y}\Big)\) and verify your result by integration.


Solution:

  1. \(\mathbb{P}(X + Y \leq c) \) is the area of the part of the unit square \([0,1]^2\) lying on or below the line \(x + y = c\). There are two non-trivial cases: \[\mathbb{P}(X + Y \leq c) = \begin{cases} 0 & \text{ if } c \leq 0 \\ \frac{c^2}{2} & \text{ if } 0 \leq c \leq 1 \\ 1- \frac{(2-c)^2}{2} & \text{ if } 1 \leq c \leq 2 \\ 1 & \text{ otherwise} \end{cases}\]
  2. \begin{align*} && \mathbb{P}((X + Y)^{-1} \leq t) &= 1- \mathbb{P}(X + Y \leq \frac1{t}) \\ \Rightarrow && f_{(X+Y)^{-1}}(t) &= 0 -\begin{cases} 0 & \text{ if } \frac1{t} \leq 0 \\ \frac{\d}{\d t}\frac{1}{2t^2} & \text{ if } \frac{1}{t} \leq 1 \\ \frac{\d}{\d t} \l 1- \frac{(2-\frac1t)^2}{2} \r & \text{ if } 1 \leq \frac{1}{t} \leq 2 \\ 0 & \text{ otherwise}\end{cases} \\ && &= \begin{cases} t^{-3} & \text{ if } t \geq 1 \\ (2-\frac1t)t^{-2} & \text{ if } \frac12 \leq t \leq 1\\ 0 & \text{ otherwise}\end{cases} \\ && &= \begin{cases} t^{-3} & \text{ if } t \geq 1 \\ 2t^{-2}-t^{-3} & \text{ if } \frac12 \leq t \leq 1\\ 0 & \text{ otherwise}\end{cases} \end{align*} Therefore, \begin{align*} \E \Big(\dfrac1{X+Y}\Big) &= \int_{\frac12}^{\infty} t f_{(X+Y)^{-1}}(t) \, \d t \\ &= \int_{\frac12}^{1} t f_{(X+Y)^{-1}}(t) \, \d t + \int_{1}^{\infty} t f_{(X+Y)^{-1}}(t) \d t\\ &= \int_{\frac12}^{1} \l 2t^{-1} - t^{-2} \r \, \d t + \int_{1}^{\infty} t^{-2} \d t\\ &= \left [ 2 \ln (t) + t^{-1} \right]_{\frac12}^{1} + \left [ -t^{-1} \right ]_{1}^{\infty} \\ &= 1 + 2 \ln 2 -2 + 1 \\ &= 2 \ln 2 \end{align*}
  3. \begin{align*} &&\mathbb{P} \l \frac{Y}{X} \leq c \r &= \mathbb{P}( Y \leq c X) \\ &&&= \begin{cases} 0 & \text{if } c \leq 0 \\ \frac{c}{2} & \text{if } 0 \leq c \leq 1 \\ 1-\frac{1}{2c} & \text{if } 1 \leq c \end{cases} \\ \\ \Rightarrow && \mathbb{P} \l \frac{X}{X+Y} \leq t\r &= \mathbb{P} \l \frac{1}{1+\frac{Y}{X}} \leq t\r \\ &&&= \mathbb{P} \l \frac{1}{t} \leq 1+\frac{Y}{X}\r \\ &&&= \mathbb{P} \l \frac{1}{t} - 1\leq \frac{Y}{X}\r \\ &&&= 1- \mathbb{P} \l \frac{Y}{X} \leq \frac{1}{t} - 1\r \\ &&&= 1 - \begin{cases} 0 & \text{if } \frac1{t}-1 \leq 0 \\ \frac{1}{2t} - \frac{1}{2} & \text{if } 0 \leq \frac1{t}-1 \leq 1 \\ 1-\frac{t}{2-2t} & \text{if } 1 \leq \frac1{t}-1 \end{cases} \\ && f_{\frac{X}{X+Y}}(t) &= \begin{cases} \frac{1}{2(1-t)^2} & \text{if } 0 \leq t \leq \frac12 \\ \frac{1}{2t^2} & \text{if } \frac12 \leq t \leq 1 \\ 0 & \text{otherwise} \end{cases} \\ \Rightarrow && \mathbb{E} \l \frac{X}{X+Y} \r &= \int_0^1 t f_{\frac{X}{X+Y}}(t) \d t \\ &&&= \int_0^{\frac12} \frac{t}{2(1-t)^2} \d t + \int_{\frac12}^1 \frac{1}{2t} \d t \\ &&& = \l \frac{1}{2} - \frac{\ln 2}{2} \r + \frac{\ln 2}{2} = \frac{1}{2} \\ \\ && \mathbb{E} \l \frac{X}{X+Y} \r &= \int_0^1 \int_0^1 \frac{x}{x+y} \d y\d x \\ &&&= \int_0^1 \l x \ln (x+1) - x \ln x \r \d x \\ &&&= \left [\frac{x^2}2 \ln(x+1) - \frac{x^2}{2} \ln(x) \right]_0^1 -\int_0^1 \l \frac{x^2}{2(x+1)} - \frac{x}{2} \r \d x \\ &&&= \frac{\ln 2}{2} + \frac{1}{4} - \int_0^1 \frac{x^2-1+1}{2(x+1)}\d x \\ &&&= \frac{\ln 2}{2} + \frac{1}{4} - \int_0^1 \frac{x -1}{2} + \frac{1}{2(x+1)}\d x \\ &&&= \frac{\ln 2}{2} + \frac{1}{4} - \frac{1}{4} + \frac{1}{2} - \frac{\ln 2}{2} \\ &&&= \frac{1}{2} \end{align*} Alternatively, \(1 = \mathbb{E} \l \frac{X+Y}{X+Y} \r = \mathbb{E} \l \frac{X}{X+Y} \r + \mathbb{E} \l \frac{Y}{X+Y} \r = 2 \mathbb{E} \l \frac{X}{X+Y} \r\) by symmetry, and the expectation certainly exists since \(0 \le \frac{X}{X+Y} \le 1\), so \(\mathbb{E} \l \frac{X}{X+Y} \r = \frac12\).
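Both expectations can be spot-checked by a quick Monte Carlo simulation (a sketch; the seed, sample size and tolerances are arbitrary choices):

```python
import math
import random

random.seed(0)
N = 200_000
inv_total = frac_total = 0.0
for _ in range(N):
    x, y = random.random(), random.random()   # X, Y ~ U[0, 1], independent
    inv_total += 1.0 / (x + y)
    frac_total += x / (x + y)

mean_inv = inv_total / N    # estimate of E[1/(X+Y)] = 2 ln 2 ≈ 1.386
mean_frac = frac_total / N  # estimate of E[X/(X+Y)] = 1/2
assert abs(mean_inv - 2 * math.log(2)) < 0.05
assert abs(mean_frac - 0.5) < 0.01
```

The tolerance for the first estimate is generous because \(1/(X+Y)\) has infinite variance, so its sample mean converges slowly.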

2013 Paper 3 Q12
D: 1700.0 B: 1500.0

A list consists only of letters \(A\) and \(B\) arranged in a row. In the list, there are \(a\) letter \(A\)s and \(b\) letter \(B\)s, where \(a\ge2\) and \(b\ge2\), and \(a+b=n\). Each possible ordering of the letters is equally probable. The random variable \(X_1\) is defined by \[ X_1 = \begin{cases} 1 & \text{if the first letter in the row is \(A\)};\\ 0 & \text{otherwise.} \end{cases} \] The random variables \(X_k\) (\(2 \le k \le n\)) are defined by \[ X_k = \begin{cases} 1 & \text{if the \((k-1)\)th letter is \(B\) and the \(k\)th is \(A\)};\\ 0 & \text{otherwise.} \end{cases} \] The random variable \(S\) is defined by \(S = \sum\limits_ {i=1}^n X_i\,\).

  1. Find expressions for \(\E(X_i)\), distinguishing between the cases \(i=1\) and \(i\ne1\), and show that \(\E(S)= \dfrac{a(b+1)}n\,\).
  2. Show that:
    1. for \(j\ge3\), \(\E(X_1X_j) = \dfrac{a(a-1)b}{n(n-1)(n-2)}\,\);
    2. \[ \sum\limits_{i=2}^{n-2} \bigg( \sum\limits_{j=i+2}^n \E(X_iX_j)\bigg) = \dfrac{a(a-1)b(b-1)}{2n(n-1)}\,\]
    3. \(\var(S) = \dfrac {a(a-1)b(b+1)}{n^2(n-1)}\,\).


Solution:

  1. Notice that \(\E[X_1] = \frac{a}{n}\). For \(i > 1\), the probability that \(X_i = 1\) is \(\frac{b}{n} \cdot \frac{a}{n-1}\), since the \((i-1)\)th letter must be \(B\) and the \(i\)th must be \(A\). So \begin{align*} && \E[S] &= \E[X_1] + \sum_{i=2}^n \E[X_i] \\ &&&= \frac{a}{n} + (n-1) \frac{ab}{n(n-1)} \\ &&&= \frac{a(b+1)}{n} \end{align*}
  2. \(\,\)
    1. The probability that \(X_1X_j = 1\) is \(\frac{a}{n} \cdot \frac{b}{n-1} \cdot \frac{a-1}{n-2} = \frac{a(a-1)b}{n(n-1)(n-2)}\): there is nothing special about the order of the positions, and the first letter is an \(A\) with probability \(\frac{a}{n}\); given this, there are \(a-1\) \(A\)s among the \(n-1\) remaining letters, and so on. Therefore \(\E[X_1X_j] = \frac{a(a-1)b}{n(n-1)(n-2)}\).
    2. \(\E[X_iX_j]\) when the pairs don't overlap is \(\frac{a}{n} \frac{b}{n-1} \frac{a-1}{n-2} \frac{b-1}{n-3}\), and so \begin{align*} && \sum\limits_{i=2}^{n-2} \bigg( \sum\limits_{j=i+2}^n \E(X_iX_j)\bigg) &= \sum\limits_{i=2}^{n-2} \bigg( \sum\limits_{j=i+2}^n \frac{a(a-1)b(b-1)}{n(n-1)(n-2)(n-3)}\bigg) \\ &&&= \frac{a(a-1)b(b-1)}{n(n-1)(n-2)(n-3)}\sum\limits_{i=2}^{n-2} \bigg( \sum\limits_{j=i+2}^n 1\bigg) \\ &&&= \frac{a(a-1)b(b-1)}{n(n-1)(n-2)(n-3)}\sum\limits_{i=2}^{n-2} (n-(i+1)) \\ &&&= \frac{a(a-1)b(b-1)}{n(n-1)(n-2)(n-3)} \left ((n-1)(n-3)-\frac{(n-2)(n-1)}{2}+1 \right) \\ &&&= \frac{a(a-1)b(b-1)}{n(n-1)(n-2)(n-3)} \left ( \frac{2n^2-8n+6-n^2+3n-2+2}{2}\right) \\ &&&= \frac{a(a-1)b(b-1)}{n(n-1)(n-2)(n-3)} \left ( \frac{n^2-5n+6}{2}\right) \\ &&&= \frac{a(a-1)b(b-1)}{n(n-1)(n-2)(n-3)} \cdot \frac{(n-2)(n-3)}{2} \\ &&&= \frac{a(a-1)b(b-1)}{2n(n-1)} \end{align*}
    3. We also need to consider the remaining cross terms. \(X_iX_{i+1}=0\), since \(X_i = 1\) requires the \(i\)th letter to be \(A\) while \(X_{i+1} = 1\) requires it to be \(B\); the same applies to \(X_1X_2\), and so all the cross terms are accounted for. Therefore (using \(X_i^2 = X_i\)) \begin{align*} && \E[S^2] &= \E \left [\sum X_i^2 + 2\sum_{i < j} X_i X_j \right] \\ &&&= \frac{a(b+1)}{n} +2(n-2)\frac{a(a-1)b}{n(n-1)(n-2)}+ 2 \frac{a(a-1)b(b-1)}{2n(n-1)} \\ &&&= \frac{a(b+1)}{n} +\frac{2a(a-1)b}{n(n-1)} + \frac{a(a-1)b(b-1)}{n(n-1)} \\ &&&= \frac{a(b+1)}{n} +\frac{a(a-1)b(b+1)}{n(n-1)} \\ && \var[S] &= \E[S^2] - \left ( \E[S] \right)^2 \\ &&&= \frac{a(b+1)}{n} + \frac{a(a-1)b(b+1)}{n(n-1)} - \frac{a^2(b+1)^2}{n^2} \\ &&&= \frac{a(b+1) \left (n(n-1) + (a-1)b n -a(b+1)(n-1) \right)}{n^2(n-1)} \\ &&&= \frac{a(b+1) (n-a)(n-b-1)}{n^2(n-1)} \\ &&&= \frac{a(a-1)b(b+1)}{n^2(n-1)} \end{align*}
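For small \(a\) and \(b\), the formulae for \(\E(S)\) and \(\var(S)\) can be verified by enumerating all distinct orderings exactly; a Python sketch:

```python
from fractions import Fraction
from itertools import permutations

def stats(a, b):
    """Exact E[S] and Var[S] over all distinct orderings of a A's and b B's."""
    n = a + b
    rows = set(permutations("A" * a + "B" * b))   # each ordering equally likely
    s_vals = [(row[0] == "A") + sum(row[k - 1] == "B" and row[k] == "A"
                                    for k in range(1, n)) for row in rows]
    m = len(rows)
    es = Fraction(sum(s_vals), m)
    var = Fraction(sum(s * s for s in s_vals), m) - es * es
    return es, var

for a, b in [(2, 2), (3, 2), (2, 3), (3, 3)]:
    n = a + b
    es, var = stats(a, b)
    assert es == Fraction(a * (b + 1), n)
    assert var == Fraction(a * (a - 1) * b * (b + 1), n * n * (n - 1))
```

(Here \(S\) is computed directly as the indicator sum defined in the question, which counts the blocks of consecutive \(A\)s.)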

2010 Paper 3 Q13
D: 1700.0 B: 1516.0

In this question, \({\rm Corr}(U,V)\) denotes the product moment correlation coefficient between the random variables \(U\) and \(V\), defined by \[ \mathrm{Corr}(U,V) \equiv \frac{\mathrm{Cov}(U,V)}{\sqrt{\var(U)\var(V)}}\,. \] The independent random variables \(Z_1\), \(Z_2\) and \(Z_3\) each have expectation 0 and variance 1. What is the value of \(\mathrm{Corr} (Z_1,Z_2)\)? Let \(Y_1 = Z_1\) and let \[ Y_2 = \rho _{12} Z_1 + (1 - {\rho_{12}^2})^{ \frac12} Z_ 2\,, \] where \(\rho_{12}\) is a given constant with \(-1<\rho_{12}<1\). Find \(\E(Y_2)\), \(\var(Y_2)\) and \(\mathrm{Corr}(Y_1, Y_2)\). Now let \(Y_3 = aZ_1 + bZ_2 + cZ_3\), where \(a\), \(b\) and \(c\) are real constants and \(c\ge0\). Given that \(\E(Y_3) = 0\), \(\var(Y_3) = 1\), \( \mathrm{Corr}(Y_1, Y_3) =\rho_{13} \) and \( \mathrm{Corr}(Y_2, Y_3)= \rho_{23}\), express \(a\), \(b\) and \(c\) in terms of \(\rho_{23}\), \(\rho_{13}\) and \(\rho_{12}\). Given constants \(\mu_i\) and \(\sigma_i\), for \(i=1\), \(2\) and \(3\), give expressions in terms of the \(Y_i\) for random variables \(X_i\) such that \(\E(X_i) = \mu_i\), \(\var(X_i) = \sigma_ i^2\) and \(\mathrm{Corr}(X_i,X_j) = \rho_{ij}\).


Solution: \begin{align*} \mathrm{Corr} (Z_1,Z_2) &= \frac{\mathrm{Cov}(Z_1,Z_2)}{\sqrt{\var(Z_1)\var(Z_2)}} \\ &= \frac{\mathbb{E}(Z_1 Z_2)}{\sqrt{1 \cdot 1}} \\ &= \frac{\mathbb{E}(Z_1)\mathbb{E}(Z_2)}{\sqrt{1 \cdot 1}} \\ &= \frac{0}{1} \\ &= 0 \end{align*} \begin{align*} && \mathbb{E}(Y_2) &= \mathbb{E}(\rho_{12} Z_1 + (1 - {\rho_{12}^2})^{ \frac12} Z_ 2) \\ &&&= \mathbb{E}(\rho_{12} Z_1) + \mathbb{E}( (1 - {\rho_{12}^2})^{ \frac12} Z_ 2) \\ &&&= \rho_{12}\mathbb{E}( Z_1) + (1 - {\rho_{12}^2})^{ \frac12}\mathbb{E}( Z_ 2) \\ &&&= 0\\ \\ && \textrm{Var}(Y_2) &= \textrm{Var}(\rho _{12} Z_1 + (1 - {\rho_{12}^2})^{ \frac12} Z_ 2) \\ &&&= \textrm{Var}(\rho_{12} Z_1)+2\,\textrm{Cov}(\rho_{12} Z_1,(1 - {\rho_{12}^2})^{ \frac12} Z_ 2 ) + \textrm{Var}((1 - {\rho_{12}^2})^{ \frac12} Z_ 2) \\ &&&= \rho_{12}^2\textrm{Var}( Z_1)+2\rho_{12} (1 - {\rho_{12}^2})^{ \frac12} \textrm{Cov}(Z_1, Z_ 2 ) + (1 - {\rho_{12}^2})\textrm{Var}(Z_ 2) \\ &&&= \rho_{12}^2 + (1-\rho_{12}^2) = 1 \\ \\ && \textrm{Cov}(Y_1, Y_2) &= \mathbb{E}((Y_1-0)(Y_2-0)) \\ &&&= \mathbb{E}(Z_1 \cdot (\rho _{12} Z_1 + (1 - {\rho_{12}^2})^{ \frac12} Z_ 2)) \\ &&&= \rho_{12} \mathbb{E}(Z_1^2) + (1-\rho_{12}^2)^{\frac12}\mathbb{E}(Z_1 Z_2) \\ &&&= \rho_{12} \\ \Rightarrow && \textrm{Corr}(Y_1, Y_2) &= \frac{\textrm{Cov}(Y_1, Y_2)}{\sqrt{\textrm{Var}(Y_1)\textrm{Var}(Y_2)}} \\ &&&= \frac{\rho_{12}}{1 \cdot 1} = \rho_{12} \end{align*} Now suppose \(Y_3 =aZ_1 +bZ_2+cZ_3\) with \(\mathbb{E}(Y_3) = 0\) (automatic, since each \(Z_i\) has mean \(0\)), \(\textrm{Var}(Y_3) = 1 = a^2+b^2+c^2\) and \(\textrm{Corr}(Y_1, Y_3) = \rho_{13}, \textrm{Corr}(Y_2, Y_3) = \rho_{23}\). 
Since \(\var(Y_1) = \var(Y_2) = \var(Y_3) = 1\), each correlation equals the corresponding covariance. \begin{align*} && \textrm{Corr}(Y_1,Y_3) &= \textrm{Cov}(Y_1, Y_3) \\ &&&= \textrm{Cov}(Z_1, aZ_1 +bZ_2+cZ_3) \\ &&&= a \\ \Rightarrow && a &= \rho_{13} \\ \\ && \textrm{Corr}(Y_2,Y_3) &= \textrm{Cov}(Y_2, Y_3) \\ &&&= \textrm{Cov}(\rho_{12}Z_1+(1-\rho_{12}^2)^\frac12Z_2, \rho_{13}Z_1 +bZ_2+cZ_3) \\ &&&= \rho_{12}\rho_{13}+(1-\rho_{12}^2)^\frac12b \\ \Rightarrow && \rho_{23} &= \rho_{12}\rho_{13}+(1-\rho_{12}^2)^\frac12b \\ \Rightarrow && b &= \frac{\rho_{23}-\rho_{12}\rho_{13}}{(1-\rho_{12}^2)^\frac12} \\ && c &= \sqrt{1-\rho_{13}^2-\frac{(\rho_{23}-\rho_{12}\rho_{13})^2}{(1-\rho_{12}^2)}} \end{align*} using \(a^2+b^2+c^2 = 1\) and \(c \ge 0\) for the last line. Finally, let \(X_i = \mu_i + \sigma_i Y_i\); shifting and scaling by \(\sigma_i > 0\) give \(\E(X_i) = \mu_i\) and \(\var(X_i) = \sigma_i^2\) while leaving the correlations unchanged.
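The construction can be checked numerically by representing each \(Y_i\) by its coefficient vector over \((Z_1, Z_2, Z_3)\): for independent zero-mean unit-variance \(Z_i\), the covariance of two linear combinations is the dot product of their coefficient vectors. A sketch with arbitrary (assumed admissible) correlation values:

```python
import math

def corr(u, v):
    """Correlation of u·Z and v·Z for independent unit-variance Z's."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))

r12, r13, r23 = 0.3, 0.5, 0.4       # hypothetical test values
y1 = (1.0, 0.0, 0.0)
y2 = (r12, math.sqrt(1 - r12 ** 2), 0.0)
a = r13
b = (r23 - r12 * r13) / math.sqrt(1 - r12 ** 2)
c = math.sqrt(1 - a * a - b * b)    # c >= 0, fixing the sign
y3 = (a, b, c)

assert abs(corr(y1, y2) - r12) < 1e-9
assert abs(corr(y1, y3) - r13) < 1e-9
assert abs(corr(y2, y3) - r23) < 1e-9
assert abs(sum(t * t for t in y3) - 1.0) < 1e-9   # Var(Y3) = 1
```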

2007 Paper 3 Q12
D: 1700.0 B: 1487.4

I choose a number from the integers \(1, 2, \ldots, (2n-1)\) and the outcome is the random variable \(N\). Calculate \( \E(N)\) and \(\E(N^2)\). I then repeat a certain experiment \(N\) times, the outcome of the \(i\)th experiment being the random variable \(X_i\) (\(1\le i \le N\)). For each \(i\), the random variable \(X_i\) has mean \(\mu\) and variance \(\sigma^2\), and \(X_i\) is independent of \(X_j\) for \(i\ne j\) and also independent of \(N\). The random variable \(Y\) is defined by \(Y= \sum\limits_{i=1}^NX_i\). Show that \(\E(Y)=n\mu\) and that \(\mathrm{Cov}(Y,N) = \frac13n(n-1)\mu\). Find \(\var(Y) \) in terms of \(n\), \(\sigma^2\) and \(\mu\).


Solution: \begin{align*} && \E[N] &= \sum_{i=1}^{2n-1} \frac{i}{2n-1} \\ &&&= \frac{2n(2n-1)}{2(2n-1)} = n\\ && \E[N^2] &= \sum_{i=1}^{2n-1} \frac{i^2}{2n-1} \\ &&&= \frac{(2n-1)(2n)(4n-1)}{6(2n-1)} \\ &&&= \frac{n(4n-1)}{3} \\ && \var[N] &= \frac{n(4n-1)}{3} - n^2 \\ &&&= \frac{n^2-n}{3} \end{align*} \begin{align*} && \E[Y] &= \E \left [ \E \left [ \sum_{i=1}^N X_i \,\Big|\, N\right] \right]\\ &&&= \E \left[ N\mu \right] = n\mu \\ \\ && \mathrm{Cov}(Y,N) &= \mathbb{E}[NY] - \E[N]\E[Y] \\ &&&= \E \left [ \E \left [N \sum_{i=1}^N X_i \,\Big|\, N\right] \right] - n^2 \mu \\ &&&= \E[N^2\mu] - n^2 \mu \\ &&&= \left ( \frac{n(4n-1)}{3} - n^2 \right) \mu \\ &&&= \frac{n^2-n}{3}\mu \\ \\ && \E[Y^2] &= \E \left [ \E \left [ \Big ( \sum_{i=1}^N X_i \Big) ^2 \,\Big|\, N \right ] \right] \\ &&&= \E \left [ \E \left [ \sum_{i=1}^N X_i ^2 + 2\sum_{i<j} X_iX_j \,\Big|\, N \right ] \right] \\ &&&= \E \left [ N(\sigma^2 + \mu^2) + (N^2-N)\mu^2\right] \\ &&&= n(\sigma^2+\mu^2) + \left ( \frac{n(4n-1)}{3}-n \right)\mu^2 \\ &&&= n\sigma^2 + \frac{4n^2-n}{3} \mu^2 \\ \Rightarrow && \var[Y] &= n\sigma^2 + \frac{4n^2-n}{3} \mu^2 - n^2\mu^2 \\ &&&= n\sigma^2 + \frac{n^2-n}{3} \mu^2 \end{align*} (consistent with \(\var[Y] = \E[N]\sigma^2 + \var[N]\mu^2\), the law of total variance for a random sum).
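A Monte Carlo check of \(\E(Y)\), \(\mathrm{Cov}(Y,N)\) and \(\var(Y)\) (a sketch with \(n = 3\) and \(X_i \sim U[0,1]\), so \(\mu = \frac12\) and \(\sigma^2 = \frac1{12}\); the seed, sample size and tolerances are arbitrary):

```python
import random

random.seed(1)
n = 3
mu, sigma2 = 0.5, 1.0 / 12.0          # moments of X_i ~ U[0, 1]
T = 200_000
ys, ns = [], []
for _ in range(T):
    N = random.randint(1, 2 * n - 1)  # N uniform on 1, ..., 2n-1
    ys.append(sum(random.random() for _ in range(N)))
    ns.append(N)

mean = lambda v: sum(v) / len(v)
ey, en = mean(ys), mean(ns)
cov_yn = mean([y * N for y, N in zip(ys, ns)]) - ey * en
var_y = mean([y * y for y in ys]) - ey * ey

assert abs(ey - n * mu) < 0.05                                  # E[Y] = n mu
assert abs(cov_yn - (n * n - n) / 3 * mu) < 0.05                # Cov(Y, N)
assert abs(var_y - (n * sigma2 + (n * n - n) / 3 * mu * mu)) < 0.05
```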

2006 Paper 3 Q14
D: 1700.0 B: 1516.0

For any random variables \(X_1\) and \(X_2\), state the relationship between \(\E(aX_1+bX_2)\) and \(\E(X_1)\) and \(\E(X_2)\), where \(a\) and \(b\) are constants. If \(X_1\) and \(X_2\) are independent, state the relationship between \(\E(X_1X_2)\) and \(\E(X_1)\) and \(\E(X_2)\). An industrial process produces rectangular plates. The length and the breadth of the plates are modelled by independent random variables \(X_1\) and \(X_2\) with non-zero means \(\mu_1\) and \(\mu_2\) and non-zero standard deviations \(\sigma_1\) and \(\sigma_2\), respectively. Using the results in the paragraph above, and without quoting a formula for \(\var(aX_1+bX_2)\), find the means and standard deviations of the perimeter \(P\) and area \(A\) of the plates. Show that \(P\) and \(A\) are not independent. The random variable \(Z\) is defined by \(Z=P-\alpha A\), where \(\alpha \) is a constant. Show that \(Z\) and \(A\) are not independent if \[ \alpha \ne \dfrac{2(\mu_1^{\vphantom2} \sigma_2^2 +\mu_2^{\vphantom2}\sigma_1^2)} { \mu_1^2 \sigma_2^2 +\mu_2^2\sigma_1^2 + \sigma_1^2\sigma_2^2 } \;. \] Given that \(X_1\) and \(X_2\) can each take values 1 and 3 only, and that they each take these values with probability \(\frac 12\), show that \(Z\) and \(A\) are not independent for any value of \(\alpha\).


Solution: \(\E(aX_1+bX_2) = a \E(X_1) + b\E(X_2)\) for any \(X_1, X_2\), and \(\E(X_1X_2)=\E(X_1)\E(X_2)\) if \(X_1\) and \(X_2\) are independent. \begin{align*} && \E(P) &= \E(2(X_1+X_2)) = 2(\E[X_1]+\E[X_2]) \\ &&&= 2(\mu_1 + \mu_2) \\ && \var(P) &= \E[\left ( 2(X_1+X_2) \right)^2] - \E[2(X_1+X_2)]^2 \\ &&&= 4\E[X_1^2+2X_1X_2+X_2^2] -4(\mu_1 + \mu_2)^2 \\ &&&= 4(\mu_1^2 + \sigma_1^2 + 2\mu_1\mu_2 + \mu_2^2 + \sigma_2^2) - 4(\mu_1 + \mu_2)^2 \\ &&&= 4(\sigma_1^2+\sigma_2^2) \\ && \textrm{SD}(P) &= 2 \sqrt{\sigma_1^2+\sigma_2^2}\\ \\ && \E(A) &= \E[X_1X_2] = \E[X_1]\E[X_2] \\ &&&= \mu_1\mu_2 \\ && \var(A) &= \E[(X_1X_2)^2] - (\mu_1\mu_2)^2 \\ &&&= (\mu_1^2+\sigma_1^2)(\mu_2^2+\sigma_2^2) - (\mu_1\mu_2)^2\\ &&&= \mu_1^2 \sigma_2^2 + \mu_2^2 \sigma_1^2 + \sigma_1^2 \sigma_2^2\\ && \textrm{SD}(A) &= \sqrt{\mu_1^2 \sigma_2^2 + \mu_2^2 \sigma_1^2 + \sigma_1^2 \sigma_2^2} \end{align*} \begin{align*} \E[PA] &= \E[2(X_1+X_2)X_1X_2] \\ &= 2\E[X_1^2X_2] + 2\E[X_1X_2^2]\\ &= 2(\mu_1^2 + \sigma_1^2)\mu_2 + 2\mu_1 (\mu_2^2+\sigma_2^2)\\ &\neq 2(\mu_1 + \mu_2)\mu_1\mu_2 \\ &= \E[P]\E[A] \end{align*} The two sides differ by \(2(\sigma_1^2\mu_2 + \sigma_2^2\mu_1)\), which is positive for plates with positive mean dimensions and non-zero standard deviations, so \(P\) and \(A\) are not independent. \begin{align*} && \E[Z] &= \E[P] - \alpha \E[A] \\ &&&= 2(\mu_1+\mu_2) - \alpha \mu_1 \mu_2 \\ \\ && \E[ZA] &= \E[PA - \alpha A^2] \\ &&&= 2(\mu_1^2 + \sigma_1^2)\mu_2 + 2\mu_1 (\mu_2^2+\sigma_2^2) - \alpha \E[A^2] \\ &&&= 2(\mu_1^2 + \sigma_1^2)\mu_2 + 2\mu_1 (\mu_2^2+\sigma_2^2) - \alpha \E[X_1^2]\E[X_2^2] \\ &&&= 2(\mu_1^2 + \sigma_1^2)\mu_2 + 2\mu_1 (\mu_2^2+\sigma_2^2) - \alpha (\mu_1^2+\sigma_1^2)(\mu_2^2+\sigma_2^2) \\ \text{if ind.} && \E[Z]\E[A] &= \E[ZA]\\ && (2(\mu_1+\mu_2) - \alpha \mu_1 \mu_2) \mu_1\mu_2 &= 2(\mu_1^2 + \sigma_1^2)\mu_2 + 2\mu_1 (\mu_2^2+\sigma_2^2) - \alpha (\mu_1^2+\sigma_1^2)(\mu_2^2+\sigma_2^2) \\ \Rightarrow && 2(\mu_1^2\mu_2+\mu_1\mu_2^2) - \alpha \mu_1^2\mu_2^2 &= 2(\mu_1^2\mu_2+\mu_1\mu_2^2) + 2\sigma_1^2\mu_2 + 2\sigma_2^2\mu_1 - \alpha (\mu_1^2+\sigma_1^2)(\mu_2^2+\sigma_2^2) \\ \Rightarrow && \alpha ((\mu_1^2+\sigma_1^2)(\mu_2^2+\sigma_2^2) - 
\mu_1^2\mu_2^2) &= 2(\sigma_1^2\mu_2 + \sigma_2^2\mu_1) \\ \Rightarrow && \alpha &= \frac{ 2(\sigma_1^2\mu_2 + \sigma_2^2\mu_1) }{\mu_1^2 \sigma_2^2 + \mu_2^2 \sigma_1^2 + \sigma_1^2 \sigma_2^2} \end{align*} So independence forces \(\alpha\) to equal this expression; contrapositively, \(Z\) and \(A\) are not independent whenever \(\alpha\) differs from it. \begin{array}{c|c|c|c|c|c} & X_1 & X_2 & A & P & Z \\ \hline 0.25 & 1 & 1 & 1 & 4 & 4-\alpha \\ 0.25 & 1 & 3 & 3 & 8 & 8-3\alpha \\ 0.25 & 3 & 1 & 3 & 8 & 8-3\alpha \\ 0.25 & 3 & 3 & 9 & 12 & 12-9\alpha \\ \end{array} If \(Z\) and \(A\) were independent then \(\mathbb{P}(A = 1, Z = 4-\alpha) = \mathbb{P}(A = 1)\mathbb{P}(Z = 4-\alpha)\); since \(\mathbb{P}(A = 1, Z = 4-\alpha) = \mathbb{P}(A = 1) = \frac14\), this would require \(\mathbb{P}(Z = 4-\alpha) = 1\), that is, \(4-\alpha = 8-3\alpha = 12-9\alpha\). But these equations are inconsistent: the first pair gives \(\alpha = 2\) while the second pair gives \(\alpha = \frac23\). Hence \(Z\) and \(A\) are not independent for any value of \(\alpha\).
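The final part can be verified by exact enumeration of the four equally likely outcomes; a Python sketch (for this distribution the critical value of \(\alpha\) works out to \(\frac89\)):

```python
from fractions import Fraction

outcomes = [(x1, x2, Fraction(1, 4)) for x1 in (1, 3) for x2 in (1, 3)]
E = lambda f: sum(p * f(x1, x2) for x1, x2, p in outcomes)

mu = E(lambda x1, x2: x1)                  # mu1 = mu2 = 2
var = E(lambda x1, x2: x1 * x1) - mu * mu  # sigma1^2 = sigma2^2 = 1
# critical alpha from the displayed expression, with equal mus and variances
alpha = 2 * (mu * var + mu * var) / (mu * mu * var + mu * mu * var + var * var)

A = lambda x1, x2: x1 * x2
Z = lambda x1, x2: 2 * (x1 + x2) - alpha * x1 * x2

# at the critical alpha, E[ZA] = E[Z]E[A] ...
assert alpha == Fraction(8, 9)
assert E(lambda x1, x2: Z(x1, x2) * A(x1, x2)) == E(Z) * E(A)
# ... but Z and A are still not independent:
p_joint = sum(p for x1, x2, p in outcomes if A(x1, x2) == 1 and Z(x1, x2) == 4 - alpha)
p_a = sum(p for x1, x2, p in outcomes if A(x1, x2) == 1)
p_z = sum(p for x1, x2, p in outcomes if Z(x1, x2) == 4 - alpha)
assert p_joint != p_a * p_z
```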

2005 Paper 3 Q12
D: 1700.0 B: 1516.0

Five independent timers time a runner as she runs four laps of a track. Four of the timers measure the individual lap times, the results of the measurements being the random variables \(T_1\) to \(T_4\), each of which has variance \(\sigma^2\) and expectation equal to the true time for the lap. The fifth timer measures the total time for the race, the result of the measurement being the random variable \(T\) which has variance \(\sigma^2\) and expectation equal to the true race time (which is equal to the sum of the four true lap times). Find a random variable \(X\) of the form \(aT+b(T_1+T_2+T_3+T_4)\), where \(a\) and \(b\) are constants independent of the true lap times, with the two properties:

  1. whatever the true lap times, the expectation of \(X\) is equal to the true race time;
  2. the variance of \(X\) is as small as possible.
Find also a random variable \(Y\) of the form \(cT+d(T_1+T_2+T_3+T_4)\), where \(c\) and \(d\) are constants independent of the true lap times, with the property that, whatever the true lap times, the expectation of \(Y^2\) is equal to \(\sigma^2\). In one particular race, \(T\) takes the value 220 seconds and \((T_1 + T_2 + T_3 + T_4)\) takes the value \(220.5\) seconds. Use the random variables \(X\) and \(Y\) to estimate an interval in which the true race time lies.


Solution: Let the expected total time for the race be \(\mu\). Let \(X = aT + b(T_1 + T_2+T_3+T_4)\) then \(\E[X] = a\E[T] + b\E[T_1+\cdots+T_4] = a \mu + b \mu = (a+b)\mu\). So \(a+b=1\). \begin{align*} && \var[X] &= a^2\var[T] + b^2(\var[T_1] + \var[T_2] + \var[T_3] + \var[T_4]) \\ &&&= a^2\sigma^2 + 4b^2 \sigma^2 \\ &&& = \sigma^2 (a^2 + 4(1-a)^2 ) \\ &&&= \sigma^2 (5a^2 - 8a + 4) \\ &&&= \sigma^2 \left ( 5 \left ( a - \frac45 \right)^2 - \frac{16}{5}+4 \right)\\ &&&= \sigma^2 \left ( 5 \left ( a - \frac45 \right)^2 + \frac{4}{5}\right) \end{align*} Therefore the variance is minimised when \(a = \frac45, b = \frac15\). Let \(Y = cT + d(T_1 + T_2+T_3+T_4)\) then \begin{align*} && \E[Y^2] &= \E \left [c^2T^2 + 2cd T(T_1+T_2+T_3+T_4) + d^2(T_1+T_2+T_3+T_4)^2 \right] \\ &&&= c^2 (\mu^2 + \sigma^2) + 2cd \mu^2 + d^2 (\var[T_1 + \cdots + T_4] + \mu^2) \\ &&&= c^2(\mu^2+\sigma^2) + 2cd \mu^2 + d^2(4\sigma^2 + \mu^2) \\ &&&= (c^2 + 2cd + d^2) \mu^2 + (c^2+4d^2) \sigma^2 \\ &&&= (c+d)^2 \mu^2 + (c^2+4d^2) \sigma^2 \\ \\ \Rightarrow && d &= -c \\ && 1 &= c^2 + 4d^2 \\ \Rightarrow && c &= \pm \frac{1}{\sqrt5} \\ && d &= \mp \frac{1}{\sqrt5} \end{align*} Given our results, our best estimate for \(\mu\) is \(\frac45 \cdot 220 + \frac15 \cdot 220.5 = 220.1\). Our estimate for \(\sigma^2\) is \(\left( \frac{1}{\sqrt{5}}(220.5-220) \right)^2 = \frac{1}{20}\). Note that \(\var[X] = \frac45\sigma^2\), estimated as \(\frac45 \cdot \frac{1}{20} = \frac{1}{25}\), giving a standard error of \(0.2\); an interval of two standard errors is \((220.1 - 0.4, 220.1 + 0.4) = (219.7, 220.5)\).
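A Monte Carlo check that \(X\) is unbiased for the true race time and that \(\E(Y^2) = \sigma^2\) (a sketch; the lap times, \(\sigma\), seed and tolerances are hypothetical choices):

```python
import random

random.seed(2)
laps = [55.0, 54.0, 56.0, 55.5]   # hypothetical true lap times
total = sum(laps)
sigma = 0.5
T = 100_000
xs, y2s = [], []
for _ in range(T):
    t = random.gauss(total, sigma)                  # fifth timer
    tsum = sum(random.gauss(l, sigma) for l in laps)  # four lap timers
    xs.append(0.8 * t + 0.2 * tsum)                 # X with a = 4/5, b = 1/5
    y2s.append(((t - tsum) / 5 ** 0.5) ** 2)        # Y^2 with c = -d = 1/sqrt(5)

mean = lambda v: sum(v) / len(v)
assert abs(mean(xs) - total) < 0.02      # E[X] = true race time
assert abs(mean(y2s) - sigma ** 2) < 0.02  # E[Y^2] = sigma^2
```

(Gaussian timing errors are an assumption for the simulation only; the identities themselves need only means and variances.)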

2002 Paper 3 Q14
D: 1700.0 B: 1500.0

Prove that, for any two discrete random variables \(X\) and \(Y\), \[ \mathrm{Var} \left(X + Y \right) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2 \, \mathrm{Cov}(X,Y), \] where \(\mathrm{Var}(X)\) is the variance of \(X\) and \(\mathrm{Cov}(X,Y)\) is the covariance of \(X\) and \(Y\). When a Grandmaster plays a sequence of \(m\) games of chess, she is, independently, equally likely to win, lose or draw each game. If the values of the random variables \(W\), \(L\) and \(D\) are the numbers of her wins, losses and draws respectively, justify briefly the following claims:

  1. \(W + L + D\) has variance \(0\,\);
  2. \(W + L\) has a binomial distribution.
Find the value of \(\displaystyle {\mathrm{Cov}(W,L) \over \sqrt{\mathrm{Var}(W) \mathrm{Var}(L)}}\;\).


Solution: \begin{align*} && \var[X+Y] &= \E\left [(X+Y-\E[X+Y])^2 \right] \\ &&&= \E \left [ (X - \E[X] + Y - \E[Y])^2 \right] \\ &&&= \E \left [(X - \E[X])^2 + (Y-\E[Y])^2 + 2(X-\E[X])(Y-\E[Y]) \right] \\ &&&= \E \left [(X - \E[X])^2 \right]+\E \left [(Y-\E[Y])^2 \right]+\E \left [2(X-\E[X])(Y-\E[Y]) \right] \\ &&&= \var[X] + \var[Y] + 2 \mathrm{Cov}(X,Y) \end{align*}

  1. \(W+L+D = m\), which is a constant, so \(\var(W+L+D) = 0\).
  2. The probability of a decisive game is \(\frac23\) and \(W+L\) is the number of decisive games. Each game is independent so this meets the criteria for a binomial distribution.
Notice \(W+L \sim B(m, \tfrac23)\) and \(W, L, D \sim B(m, \tfrac13)\); in particular \(\var[W+L] = m \tfrac23 \tfrac13 = \tfrac29m\) and \(\var[W] = \var[L] = \var[D] = m \tfrac13 \tfrac23 = \tfrac29m\). \begin{align*} && \var[W+L] &= \var[W] + \var[L] + 2\mathrm{Cov}(W,L) \\ \Rightarrow && \mathrm{Cov}(W,L) &= -\tfrac19m \\ \Rightarrow && \frac{\mathrm{Cov}(W,L) }{\sqrt{\var[W]\var[L]}} &= -\frac12 \end{align*}
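The value \(\mathrm{Cov}(W,L) = -\frac{m}{9}\) can be confirmed exactly from the trinomial distribution for small \(m\); a Python sketch:

```python
from fractions import Fraction
from math import comb

def chess_stats(m):
    """Exact Cov(W, L) and Var(W) when each game is W/L/D with prob 1/3 each."""
    dist = [(w, l, Fraction(comb(m, w) * comb(m - w, l), 3 ** m))
            for w in range(m + 1) for l in range(m - w + 1)]
    ew = sum(p * w for w, l, p in dist)
    el = sum(p * l for w, l, p in dist)
    cov = sum(p * w * l for w, l, p in dist) - ew * el
    var_w = sum(p * w * w for w, l, p in dist) - ew * ew
    return cov, var_w

for m in range(1, 7):
    cov, var_w = chess_stats(m)
    assert cov == Fraction(-m, 9)
    assert var_w == Fraction(2 * m, 9)
    assert cov / var_w == Fraction(-1, 2)   # Corr(W, L), since Var W = Var L
```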

2000 Paper 3 Q14
D: 1700.0 B: 1500.0

The random variable \(X\) takes only the values \(x_1\) and \(x_2\) (where \( x_1 \not= x_2 \)), and the random variable \(Y\) takes only the values \(y_1\) and \(y_2\) (where \(y_1 \not= y_2\)). Their joint distribution is given by $$ \P ( X = x_1 , Y = y_1 ) = a \ ; \ \ \P ( X = x_1 , Y = y_2 ) = q - a \ ; \ \ \P ( X = x_2 , Y = y_1 ) = p - a \ . $$ Show that if \(\E(X Y) = \E(X)\E(Y)\) then $$ (a - p q ) ( x_1 - x_2 ) ( y_1 - y_2 ) = 0 . $$ Hence show that two random variables each taking only two distinct values are independent if \(\E(X Y) = \E(X) \E(Y)\). Give a joint distribution for two random variables \(A\) and \(B\), each taking the three values \(- 1\), \(0\) and \(1\) with probability \({1 \over 3}\), which have \(\E(A B) = \E( A)\E (B)\), but which are not independent.


Solution: \begin{align*} \mathbb{P}(X = x_1) &= a + q - a = q \\ \mathbb{P}(X = x_2) &= 1 - q \\ \mathbb{P}(Y = y_1) & = a + p - a = p \\ \mathbb{P}(Y = y_2) & = 1 - p \end{align*} \begin{align*} \mathbb{E}(X)\mathbb{E}(Y) &= \l qx_1 + (1-q)x_2 \r \l p y_1 + (1-p)y_2\r \\ &= qpx_1y_1 + q(1-p)x_1y_2 + (1-q)px_2y_1 + (1-q)(1-p)x_2y_2 \\ \mathbb{E}(XY) &= ax_1y_1 + (q-a)x_1y_2 + (p-a)x_2y_1 + (1 + a - p - q)x_2y_2 \end{align*} Therefore \(\mathbb{E}(XY) - \mathbb{E}(X)\mathbb{E}(Y)\) is a polynomial in \(x_1, x_2, y_1, y_2\), linear in each of the \(x_i\) and each of the \(y_i\). If \(x_1 = x_2\) then we have: \begin{align*} \mathbb{E}(X)\mathbb{E}(Y) &=x_1 \l p y_1 + (1-p)y_2\r \\ \mathbb{E}(XY) &= x_1(ay_1 + (q-a)y_2 + (p-a)y_1 + (1 + a - p - q)y_2) \\ &= x_1 (py_1 + (1-p)y_2) \end{align*} so the difference vanishes when \(x_1 = x_2\), making \((x_1 - x_2)\) a factor, and by symmetry \((y_1 - y_2)\) is also a factor. It remains to check the coefficient of \(x_1y_1\), which is \(a - pq\), to complete the factorisation: \(\mathbb{E}(XY) - \mathbb{E}(X)\mathbb{E}(Y) = (a - pq)(x_1 - x_2)(y_1 - y_2)\). For any two random variables each taking two distinct values, we can find \(a, q, p\) as above. Since \(x_1 \neq x_2\) and \(y_1 \neq y_2\), the condition \(\E(XY) = \E(X)\E(Y)\) forces \(a = pq\). But if \(a = pq\), then \(\mathbb{P}(X = x_1, Y = y_1) = \mathbb{P}(X = x_1)\mathbb{P}(Y = y_1)\), and the other three relations drop out similarly, so \(X\) and \(Y\) are independent. For the final part, consider \begin{align*} \mathbb{P}(A = -1, B = -1) &= \frac{1}{6} \\ \mathbb{P}(A = -1, B = 1) &= \frac{1}{6} \\ \mathbb{P}(A = 0, B = 0) &= \frac{1}{3} \\ \mathbb{P}(A = 1, B = -1) &= \frac{1}{6} \\ \mathbb{P}(A = 1, B = 1) &= \frac{1}{6} \end{align*} Each of \(A\) and \(B\) takes the values \(-1, 0, 1\) with probability \(\frac13\), and \(\E(AB) = \tfrac16 - \tfrac16 - \tfrac16 + \tfrac16 = 0 = \E(A)\E(B)\), but \(\mathbb{P}(A = 0, B = 0) = \frac13 \neq \frac19 = \mathbb{P}(A = 0)\mathbb{P}(B = 0)\), so \(A\) and \(B\) are not independent.
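The counterexample is easy to verify exactly; a Python sketch of the joint distribution (with \(\mathbb{P}(A=1, B=1) = \frac16\) as the remaining non-zero entry):

```python
from fractions import Fraction

# joint distribution on (a, b); pairs not listed have probability 0
joint = {(-1, -1): Fraction(1, 6), (-1, 1): Fraction(1, 6),
         (0, 0): Fraction(1, 3),
         (1, -1): Fraction(1, 6), (1, 1): Fraction(1, 6)}

assert sum(joint.values()) == 1
E = lambda f: sum(p * f(a, b) for (a, b), p in joint.items())
pa = {v: sum(p for (a, b), p in joint.items() if a == v) for v in (-1, 0, 1)}
pb = {v: sum(p for (a, b), p in joint.items() if b == v) for v in (-1, 0, 1)}

# marginals are uniform on {-1, 0, 1} and E(AB) = E(A)E(B) = 0 ...
assert all(pa[v] == Fraction(1, 3) for v in pa)
assert all(pb[v] == Fraction(1, 3) for v in pb)
assert E(lambda a, b: a * b) == E(lambda a, b: a) * E(lambda a, b: b) == 0
# ... but A and B are not independent: P(A=0, B=0) = 1/3 != 1/9
assert joint[(0, 0)] != pa[0] * pb[0]
```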

1997 Paper 3 Q14
D: 1700.0 B: 1516.0

An industrial process produces rectangular plates of mean length \(\mu_{1}\) and mean breadth \(\mu_{2}\). The length and breadth vary independently with non-zero standard deviations \(\sigma_{1}\) and \(\sigma_{2}\) respectively. Find the means and standard deviations of the perimeter and of the area of the plates. Show that the perimeter and area are not independent.


Solution: Let \(L\) and \(B\) denote the length and breadth, so that \(L\) has mean \(\mu_1\) and variance \(\sigma_1^2\), and \(B\) has mean \(\mu_2\) and variance \(\sigma_2^2\) (no distributional assumption is needed). Then \begin{align*} && \mathbb{E}(\text{perimeter}) &= \E(2(L+B)) \\ &&&= 2\E[L]+2\E[B] \\ &&&= 2(\mu_1+\mu_2) \\ &&\var[\text{perimeter}] &= \E\left [ (2(L+B))^2 \right] - \left ( \E[2(L+B)] \right)^2 \\ &&&= 4\E[L^2+2LB+B^2] - 4(\mu_1+\mu_2)^2 \\ &&&= 4(\sigma_1^2+\mu_1^2+2\mu_1\mu_2+\sigma_2^2+\mu_2^2) - 4(\mu_1+\mu_2)^2\\ &&&= 4(\sigma_1^2+\sigma_2^2) \\ &&\text{sd}[\text{perimeter}] &= 2\sqrt{\sigma_1^2+\sigma_2^2} \\ \\ && \E[\text{area}] &= \E[LB] \\ &&&= \E[L]\E[B] \\ &&&= \mu_1\mu_2 \\ && \var[\text{area}] &= \E[(LB)^2] - \left (\E[LB] \right)^2 \\ &&&= \E[L^2]\E[B^2]-\mu_1^2\mu_2^2 \\ &&&= (\mu_1^2+\sigma_1^2)(\mu_2^2+\sigma_2^2) -\mu_1^2\mu_2^2 \\ &&&= \sigma_1^2\mu_2^2 + \sigma_2^2\mu_1^2 + \sigma_1^2\sigma_2^2\\ && \text{sd}(\text{area}) &= \sqrt{\sigma_1^2\mu_2^2 + \sigma_2^2\mu_1^2 + \sigma_1^2\sigma_2^2} \\ \\ && \E[\text{perimeter} \cdot \text{area}] &= \E[2(L+B)LB] \\ &&&= 2\E[L^2]\E[B] + 2\E[L]\E[B^2] \\ &&&= 2(\sigma_1^2+\mu_1^2)\mu_2 + 2(\sigma_2^2+\mu_2^2)\mu_1 \\ && \E[\text{perimeter}] \E[\text{area}] &= 2(\mu_1+\mu_2) \cdot \mu_1\mu_2 \end{align*} The difference \(\E[\text{perimeter} \cdot \text{area}] - \E[\text{perimeter}]\E[\text{area}] = 2(\sigma_1^2\mu_2 + \sigma_2^2\mu_1)\) is non-zero, since the standard deviations are non-zero and the mean dimensions are positive; therefore the perimeter and area cannot be independent. [See also STEP 2006 Paper 3 Q14]

1995 Paper 3 Q12
D: 1700.0 B: 1484.0

The random variables \(X\) and \(Y\) are independently normally distributed with means 0 and variances 1. Show that the joint probability density function for \((X,Y)\) is \[ \mathrm{f}(x,y)=\frac{1}{2\pi}\mathrm{e}^{-\frac{1}{2}(x^{2}+y^{2})}\qquad-\infty < x < \infty,-\infty < y < \infty. \] If \((x,y)\) are the coordinates, referred to rectangular axes, of a point in the plane, explain what is meant by saying that this density is radially symmetrical. The random variables \(U\) and \(V\) have a joint probability density function which is radially symmetrical (in the above sense). By considering the straight line with equation \(U=kV,\) or otherwise, show that \[ \mathrm{P}\left(\frac{U}{V} < k\right)=2\mathrm{P}(U < kV,V > 0). \] Hence, or otherwise, show that the probability density function of \(U/V\) is \[ \mathrm{g}(k)=\frac{1}{\pi(1+k^{2})}\qquad-\infty < k < \infty. \]
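No solution is recorded for this problem, but the final claim (that \(U/V\) has the standard Cauchy density \(\mathrm{g}\)) is easy to spot-check by simulation with \(U\), \(V\) independent standard normals, whose joint density is radially symmetrical; a sketch (seed, test point and tolerance are arbitrary):

```python
import math
import random

random.seed(3)
T = 200_000
k = 0.7                  # hypothetical test point for the CDF
count_below = 0
for _ in range(T):
    u, v = random.gauss(0, 1), random.gauss(0, 1)
    if u / v < k:
        count_below += 1

# CDF implied by the claimed density g: G(k) = 1/2 + arctan(k)/pi
expected = 0.5 + math.atan(k) / math.pi
assert abs(count_below / T - expected) < 0.01
```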

1991 Paper 3 Q16
D: 1700.0 B: 1504.3

The random variables \(X\) and \(Y\) take integer values \(x\) and \(y\) respectively which are restricted by \(x\geqslant1,\) \(y\geqslant1\) and \(2x+y\leqslant2a\) where \(a\) is an integer greater than 1. The joint probability is given by \[ \mathrm{P}(X=x,Y=y)=c(2x+y), \] where \(c\) is a positive constant, within this region and zero elsewhere. Obtain, in terms of \(x,c\) and \(a,\) the marginal probability \(\mathrm{P}(X=x)\) and show that \[ c=\frac{6}{a(a-1)(8a+5)}. \] Show that when \(y\) is an even number the marginal probability \(\mathrm{P}(Y=y)\) is \[ \frac{3(2a-y)(2a+2+y)}{2a(a-1)(8a+5)} \] and find the corresponding expression when \(y\) is odd. Evaluate \(\mathrm{E}(Y)\) in terms of \(a\).
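No solution is recorded for this problem, but the stated value of \(c\) and the even-\(y\) marginal can be confirmed by exact enumeration for small \(a\); a Python sketch:

```python
from fractions import Fraction

def check(a):
    """Verify total probability and the even-y marginal for a given a; return c."""
    c = Fraction(6, a * (a - 1) * (8 * a + 5))
    # region: x >= 1, y >= 1, 2x + y <= 2a
    region = [(x, y) for x in range(1, a) for y in range(1, 2 * a - 2 * x + 1)]
    assert sum(c * (2 * x + y) for x, y in region) == 1
    for y in range(2, 2 * a - 1, 2):   # even values of y in the region
        p_y = sum(c * (2 * x + yy) for x, yy in region if yy == y)
        assert p_y == Fraction(3 * (2 * a - y) * (2 * a + 2 + y),
                               2 * a * (a - 1) * (8 * a + 5))
    return c

for a in (2, 3, 4, 5):
    check(a)
```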