2005 Paper 3 Q12

Year: 2005
Paper: 3
Question Number: 12

Course: UFM Statistics
Section: Bivariate data

Difficulty: 1700.0 Banger: 1516.0

Problem

Five independent timers time a runner as she runs four laps of a track. Four of the timers measure the individual lap times, the results of the measurements being the random variables \(T_1\) to \(T_4\), each of which has variance \(\sigma^2\) and expectation equal to the true time for the lap. The fifth timer measures the total time for the race, the result of the measurement being the random variable \(T\) which has variance \(\sigma^2\) and expectation equal to the true race time (which is equal to the sum of the four true lap times). Find a random variable \(X\) of the form \(aT+b(T_1+T_2+T_3+T_4)\), where \(a\) and \(b\) are constants independent of the true lap times, with the two properties:
  1. whatever the true lap times, the expectation of \(X\) is equal to the true race time;
  2. the variance of \(X\) is as small as possible.
Find also a random variable \(Y\) of the form \(cT+d(T_1+T_2+T_3+T_4)\), where \(c\) and \(d\) are constants independent of the true lap times, with the property that, whatever the true lap times, the expectation of \(Y^2\) is equal to \(\sigma^2\). In one particular race, \(T\) takes the value 220 seconds and \((T_1 + T_2 + T_3 + T_4)\) takes the value \(220.5\) seconds. Use the random variables \(X\) and \(Y\) to estimate an interval in which the true race time lies.

Solution

Let the expected total time for the race be \(\mu\). Let \(X = aT + b(T_1 + T_2+T_3+T_4)\); then \(\E[X] = a\E[T] + b\E[T_1+\cdots+T_4] = a \mu + b \mu = (a+b)\mu\). For this to equal \(\mu\) whatever the true lap times, we need \(a+b=1\).

\begin{align*}
&& \var[X] &= a^2\var[T] + b^2(\var[T_1] + \var[T_2] + \var[T_3] + \var[T_4]) \\
&&&= a^2\sigma^2 + 4b^2 \sigma^2 \\
&&& = \sigma^2 (a^2 + 4(1-a)^2 ) \\
&&&= \sigma^2 (5a^2 - 8a + 4) \\
&&&= \sigma^2 \left ( 5 \left ( a - \frac45 \right)^2 - \frac{16}{5}+4 \right)\\
&&&= \sigma^2 \left ( 5 \left ( a - \frac45 \right)^2 + \frac{4}{5}\right)
\end{align*}

Therefore the variance is minimised when \(a = \frac45, b = \frac15\).

Let \(Y = cT + d(T_1 + T_2+T_3+T_4)\); then

\begin{align*}
&& \E[Y^2] &= \E \left [c^2T^2 + 2cd T(T_1+T_2+T_3+T_4) + d^2(T_1+T_2+T_3+T_4)^2 \right] \\
&&&= c^2 (\mu^2 + \sigma^2) + 2cd \mu^2 + d^2 (\var[T_1 + \cdots + T_4] + \mu^2) \\
&&&= c^2(\mu^2+\sigma^2) + 2cd \mu^2 + d^2(4\sigma^2 + \mu^2) \\
&&&= (c^2 + 2cd + d^2) \mu^2 + (c^2+4d^2) \sigma^2 \\
&&&= (c+d)^2 \mu^2 + (c^2+4d^2) \sigma^2 \\
\\
\Rightarrow && d &= -c \\
&& 1 &= c^2 + 4d^2 \\
\Rightarrow && c &= \pm \frac{1}{\sqrt5} \\
&& d &= \mp \frac{1}{\sqrt5}
\end{align*}

The cross term uses the independence of \(T\) from the lap timers, which gives \(\E[T(T_1+T_2+T_3+T_4)] = \mu^2\). Since \(\E[Y^2]\) must equal \(\sigma^2\) for every value of \(\mu\), the coefficient of \(\mu^2\) must vanish, which forces \(d=-c\); then \(c^2+4d^2 = 5c^2 = 1\).

Given our results, our best estimate for \(\mu\) is \(\frac45 \cdot 220 + \frac15 \cdot 220.5 = 220.1\). Our estimate for \(\sigma^2\) is \(\left( \frac{1}{\sqrt{5}}(220.5-220) \right)^2 = \frac{1}{20}\).

Note that \(\var[X] = \frac45\sigma^2 \approx \frac{1}{25}\), so the standard error of \(X\) is approximately \(\frac15 = 0.2\). Taking two standard errors either side of the estimate gives the interval \((220.1 - 0.4, 220.1 + 0.4) = (219.7, 220.5)\).
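
The arithmetic in the final step can be checked with a short Python sketch. This is only an illustrative check, not part of the original solution: the variable names are hypothetical, and the weights are taken from the vertex of the quadratic \(5a^2 - 8a + 4\) derived above.

```python
from fractions import Fraction
from math import sqrt

# Observed values from the problem statement.
T_total = Fraction(220)        # fifth timer: whole race
T_laps_sum = Fraction(441, 2)  # T_1 + T_2 + T_3 + T_4 = 220.5

# Var[X] / sigma^2 = 5a^2 - 8a + 4, a quadratic with its minimum at a = 8 / (2 * 5).
a = Fraction(8, 10)
b = 1 - a

x = a * T_total + b * T_laps_sum             # point estimate of the true race time
y_squared = (T_laps_sum - T_total) ** 2 / 5  # Y^2 with c = 1/sqrt(5), d = -1/sqrt(5)
var_x = (a**2 + 4 * b**2) * y_squared        # estimated Var[X] = (4/5) * sigma^2
half_width = 2 * sqrt(var_x)                 # two standard errors

print("a, b =", a, b)
print("estimate of mu =", x)
print("estimate of sigma^2 =", y_squared)
print("estimate of Var[X] =", var_x)
print("interval =", float(x) - half_width, float(x) + half_width)
```

Exact fractions are used so that the weights and the variance estimate can be compared directly with the values \(\frac45\), \(\frac15\), \(\frac{1}{20}\) and \(\frac{1}{25}\) in the working; only the final interval is converted to floating point.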
Rating Information

Difficulty Rating: 1700.0

Difficulty Comparisons: 0

Banger Rating: 1516.0

Banger Comparisons: 1

Problem source
Five independent timers time a runner as she runs four laps of a track. Four of the timers measure the individual lap times, the results of the measurements being the random variables $T_1$ to $T_4$, each of which has  variance $\sigma^2$ and expectation equal to the true time for the lap. The fifth timer measures the total time for the race, the result of the measurement being the random variable $T$ which has  variance $\sigma^2$ and expectation equal to the true race time (which is equal to the sum of the four true lap times).
                                                
Find a random variable $X$ of the form $aT+b(T_1+T_2+T_3+T_4)$, where $a$ and $b$ are constants independent of the true lap times, with the two properties:
\begin{enumerate}
\item whatever the true lap times, the expectation of $X$ is equal to the true race time;
\item the variance of $X$ is as small as possible.
\end{enumerate}
Find also a random variable $Y$ of the form $cT+d(T_1+T_2+T_3+T_4)$, where $c$ and $d$ are constants independent of the true lap times, with the property that, whatever the true lap times, the expectation of $Y^2$ is equal to $\sigma^2$.
In one particular race, $T$ takes the value 220 seconds and $(T_1 + T_2 + T_3 + T_4)$ takes the value $220.5$ seconds. Use the random variables $X$ and $Y$ to estimate an interval in which the true race time lies.
Solution source
Let the expected total time for the race be $\mu$.
Let $X = aT + b(T_1 + T_2+T_3+T_4)$; then $\E[X] = a\E[T] + b\E[T_1+\cdots+T_4] = a \mu + b \mu = (a+b)\mu$. For this to equal $\mu$ whatever the true lap times, we need $a+b=1$.

\begin{align*}
&& \var[X] &= a^2\var[T] + b^2(\var[T_1] + \var[T_2] + \var[T_3] + \var[T_4]) \\
&&&= a^2\sigma^2 + 4b^2 \sigma^2 \\
&&& = \sigma^2 (a^2 + 4(1-a)^2 ) \\
&&&= \sigma^2 (5a^2 - 8a + 4) \\
&&&= \sigma^2 \left ( 5 \left ( a - \frac45 \right)^2 - \frac{16}{5}+4 \right)\\
&&&= \sigma^2 \left ( 5 \left ( a - \frac45 \right)^2 + \frac{4}{5}\right)
\end{align*}

Therefore the variance is minimised when $a = \frac45, b = \frac15$.

Let $Y = cT + d(T_1 + T_2+T_3+T_4)$ then

\begin{align*}
&& \E[Y^2] &= \E \left [c^2T^2 + 2cd T(T_1+T_2+T_3+T_4) + d^2(T_1+T_2+T_3+T_4)^2 \right] \\
&&&= c^2 (\mu^2 + \sigma^2) + 2cd \mu^2 + d^2 (\var[T_1 + \cdots + T_4] + \mu^2) \\
&&&= c^2(\mu^2+\sigma^2) + 2cd \mu^2 + d^2(4\sigma^2 + \mu^2) \\
&&&= (c^2 + 2cd + d^2) \mu^2 + (c^2+4d^2) \sigma^2 \\
&&&= (c+d)^2 \mu^2 + (c^2+4d^2) \sigma^2 \\
\\
\Rightarrow && d &= -c \\
&& 1 &= c^2 + 4d^2 \\
\Rightarrow && c &= \pm \frac{1}{\sqrt5} \\
&& d &= \mp \frac{1}{\sqrt5}
\end{align*}

The cross term uses the independence of $T$ from the lap timers, which gives $\E[T(T_1+T_2+T_3+T_4)] = \mu^2$. Since $\E[Y^2]$ must equal $\sigma^2$ for every value of $\mu$, the coefficient of $\mu^2$ must vanish, which forces $d=-c$; then $c^2+4d^2 = 5c^2 = 1$.

Given our results, our best estimate for $\mu$ is $\frac45 \cdot 220 + \frac15 \cdot 220.5 = 220.1$.
Our estimate for $\sigma^2$ is $\left( \frac{1}{\sqrt{5}}(220.5-220) \right)^2 = \frac{1}{20}$.

Note that $\var[X] = \frac45\sigma^2 \approx \frac{1}{25}$, so the standard error of $X$ is approximately $\frac15 = 0.2$. Taking two standard errors either side of the estimate gives the interval $(220.1 - 0.4, 220.1 + 0.4) = (219.7, 220.5)$.
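
As a rough sanity check on the derivations, the three properties (the mean of $X$ equals the true race time, $\var[X] = \frac45\sigma^2$, and the mean of $Y^2$ equals $\sigma^2$) can be tested by simulation. The sketch below is not part of the original solution and rests on assumptions the problem does not make: the timer errors are taken to be normally distributed, and the lap times, the value of `sigma`, the seed and the number of trials are all hypothetical.

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

# Hypothetical true lap times and timer standard deviation (not from the problem).
true_laps = [54.0, 55.5, 56.0, 54.5]
mu = sum(true_laps)  # true race time for these made-up laps
sigma = 0.5
n_trials = 100_000

xs, y_squares = [], []
for _ in range(n_trials):
    # Assumption: each timer reading is normal about its true value.
    t_total = random.gauss(mu, sigma)
    t_laps_sum = sum(random.gauss(lap, sigma) for lap in true_laps)
    xs.append(0.8 * t_total + 0.2 * t_laps_sum)        # X with a = 4/5, b = 1/5
    y_squares.append((t_total - t_laps_sum) ** 2 / 5)  # Y^2 with c = -d = 1/sqrt(5)

mean_x = sum(xs) / n_trials
var_x = sum((x - mean_x) ** 2 for x in xs) / n_trials
mean_y_sq = sum(y_squares) / n_trials

print(f"mean of X   = {mean_x:.3f} (true race time {mu})")
print(f"Var[X]      = {var_x:.4f} (theory {0.8 * sigma**2:.4f})")
print(f"mean of Y^2 = {mean_y_sq:.4f} (sigma^2 = {sigma**2:.4f})")
```

The simulated values should be close to the theoretical ones but will not match exactly, since they are subject to sampling noise.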