Year: 2005
Paper: 3
Question Number: 12
Course: UFM Statistics
Section: Bivariate data
Difficulty Rating: 1700.0
Difficulty Comparisons: 0
Banger Rating: 1516.0
Banger Comparisons: 1
Five independent timers time a runner as she runs four laps of a track. Four of the timers measure the individual lap times, the results of the measurements being the random variables $T_1$ to $T_4$, each of which has variance $\sigma^2$ and expectation equal to the true time for the lap. The fifth timer measures the total time for the race, the result of the measurement being the random variable $T$ which has variance $\sigma^2$ and expectation equal to the true race time (which is equal to the sum of the four true lap times).
Find a random variable $X$ of the form $aT+b(T_1+T_2+T_3+T_4)$, where $a$ and $b$ are constants independent of the true lap times, with the two properties:
\begin{enumerate}
\item whatever the true lap times, the expectation of $X$ is equal to the true race time;
\item the variance of $X$ is as small as possible.
\end{enumerate}
Find also a random variable $Y$ of the form $cT+d(T_1+T_2+T_3+T_4)$, where $c$ and $d$ are constants independent of the true lap times, with the property that, whatever the true lap times, the expectation of $Y^2$ is equal to $\sigma^2$.
In one particular race, $T$ takes the value 220 seconds and $(T_1 + T_2 + T_3 + T_4)$ takes the value $220.5$ seconds. Use the random variables $X$ and $Y$ to estimate an interval in which the true race time lies.
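Let the true race time be $\mu$, so that $\E[T] = \mu$ and $\E[T_1+T_2+T_3+T_4] = \mu$.
Let $X = aT + b(T_1 + T_2+T_3+T_4)$. Then $\E[X] = a\E[T] + b\E[T_1+\cdots+T_4] = a \mu + b \mu = (a+b)\mu$, and for this to equal $\mu$ whatever the true lap times we need $a+b=1$. Since the five timers are independent, the variances add: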
\begin{align*}
&& \var[X] &= a^2\var[T] + b^2(\var[T_1] + \var[T_2] + \var[T_3] + \var[T_4]) \\
&&&= a^2\sigma^2 + 4b^2 \sigma^2 \\
&&& = \sigma^2 (a^2 + 4(1-a)^2 ) \\
&&&= \sigma^2 (5a^2 - 8a + 4) \\
&&&= \sigma^2 \left ( 5 \left ( a - \frac45 \right)^2 - \frac{16}{5}+4 \right)\\
&&&= \sigma^2 \left ( 5 \left ( a - \frac45 \right)^2 + \frac{4}{5}\right)
\end{align*}
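Therefore the variance is minimised when $a = \frac45$ and $b = \frac15$, that is, $X = \frac45 T + \frac15(T_1+T_2+T_3+T_4)$, for which $\var[X] = \frac45\sigma^2$.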
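As a quick sanity check on the completed square, the following sketch scans a grid of values of $a$ (with $b = 1-a$) and picks the one minimising $\var[X]/\sigma^2 = a^2 + 4b^2$. The helper name `variance_factor` and the grid step of $1/100$ are illustrative choices, not part of the question.

```python
from fractions import Fraction


def variance_factor(a):
    """Return Var[X] / sigma^2 for X = a*T + (1 - a)*(T1 + T2 + T3 + T4)."""
    b = 1 - a
    return a * a + 4 * b * b


# Exact rational grid on [0, 1]; the step 1/100 is an arbitrary choice.
grid = [Fraction(k, 100) for k in range(101)]
best_a = min(grid, key=variance_factor)
best_b = 1 - best_a

print(best_a, best_b, variance_factor(best_a))  # 4/5 1/5 4/5
```

Using `Fraction` keeps the arithmetic exact, so the grid minimum lands on $a = \frac45$ without rounding noise.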
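Let $Y = cT + d(T_1 + T_2+T_3+T_4)$. Since $T$ is independent of the four lap timers, $\E[T(T_1+T_2+T_3+T_4)] = \E[T]\,\E[T_1+T_2+T_3+T_4] = \mu^2$, and so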
\begin{align*}
&& \E[Y^2] &= \E \left [c^2T^2 + 2cd T(T_1+T_2+T_3+T_4) + d^2(T_1+T_2+T_3+T_4)^2 \right] \\
&&&= c^2 (\mu^2 + \sigma^2) + 2cd \mu^2 + d^2 (\var[T_1 + \cdots + T_4] + \mu^2) \\
&&&= c^2(\mu^2+\sigma^2) + 2cd \mu^2 + d^2(4\sigma^2 + \mu^2) \\
&&&= (c^2 + 2cd + d^2) \mu^2 + (c^2+4d^2) \sigma^2 \\
&&&= (c+d)^2 \mu^2 + (c^2+4d^2) \sigma^2 \\
\\
\Rightarrow && d &= -c \\
&& 1 &= c^2 + 4d^2 \\
\Rightarrow && c &= \pm \frac{1}{\sqrt5} \\
&& d &= \mp \frac{1}{\sqrt5}
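\end{align*}
Here the conditions $d = -c$ and $c^2 + 4d^2 = 1$ arise because $\E[Y^2] = \sigma^2$ must hold whatever the value of $\mu$, so the coefficient of $\mu^2$ must vanish and the coefficient of $\sigma^2$ must equal $1$. Either choice of sign gives the same $Y^2 = \frac15\left(T - (T_1+T_2+T_3+T_4)\right)^2$.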
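The property $\E[Y^2] = \sigma^2$ can be checked exactly on a toy model. The sketch below assumes, purely for illustration, that each timer's error is $\pm\sigma$ with equal probability (which has mean $0$ and variance $\sigma^2$); the question itself specifies only the mean and variance. The function name `expected_Y_squared` and the lap times used are hypothetical.

```python
from fractions import Fraction
from itertools import product


def expected_Y_squared(laps, sigma):
    """Exact E[Y^2] for Y = (T - (T1+T2+T3+T4)) / sqrt(5).

    Assumed toy model: every timer's error is +sigma or -sigma with
    probability 1/2 each, independently of the other timers.
    """
    total = sum(laps)
    outcomes = list(product([-1, 1], repeat=5))  # signs of the 5 errors
    acc = Fraction(0)
    for e in outcomes:
        T = total + e[0] * sigma  # fifth timer: whole race
        S = sum(lap + ei * sigma for lap, ei in zip(laps, e[1:]))  # lap timers
        acc += (T - S) ** 2 / 5  # Y^2 with c = 1/sqrt(5), d = -1/sqrt(5)
    return acc / len(outcomes)


sigma = Fraction(1, 2)
print(expected_Y_squared([50, 55, 57, 58], sigma))  # 1/4, i.e. sigma^2
```

Changing the lap times leaves the result unchanged, which reflects the requirement that the coefficient of $\mu^2$ vanishes.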
In this race, $X$ takes the value $\frac45 \cdot 220 + \frac15 \cdot 220.5 = 220.1$, which is our best estimate of $\mu$.
Since $\E[Y^2] = \sigma^2$, we estimate $\sigma^2$ by the observed value of $Y^2$, namely $\left( \frac{1}{\sqrt{5}}(220.5-220) \right)^2 = \frac{1}{20}$.
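Note that $\var[X] = \frac45\sigma^2 \approx \frac45 \cdot \frac{1}{20} = \frac{1}{25}$, so the standard error of $X$ is approximately $\frac15 = 0.2$ seconds. Taking two standard errors either side of the estimate gives the interval $(220.1 - 0.4, 220.1 + 0.4) = (219.7, 220.5)$ seconds.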
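The arithmetic for the interval can be reproduced with a short sketch. The function name `race_time_interval` and its parameter `k` (the number of standard errors, set to 2 to match the choice made here) are illustrative, not part of the question.

```python
import math


def race_time_interval(T, S, k=2.0):
    """Interval x +/- k standard errors, where S = T1 + T2 + T3 + T4.

    Uses X = (4/5) T + (1/5) S as the point estimate and
    Y^2 = (S - T)^2 / 5 as the estimate of sigma^2.
    """
    x = 0.8 * T + 0.2 * S            # observed value of X
    sigma2_hat = (S - T) ** 2 / 5    # observed value of Y^2
    se = math.sqrt(0.8 * sigma2_hat)  # sqrt of Var[X] = (4/5) sigma^2
    return x - k * se, x + k * se


low, high = race_time_interval(220.0, 220.5)
print(round(low, 1), round(high, 1))  # 219.7 220.5
```

The width depends on the choice of `k`; two standard errors is a convention rather than something the question fixes.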