Year: 2007
Paper: 2
Question Number: 14
Course: UFM Statistics
Section: Cumulative distribution functions
Although the paper was by no means an easy one, it was generally found to be a more accessible paper than last year's, with most questions clearly offering candidates an attackable starting-point. The candidature represented the usual range of mathematical talents, with a pleasingly high number of truly outstanding students; many more who were able to demonstrate a thorough grasp of the material in at least three questions; and a few whose three-hour-long experience was unlikely to have been a particularly pleasant one. However, even among these candidates, many were able to make some progress on at least two of the questions chosen. Really able candidates generally produced solid attempts at five or six questions, and quite a few produced outstanding efforts at up to eight questions. In general, it would be best if centres persuaded candidates not to spend valuable time needlessly in this way – it is a practice that is not to be encouraged, as it uses valuable examination time to little or no avail. Weaker brethren were often to be found scratching around at bits and pieces of several questions, with little of substance being produced on more than a couple. It is an important examination skill – more so now than ever, with most candidates not having to employ such a skill on the modular papers which constitute the bulk of their examination experience – for candidates to spend a few minutes at some stage of the examination deciding upon their optimal selection of questions to attempt. As a rule, question 1 is intended to be accessible to all takers, with question 2 usually similarly constructed. In the event, at least one – and usually both – of these two questions were among candidates' chosen questions. These, along with questions 3 and 6, were by far the most popular choices.
The majority of candidates only attempted questions in Section A (Pure Maths), and there were relatively few attempts at the Applied Maths questions in Sections B & C, with Mechanics proving the more popular of the two options. It struck me that, generally, the working produced on the scripts this year was rather better set-out, with a greater logical coherence to it, and this certainly helps the markers identify what each candidate thinks they are doing. Sadly, this general remark doesn't apply to the working produced on the Mechanics questions, such as they were. As last year, the presentation was usually appalling, with poorly labelled diagrams, often with forces missing from them altogether, and little or no attempt to state the principles that the candidates were attempting to apply.
Difficulty Rating: 1600.0
Difficulty Comparisons: 0
Banger Rating: 1484.0
Banger Comparisons: 1
The random variable $X$ has a continuous probability density
function $\f(x)$ given by
\begin{equation*}
\f(x) =
\begin{cases}
0 & \text{for } x \le 1 \\
\ln x & \text{for } 1\le x \le k\\
\ln k & \text{for } k\le x \le 2k\\
a-bx & \text{for } 2k \le x \le 4k \\
0 & \text{for } x\ge 4k
\end{cases}
\end{equation*}
where $k$, $a$ and $b$ are constants.
\begin{questionparts}
\item Sketch the graph of $y=\f(x)$.
\item Determine $a$ and $b$ in terms of $k$ and find the numerical
values of $k$, $a$ and $b$.
\item Find the median value of $X$.
\end{questionparts}
\begin{questionparts}
\item \begin{center}
\begin{tikzpicture}[scale=2]
\draw[->] (0,0) -- (6,0) node [right] {$x$};
\draw[->] (0,0) -- (0, 1.5) node [above] {$f(x)$};
\def\k{exp(1/3)};
\draw[domain = 1:\k, samples=100, variable = \x] plot ({\x}, {3*ln(\x)});
\draw ({\k}, {3*ln(\k)}) -- ({2*\k}, {3*ln(\k)});
\draw ({4*\k}, 0) -- ({2*\k}, {3*ln(\k)});
\draw[dashed] ({2*\k}, 0) -- ({2*\k}, {3*ln(\k)});
\draw[dashed] ({1*\k}, 0) -- ({1*\k}, {3*ln(\k)});
\node at (1,0) [below] {$1$};
\node at ({\k},0) [below] {$k$};
\node at ({2*\k},0) [below] {$2k$};
\node at ({1.5*\k},{3*ln(\k)}) [above] {$\ln(k)$};
\node at ({4*\k},0) [below] {$4k$};
\end{tikzpicture}
\end{center}
\item Since $f(x)$ is continuous, the line $y = a - bx$ joins $(2k, \ln k)$ and $(4k, 0)$, i.e.\ it has gradient $-\frac{\ln k}{2k}$ and is zero at $x = 4k$. Hence $\displaystyle b = \frac{\ln k}{2k}$ and $a = 4kb = 2\ln k$.
The three sections have areas $\int_1^k \ln x \d x = \left[x \ln x - x\right]_1^k = k \ln k - k + 1$, $k \ln k$ (the rectangle) and $\frac12 \cdot 2k \cdot \ln k = k \ln k$ (the triangle). Since the total area is $1$,
\begin{align*}
&&1&= 3k\ln k - k +1 \\
\Rightarrow &&0 &= k(3\ln k - 1) \\
\Rightarrow &&\ln k &= \frac13 \\
\Rightarrow &&k &= e^{1/3} \\
&& a &= 2\ln k = \frac23 \\
&& b&= \frac{\ln k}{2k} = \frac16e^{-1/3}
\end{align*}
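\item The area to the left of $x = k$ is $k \ln k - k + 1 = 1 - \frac23 k \approx 0.07 < \frac12$, and the area to the right of $x = 2k$ is $k \ln k = \frac13 k \approx 0.47 < \frac12$, therefore the median $M$ must lie between $k$ and $2k$.
So we need the area to the right of $M$, i.e.\ the triangle plus the part of the rectangle between $M$ and $2k$, to equal $\frac12$: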
\begin{align*}
&& \frac12 &= k \ln k + (2k-M) \ln k \\
&&&= \frac13 k + \frac13 (2k - M) \\
\Rightarrow && M &= 3k - \frac32 \\
\Rightarrow && M &= 3e^{1/3} - \frac32
\end{align*}
\end{questionparts}
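As a numerical sanity check on parts (ii) and (iii), the following Python sketch rebuilds the density from the values found above and confirms that it integrates to $1$ and that the area to the left of $M = 3k - \frac32$ is $\frac12$. The helper names `integrate` and `cdf` and the choice of a midpoint rule are illustrative assumptions, not part of the original solution.

```python
import math

# Constants from the worked solution: ln k = 1/3.
k = math.exp(1 / 3)
a = 2 * math.log(k)          # a = 2 ln k = 2/3
b = math.log(k) / (2 * k)    # b = ln k / (2k) = e^{-1/3} / 6

def pdf(x):
    """Piecewise density f(x) as defined in the question."""
    if x <= 1 or x >= 4 * k:
        return 0.0
    if x <= k:
        return math.log(x)
    if x <= 2 * k:
        return math.log(k)
    return a - b * x

def integrate(g, lo, hi, n=20000):
    """Composite midpoint rule (illustrative helper, not from the source)."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

def cdf(x):
    """P(X <= x), integrating piece by piece so each kink is a panel edge."""
    breaks = [1.0, k, 2 * k, 4 * k]
    total = 0.0
    for lo, hi in zip(breaks, breaks[1:]):
        if x <= lo:
            break
        total += integrate(pdf, lo, min(x, hi))
    return total

median = 3 * k - 1.5
total_area = cdf(4 * k)

print(f"k = {k:.4f}, a = {a:.4f}, b = {b:.4f}")
print(f"total area = {total_area:.6f}")
print(f"median = {median:.4f}, F(median) = {cdf(median):.6f}")
```

With these values the median comes out at roughly $2.687$, which lies between $k$ and $2k$ as the argument in part (iii) requires.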
This was the most popular of the Statistics questions, probably due to the high pure mathematical content. The sketch introduction was intended to ensure that candidates drew something which would remind them what integrals they should be working with later on. As with Q2, it presented more problems than should have been the case, with many candidates losing marks for fairly trivial things which would have cost them dearly even on an ordinary AS/A-level module paper. The integration for total probability was generally done very well, although several candidates failed to obtain $a$ and $b$ in terms of $k$ in a simplified form, or at all, and this rather hindered them. In (iii), most candidates didn't seem to feel that it was necessary to justify which region of the function the median lay in, often doing one calculation after making an assumption about the matter. In general, it is always best if candidates can justify their choices.