2014 Paper 2 Q13

Year: 2014
Paper: 2
Question Number: 13

Course: LFM Stats And Pure
Section: Uniform Distribution

Difficulty: 1600.0 Banger: 1469.5

Problem

A random number generator prints out a sequence of integers \(I_1, I_2, I_3, \dots\). Each integer is independently equally likely to be any one of \(1, 2, \dots, n\), where \(n\) is fixed. The random variable \(X\) takes the value \(r\), where \(I_r\) is the first integer which is a repeat of some earlier integer. Write down an expression for \(\mathbb{P}(X=4)\).
  1. Find an expression for \(\mathbb{P}(X=r)\), where \(2\le r\le n+1\). Hence show that, for any positive integer \(n\), \[ \frac 1n + \left(1-\frac1n\right) \frac 2 n + \left(1-\frac1n\right)\left(1-\frac2n\right) \frac3 n + \cdots \ = \ 1 \,. \]
  2. Write down an expression for \(\mathbb{E}(X)\). (You do not need to simplify it.)
  3. Write down an expression for \(\mathbb{P}(X\ge k)\).
  4. Show that, for any discrete random variable \(Y\) taking the values \(1, 2, \dots, N\), \[ \mathbb{E}(Y) = \sum_{k=1}^N \mathbb{P}(Y\ge k)\,. \] Hence show that, for any positive integer \(n\), \[ \left(1-\frac{1^2}n\right) + \left(1-\frac1n\right)\left(1-\frac{2^2}n\right) + \left(1-\frac1n\right)\left(1-\frac{2}n\right)\left(1-\frac{3^2}n\right) + \cdots \ = \ 0. \]

Solution

\begin{align*} && \mathbb{P}(X > 4) &= 1 \cdot \frac{n-1}{n} \cdot \frac{n-2}{n} \cdot \frac{n-3}{n} \\ && \mathbb{P}(X > 3) &= 1 \cdot \frac{n-1}{n} \cdot \frac{n-2}{n} \\ \Rightarrow && \mathbb{P}(X =4) &= \mathbb{P}(X > 3) - \mathbb{P}(X > 4) \\ &&&= \frac{(n-1)(n-2)}{n^2} \left (1 - \frac{n-3}{n} \right) \\ &&&= \frac{3(n-1)(n-2)}{n^3} \end{align*}
  1. Notice that \begin{align*} && \mathbb{P}(X > r) &= \frac{n-1}{n} \cdots \frac{n-r+1}{n} \\ \Rightarrow && \mathbb{P}(X = r) &= \frac{n-1}{n} \cdots \frac{n-r+2}{n} \left (1 - \frac{n-r+1}{n} \right) \\ &&&= \frac{(n-1)\cdots(n-r+2)(r-1)}{n^{r-1}} \\ &&&= \left (1 - \frac{1}n \right)\left (1 - \frac{2}{n} \right) \cdots \left (1 - \frac{r-2}{n} \right) \frac{r-1}{n} \\ \Rightarrow && 1 &= \sum \mathbb{P}(X = r) \\ &&&= \sum_{r=2}^{n+1} \mathbb{P}(X = r) \\ &&&= \frac 1n + \left(1-\frac1n\right) \frac 2 n + \left(1-\frac1n\right)\left(1-\frac2n\right) \frac3 n + \cdots \end{align*}
  2. \(\,\) \begin{align*} \mathbb{E}(X) &= \sum_{r=2}^{n+1} r\cdot\mathbb{P}(X = r) \\ &= \frac 2n + \left(1-\frac1n\right) \frac {2\cdot3} n + \left(1-\frac1n\right)\left(1-\frac2n\right) \frac{3\cdot4} n + \cdots \end{align*}
  3. \(\displaystyle \mathbb{P}(X \geq k) = \frac{n-1}{n} \cdots \frac{n-r+2}{n}\)
  4. \(\,\) \begin{align*} && \mathbb{E}(Y) &= \sum_{r=1}^N r \cdot \mathbb{P}(Y = r) \\ &&&= \sum_{r=1}^N \sum_{j=1}^r \mathbb{P}(Y = r) \\ &&&= \sum_{j=1}^N \sum_{r=j}^N \mathbb{P}(Y=r) \\ &&&= \sum_{j=1}^N \mathbb{P}(Y \geq j) \end{align*} Let \(P_k = \left(1-\frac1n\right)\left(1-\frac2n\right) \cdots \left(1-\frac1n\right)\left(1-\frac{k}n\right) \) \begin{align*} && \mathbb{E}(X) &= P_1 \frac{1 \cdot 2 }{n} + P_2 \cdot \frac{2 \cdot 3}{n} + \cdots + P_k \cdot \frac{k(k+1)}{n} + \cdots \\ && &= \sum_{k=1}^{n} \frac{k^2}{n}P_k + \sum_{k=1}^{n} \frac{k}{n}P_k \\ && \text{Using the identity } & \frac{k}{n}P_k = \frac{k}{n} \prod_{i=1}^{k-1} \left(1-\frac{i}{n}\right) = P_k - P_{k+1}: \\ && \sum_{k=1}^{n} \frac{k}{n}P_k &= (P_1 - P_2) + (P_2 - P_3) + \cdots + (P_n - P_{n+1}) \\ && &= P_1 - P_{n+1} = 1 - 0 = 1 \\ \\ \Rightarrow && \mathbb{E}(X) &= \sum_{k=1}^{n} \frac{k^2}{n}P_k + 1 \\ && &= \mathbb{P}(X \geq 1) + \mathbb{P}(X \geq 2) + \mathbb{P}(X \geq 3) + \cdots \\ && &= 1 + P_1 + P_2 + P_3 + \cdots \\ && &= 1 + \sum_{k=1}^{n} P_k \\ \\ \Rightarrow && 1 + \sum_{k=1}^{n} P_k &= \sum_{k=1}^{n} \frac{k^2}{n}P_k + 1 \\ \Rightarrow && \sum_{k=1}^{n} P_k &= \sum_{k=1}^{n} \frac{k^2}{n}P_k \\ \Rightarrow && 0 &= \sum_{k=1}^{n} P_k \left( 1 - \frac{k^2}{n} \right) \end{align*}
Examiner's report
— 2014 STEP 2, Question 13

This was the more popular of the two questions on Probability and Statistics, but as in previous years it still only attracted answers from a very small number of candidates. The average mark for this question was also quite low, often due to a difficulty in explaining the reasoning behind some of the parts of the question. Many candidates were able to find the expression for P(X ≤ 4) and most were then able to obtain the general formula required in part (i) of the question, although a number of candidates did not include the correct number of factors in the answer. Parts (ii) and (iii) did not cause too much difficulty, but the final part required a clear explanation to gain full marks.

There were good solutions presented to all of the questions, although there was generally less success in those questions that required explanations of results or the use of diagrams and graphs to reach the solution. Algebraic manipulation was generally well done by many of the candidates although a range of common errors such as confusing differentiation and integration and simple arithmetic slips were evident. Candidates should also be advised to use the methods that are asked for in questions unless it is clear that other methods will be accepted (such as by the use of the phrase "or otherwise").

Source: Cambridge STEP 2014 Examiner's Report · 2014-full.pdf
Rating Information

Difficulty Rating: 1600.0

Difficulty Comparisons: 0

Banger Rating: 1469.5

Banger Comparisons: 2

Show LaTeX source
Problem source
A random number generator prints out a sequence of integers $I_1, I_2, I_3, \dots$. Each integer is independently equally likely to be any one of $1, 2, \dots, n$, where $n$ is fixed.  The random variable $X$ takes the value $r$, where $I_r$ is the first integer which is a repeat of some earlier integer. Write down an expression for $\mathbb{P}(X=4)$.
  \begin{questionparts}
  \item Find an expression for $\mathbb{P}(X=r)$, where $2\le r\le n+1$.  Hence show that, for any positive integer $n$,
    \[
    \frac 1n + \left(1-\frac1n\right) \frac 2 n +
    \left(1-\frac1n\right)\left(1-\frac2n\right) \frac3 n + \cdots  \ =  \ 1
    \,.
    \]
  \item Write down an expression for $\mathbb{E}(X)$.  (You do not need to simplify it.)
  \item Write down an expression for $\mathbb{P}(X\ge k)$.
  \item Show that, for any discrete random variable $Y$ taking the values $1, 2, \dots, N$,
    \[
    \mathbb{E}(Y) = \sum_{k=1}^N \mathbb{P}(Y\ge k)\,.
    \]
    Hence show that, for any positive integer $n$,
    \[
    \left(1-\frac{1^2}n\right) +
    \left(1-\frac1n\right)\left(1-\frac{2^2}n\right) +
    \left(1-\frac1n\right)\left(1-\frac{2}n\right)\left(1-\frac{3^2}n\right)
    + \cdots \ = \ 0.
    \]
  \end{questionparts}
Solution source
\begin{align*}
&& \mathbb{P}(X > 4) &= 1 \cdot \frac{n-1}{n} \cdot \frac{n-2}{n} \cdot \frac{n-3}{n} \\
&& \mathbb{P}(X > 3) &= 1 \cdot \frac{n-1}{n} \cdot \frac{n-2}{n}  \\
\Rightarrow && \mathbb{P}(X =4) &= \mathbb{P}(X > 3) - \mathbb{P}(X > 4) \\
&&&= \frac{(n-1)(n-2)}{n^2} \left (1 - \frac{n-3}{n} \right) \\
&&&= \frac{3(n-1)(n-2)}{n^3}
\end{align*}

\begin{questionparts}
\item Notice that 
\begin{align*}
&& \mathbb{P}(X > r) &= \frac{n-1}{n} \cdots \frac{n-r+1}{n} \\
\Rightarrow && \mathbb{P}(X = r) &= \frac{n-1}{n} \cdots \frac{n-r+2}{n} \left (1 - \frac{n-r+1}{n} \right) \\
&&&= \frac{(n-1)\cdots(n-r+2)(r-1)}{n^{r-1}} \\
&&&= \left (1 - \frac{1}n \right)\left (1 - \frac{2}{n} \right) \cdots \left (1 - \frac{r-2}{n} \right) \frac{r-1}{n} \\
\Rightarrow && 1 &= \sum \mathbb{P}(X = r) \\
&&&= \sum_{r=2}^{n+1} \mathbb{P}(X = r) \\
&&&=    \frac 1n + \left(1-\frac1n\right) \frac 2 n +
    \left(1-\frac1n\right)\left(1-\frac2n\right) \frac3 n + \cdots
\end{align*}

\item $\,$ \begin{align*}
\mathbb{E}(X) &= \sum_{r=2}^{n+1} r\cdot\mathbb{P}(X = r) \\
&=  \frac 2n + \left(1-\frac1n\right) \frac {2\cdot3} n +
    \left(1-\frac1n\right)\left(1-\frac2n\right) \frac{3\cdot4} n + \cdots
\end{align*}

\item $\displaystyle \mathbb{P}(X \geq k)  = \frac{n-1}{n} \cdots \frac{n-r+2}{n}$

\item $\,$ \begin{align*}
&& \mathbb{E}(Y) &= \sum_{r=1}^N r \cdot \mathbb{P}(Y = r) \\
&&&= \sum_{r=1}^N \sum_{j=1}^r \mathbb{P}(Y = r) \\
&&&= \sum_{j=1}^N \sum_{r=j}^N \mathbb{P}(Y=r) \\
&&&=  \sum_{j=1}^N \mathbb{P}(Y \geq j)
\end{align*}

Let $P_k = \left(1-\frac1n\right)\left(1-\frac2n\right) \cdots \left(1-\frac1n\right)\left(1-\frac{k}n\right) $

\begin{align*}
&& \mathbb{E}(X) &= P_1 \frac{1 \cdot 2 }{n} + P_2 \cdot \frac{2 \cdot 3}{n} + \cdots + P_k \cdot \frac{k(k+1)}{n} + \cdots \\
&& &= \sum_{k=1}^{n} \frac{k^2}{n}P_k + \sum_{k=1}^{n} \frac{k}{n}P_k \\
&& \text{Using the identity } & \frac{k}{n}P_k = \frac{k}{n} \prod_{i=1}^{k-1} \left(1-\frac{i}{n}\right) = P_k - P_{k+1}: \\
&& \sum_{k=1}^{n} \frac{k}{n}P_k &= (P_1 - P_2) + (P_2 - P_3) + \cdots + (P_n - P_{n+1}) \\
&& &= P_1 - P_{n+1} = 1 - 0 = 1 \\
\\
\Rightarrow && \mathbb{E}(X) &= \sum_{k=1}^{n} \frac{k^2}{n}P_k + 1 \\
&& &= \mathbb{P}(X \geq 1) + \mathbb{P}(X \geq 2) + \mathbb{P}(X \geq 3) + \cdots \\
&& &= 1 + P_1 + P_2 + P_3 + \cdots \\
&& &= 1 + \sum_{k=1}^{n} P_k \\
\\
\Rightarrow && 1 + \sum_{k=1}^{n} P_k &= \sum_{k=1}^{n} \frac{k^2}{n}P_k + 1 \\
\Rightarrow && \sum_{k=1}^{n} P_k &= \sum_{k=1}^{n} \frac{k^2}{n}P_k \\
\Rightarrow && 0 &= \sum_{k=1}^{n} P_k \left( 1 - \frac{k^2}{n} \right)
\end{align*}
\end{questionparts}