Year: 1988
Paper: 3
Question Number: 16
Course: LFM Stats And Pure
Section: Hypergeometric Distribution
Difficulty Rating: 1700.0
Difficulty Comparisons: 0
Banger Rating: 1610.5
Banger Comparisons: 11
Balls are chosen at random without replacement from an urn originally containing $m$ red balls and $M-m$ green balls. Find the probability that exactly $k$ red balls will be chosen in $n$ choices $(0\leqslant k\leqslant m,0\leqslant n\leqslant M).$
The random variables $X_{i}$ $(i=1,2,\ldots,n)$ are defined for $n\leqslant M$ by
\[
X_{i}=\begin{cases}
0 & \mbox{ if the $i$th ball chosen is green}\\
1 & \mbox{ if the $i$th ball chosen is red. }
\end{cases}
\]
Show that
\begin{questionparts}
\item $\mathrm{P}(X_{i}=1)=\dfrac{m}{M}.$
\item $\mathrm{P}(X_{i}=1\mbox{ and }X_{j}=1)=\dfrac{m(m-1)}{M(M-1)}$,
for $i\neq j$.
\end{questionparts}
Find the mean and variance of the random variable $X$ defined by
\[
X=\sum_{i=1}^{n}X_{i}.
\]
There are $\displaystyle \binom{m}{k} \binom{M-m}{n-k}$ ways to choose $k$ red and $n-k$ green balls, out of a total of $\displaystyle \binom{M}{n}$ equally likely ways to choose $n$ balls. Therefore the probability is:
\[ \mathbb{P}(\text{exactly }k\text{ red balls in }n\text{ choices}) = \frac{\binom{m}{k} \binom{M-m}{n-k}}{ \binom{M}{n}}\]
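As a quick numerical sanity check (not part of the original solution), this pmf can be evaluated with Python's \texttt{math.comb}; summing over all valid $k$ should give $1$, by Vandermonde's identity. The parameter values below are illustrative, not from the question:

```python
from math import comb

def hypergeom_pmf(k, n, m, M):
    """P(exactly k red balls in n draws) from an urn with m red and M - m green balls."""
    return comb(m, k) * comb(M - m, n - k) / comb(M, n)

# Example: M = 10 balls, m = 4 red, n = 5 draws.
M, m, n = 10, 4, 5
probs = [hypergeom_pmf(k, n, m, M) for k in range(0, min(m, n) + 1)]
print(sum(probs))  # approximately 1, by Vandermonde's identity
```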
\begin{questionparts}
\item Note that there is nothing special about the $i$th ball chosen: we could consider all possible sequences of draws and look at the $i$th ball, or apply a permutation to each sequence that moves the $i$th ball to the first position; either way we see identically distributed sequences. Therefore $\mathbb{P}(X_i = 1) = \mathbb{P}(X_1 = 1) = \frac{m}{M}$.
\item Similarly, we can apply a permutation to each sequence that takes the $i$th ball to the first position and the $j$th ball to the second, so:
\begin{align*}
\mathbb{P}(X_i = 1, X_j = 1) &= \mathbb{P}(X_1 = 1, X_2 = 1) \\
&= \mathbb{P}(X_1 = 1) \cdot \mathbb{P}(X_2 = 1 | X_1 = 1) \\
&= \frac{m}{M} \cdot \frac{m-1}{M-1} \\
&= \frac{m(m-1)}{M(M-1)}
\end{align*}
\end{questionparts}
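The symmetry argument in parts (i) and (ii) can be confirmed by brute force on a small urn (a check added here for illustration, with hypothetical parameters): enumerate all orderings of the balls and count, using exact rational arithmetic.

```python
from itertools import permutations
from fractions import Fraction

# Small urn: m = 3 red balls (1) and M - m = 2 green balls (0);
# all M! orderings of the distinct ball labels are equally likely.
m, M = 3, 5
balls = [1] * m + [0] * (M - m)
seqs = list(permutations(range(M)))

def p_red(i):
    """P(X_i = 1): fraction of orderings with a red ball in position i."""
    return Fraction(sum(balls[s[i]] for s in seqs), len(seqs))

def p_red_pair(i, j):
    """P(X_i = 1 and X_j = 1): red balls in both positions i and j."""
    return Fraction(sum(balls[s[i]] * balls[s[j]] for s in seqs), len(seqs))

# Every position gives m/M, and every pair gives m(m-1)/(M(M-1)).
assert all(p_red(i) == Fraction(m, M) for i in range(M))
assert all(p_red_pair(i, j) == Fraction(m * (m - 1), M * (M - 1))
           for i in range(M) for j in range(M) if i != j)
print(p_red(0), p_red_pair(0, 1))
```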
So:
\begin{align*}
\mathbb{E}(X) &= \mathbb{E}\left(\sum_{i=1}^{n}X_{i}\right) \\
&= \sum_{i=1}^{n}\mathbb{E}(X_{i}) \\
&= \sum_{i=1}^{n} 1\cdot\mathbb{P}(X_i = 1) \\
&= \sum_{i=1}^{n} \frac{m}{M} \\
&= \frac{mn}{M}
\end{align*}
and
\begin{align*}
\mathbb{E}(X^2) &= \mathbb{E}\left[\left(\sum_{i=1}^{n}X_{i} \right)^2 \right] \\
&= \mathbb{E}\left[\sum_{i=1}^n X_i^2 + 2 \sum_{i < j} X_i X_j \right] \\
&= \sum_{i=1}^n \mathbb{E}(X_i^2) + 2 \sum_{i < j} \mathbb{E}(X_i X_j) \\
&= \sum_{i=1}^n \mathbb{E}(X_i) + 2 \binom{n}{2} \mathbb{P}(X_1 = 1, X_2 = 1) \\
&= \frac{nm}{M} + n(n-1) \frac{m(m-1)}{M(M-1)} \\
\mathrm{Var}(X) &= \mathbb{E}(X^2) - (\mathbb{E}(X))^2 \\
&= \frac{nm}{M} + n(n-1) \frac{m(m-1)}{M(M-1)} - \frac{n^2m^2}{M^2} \\
&= \frac{nm}{M} \left (1-\frac{nm}{M}+(n-1)\frac{m-1}{M-1} \right) \\
&= \frac{nm}{M} \left ( \frac{M(M-1)-(M-1)nm+(n-1)(m-1)M}{M(M-1)} \right) \\
&= \frac{nm}{M} \frac{(M-m)(M-n)}{M(M-1)} \\
&= n \frac{m}{M} \frac{M-m}{M} \frac{M-n}{M-1}
\end{align*}
Note: this indicator-variable argument is a very nice way of deriving the mean and variance of the hypergeometric distribution, since it avoids summing over the pmf directly.
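To close the loop (again a check added for illustration, with made-up parameters), the closed forms $\mathbb{E}(X) = \frac{nm}{M}$ and $\mathrm{Var}(X) = n\,\frac{m}{M}\,\frac{M-m}{M}\,\frac{M-n}{M-1}$ can be compared against moments computed directly from the pmf:

```python
from math import comb

# Compare the derived mean and variance with moments of the exact pmf
# for one illustrative parameter choice: M = 12 balls, m = 5 red, n = 7 draws.
M, m, n = 12, 5, 7
pmf = {k: comb(m, k) * comb(M - m, n - k) / comb(M, n)
       for k in range(max(0, n - (M - m)), min(m, n) + 1)}
mean = sum(k * p for k, p in pmf.items())
var = sum(k * k * p for k, p in pmf.items()) - mean ** 2
print(mean, n * m / M)                                        # nm/M
print(var, n * (m / M) * ((M - m) / M) * ((M - n) / (M - 1)))  # closed form
```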