Problems


1997 Paper 3 Q12
D: 1700.0 B: 1500.0

  1. I toss a biased coin which has a probability \(p\) of landing heads and a probability \(q=1-p\) of landing tails. Let \(K\) be the number of tosses required to obtain the first head and let \[ \mathrm{G}(s)=\sum_{k=1}^{\infty}\mathrm{P}(K=k)s^{k}. \] Show that \[ \mathrm{G}(s)=\frac{ps}{1-qs} \] and hence find the expectation and variance of \(K\).
  2. I sample cards at random with replacement from a normal pack of \(52\). Let \(N\) be the total number of draws I make in order to sample every card at least once. By expressing \(N\) as a sum \(N=N_{1}+N_{2}+\cdots+N_{52}\) of random variables, or otherwise, find the expectation of \(N\). Estimate the numerical value of this expectation, using the approximations \(\mathrm{e}\approx2.7\) and \(1+\frac{1}{2}+\frac{1}{3}+\cdots+\frac{1}{n}\approx0.5+\ln n\) if \(n\) is large.


Solution:

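  1. [This part is omitted in the original solution; a minimal sketch.] Since \(\mathrm{P}(K=k)=q^{k-1}p\) for \(k \geq 1\), \begin{align*} \mathrm{G}(s) &= \sum_{k=1}^{\infty} q^{k-1}p\,s^{k} = ps\sum_{j=0}^{\infty}(qs)^{j} = \frac{ps}{1-qs} \\ \E[K] &= \mathrm{G}'(1) = \left.\frac{p}{(1-qs)^{2}}\right|_{s=1} = \frac{1}{p} \\ \var[K] &= \mathrm{G}''(1)+\mathrm{G}'(1)-\mathrm{G}'(1)^{2} = \frac{2q}{p^{2}}+\frac{1}{p}-\frac{1}{p^{2}} = \frac{q}{p^{2}} \end{align*}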
  2. Let \(N_i\) be the number of draws between the \((i-1)\)th new card and the \(i\)th new card. (Here \(N_1 = 1\), and \(N_i \sim K\) with \(p = \frac{53-i}{52}\), so \(\E[N_i] = \frac{52}{53-i}\).) Therefore \begin{align*} \E[N] &= \E[N_1 + \cdots + N_{52}] \\ &= \E[N_1] + \cdots + \E[N_i] + \cdots + \E[N_{52}] \\ &= 1 + \frac{52}{51} + \cdots + \frac{52}{53-i} + \cdots + \frac{52}{1} \\ &= 52 \left (1 + \frac{1}{2} + \cdots + \frac{1}{52} \right) \\ &\approx 52 \left ( 0.5 + \ln 52 \right) \end{align*} Notice that \(2.7 \times 2.7 = 7.29\) and \(7.3 \times 7.3 \approx 53.3\), so \(\mathrm{e}^4 \approx 52\) and hence \(\ln 52 \approx 4\); our estimate is therefore \(\approx 52 \times 4.5 = 234\). [The exact value is \(235.978\ldots\)]

1997 Paper 3 Q13
D: 1700.0 B: 1500.0

Let \(X\) and \(Y\) be independent standard normal random variables: the probability density function, \(\mathrm{f}\), of each is therefore given by \[ \mathrm{f}(x)=\left(2\pi\right)^{-\frac{1}{2}}\mathrm{e}^{-\frac{1}{2}x^{2}}. \]

  1. Find the moment generating function \(\mathrm{E}(\mathrm{e}^{\theta X})\) of \(X\).
  2. Find the moment generating function of \(aX+bY\) and hence obtain the condition on \(a\) and \(b\) which ensures that \(aX+bY\) has the same distribution as \(X\) and \(Y\).
  3. Let \(Z=\mathrm{e}^{\mu+\sigma X}\). Show that \[ \mathrm{E}(Z^{\theta})=\mathrm{e}^{\mu\theta+\frac{1}{2}\sigma^{2}\theta^{2}}, \] and hence find the expectation and variance of \(Z\).


Solution:

  1. \(\,\) \begin{align*} && \E[e^{\theta X}] &= \int_{-\infty}^{\infty} e^{\theta x} \frac{1}{\sqrt{2\pi}} e^{-\frac12 x^2 } \d x\\ &&&= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac12 x^2+\theta x} \d x\\ &&&= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac12 (x^2-2\theta x)} \d x\\ &&&= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac12 (x-\theta )^2+\frac12\theta^2 } \d x\\ &&&= e^{\frac12\theta^2 }\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac12 (x-\theta )^2 } \d x\\ &&&=e^{\frac12\theta^2 } \end{align*} (The remaining integral is \(1\) because the integrand is the probability density of a \(N(\theta, 1)\) random variable.)
  2. By independence of \(X\) and \(Y\), \begin{align*} && M_{aX+bY} (\theta) &= \mathbb{E}[e^{\theta (aX+bY)}] \\ &&&= \E[e^{(a\theta) X}] \cdot \E[e^{(b\theta) Y}] \\ &&&= e^{\frac12(a\theta)^2} \cdot e^{\frac12(b\theta)^2} \\ &&&= e^{\frac12(a^2+b^2)\theta^2} \end{align*} This matches the moment generating function of \(X\) exactly when \(a^2+b^2 = 1\) (e.g. \(a = b = \frac{1}{\sqrt{2}}\)), and since the moment generating function determines the distribution, this is the required condition.
  3. \(\,\) \begin{align*} && \E[Z^\theta] &= \E[e^{\mu \theta + \sigma \theta X}] \\ &&&= e^{\mu \theta}e^{\frac12 \sigma^2 \theta^2} \\ &&&=e^{\mu \theta + \frac12 \sigma^2 \theta^2} \\ \end{align*} \begin{align*} \mathbb{E}(Z) &= \mathbb{E}[Z^1] \\ &= e^{\mu + \frac12 \sigma^2} \\ \var[Z] &= \E[Z^2] - \left ( \E[Z] \right)^2 \\ &= e^{2 \mu+ 2\sigma^2} - e^{2\mu + \sigma^2} \\ &= e^{2\mu+\sigma^2} \left (e^{\sigma^2}-1 \right) \end{align*} [NB: This is the lognormal distribution]

1997 Paper 3 Q14
D: 1700.0 B: 1516.0

An industrial process produces rectangular plates of mean length \(\mu_{1}\) and mean breadth \(\mu_{2}\). The length and breadth vary independently with non-zero standard deviations \(\sigma_{1}\) and \(\sigma_{2}\) respectively. Find the means and standard deviations of the perimeter and of the area of the plates. Show that the perimeter and area are not independent.


Solution: Let \(L\) and \(B\) denote the length and breadth, which are independent with \(\E[L] = \mu_1\), \(\var[L] = \sigma_1^2\), \(\E[B] = \mu_2\), \(\var[B] = \sigma_2^2\) (no assumption about the shape of the distributions is needed), so \begin{align*} && \mathbb{E}(\text{perimeter}) &= \E(2(L+B)) \\ &&&= 2\E[L]+2\E[B] \\ &&&= 2(\mu_1+\mu_2) \\ &&\var[\text{perimeter}] &= \E\left [ (2(L+B))^2 \right] - \left ( \E[2(L+B)] \right)^2 \\ &&&= 4\E[L^2+2LB+B^2] - 4(\mu_1+\mu_2)^2 \\ &&&= 4(\sigma_1^2+\mu_1^2+2\mu_1\mu_2+\sigma_2^2+\mu_2^2) - 4(\mu_1+\mu_2)^2\\ &&&= 4(\sigma_1^2+\sigma_2^2) \\ &&\text{sd}[\text{perimeter}] &= 2\sqrt{\sigma_1^2+\sigma_2^2} \\ \\ && \E[\text{area}] &= \E[LB] \\ &&&= \E[L]\E[B] \\ &&&= \mu_1\mu_2 \\ && \var[\text{area}] &= \E[(LB)^2] - \left (\E[LB] \right)^2 \\ &&&= \E[L^2]\E[B^2]-\mu_1^2\mu_2^2 \\ &&&= (\mu_1^2+\sigma_1^2)(\mu_2^2+\sigma_2^2) -\mu_1^2\mu_2^2 \\ &&&= \sigma_1^2\mu_2^2 + \sigma_2^2\mu_1^2 + \sigma_1^2\sigma_2^2\\ && \text{sd}(\text{area}) &= \sqrt{\sigma_1^2\mu_2^2 + \sigma_2^2\mu_1^2 + \sigma_1^2\sigma_2^2} \\ \\ && \E[\text{perimeter} \cdot \text{area}] &= \E[2(L+B)LB] \\ &&&= 2\E[L^2]\E[B] + 2\E[L]\E[B^2] \\ &&&= 2(\sigma_1^2+\mu_1^2)\mu_2 + 2(\sigma_2^2+\mu_2^2)\mu_1 \\ && \E[\text{perimeter}] \E[\text{area}] &= 2(\mu_1+\mu_2) \cdot \mu_1\mu_2 \end{align*} If the perimeter and area were independent these last two quantities would be equal; as shown below they differ whenever the standard deviations are non-zero, so the perimeter and area cannot be independent. [See also STEP 2006 Paper 3 Q14]
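[Added remark, making the last step explicit: subtracting the two final expressions gives \[ \mathrm{Cov}(\text{perimeter}, \text{area}) = 2(\sigma_1^2+\mu_1^2)\mu_2 + 2(\sigma_2^2+\mu_2^2)\mu_1 - 2(\mu_1+\mu_2)\mu_1\mu_2 = 2\sigma_1^2\mu_2 + 2\sigma_2^2\mu_1, \] which is strictly positive since the standard deviations are non-zero and the mean dimensions are positive. The perimeter and area are in fact positively correlated.]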

1996 Paper 1 Q13
D: 1500.0 B: 1527.6

I have a Penny Black stamp which I want to sell to my friend Jim, but we cannot agree a price. So I put the stamp under one of two cups, jumble them up, and let Jim guess which one it is under. If he guesses correctly, I add a third cup, jumble them up, and let him guess again, and so on, adding another cup each time he guesses correctly. The price he pays for the stamp is \(\pounds N,\) where \(N\) is the number of cups present when Jim fails to guess correctly. Find \(\mathrm{P}(N=k)\). Show that \(\mathrm{E}(N)=\mathrm{e}\) and calculate \(\mathrm{Var}(N).\)


Solution: \begin{align*} && \mathbb{P}(N = k) &= \mathbb{P}(\text{guesses correctly with }2, \ldots, k-1\text{ cups, then wrongly with }k\text{ cups})\\ &&&= \frac12 \cdot \frac{1}{3} \cdots \frac{1}{k-1} \cdot \frac{k-1}{k} \\ &&&= \frac{k-1}{k!} \\ &&\mathbb{E}(N) &= \sum_{k=2}^\infty k \cdot \mathbb{P}(N=k) \\ &&&= \sum_{k=2}^{\infty} \frac{k(k-1)}{k!} \\ &&&= \sum_{k=0}^{\infty} \frac{1}{k!} = e \\ && \textrm{Var}(N) &= \mathbb{E}(N^2) - \mathbb{E}(N)^2 \\ && \mathbb{E}(N^2) &= \sum_{k=2}^{\infty} k^2 \mathbb{P}(N=k) \\ &&&= \sum_{k=2}^{\infty} \frac{k^2(k-1)}{k!} \\ &&&= \sum_{k=0}^{\infty} \frac{k+2}{k!} \\ &&&= \sum_{k=0}^{\infty} \frac{k}{k!} + 2 \sum_{k=0}^{\infty} \frac{1}{k!} = e + 2e = 3e \\ \Rightarrow && \textrm{Var}(N) &= 3e-e^2 \end{align*} (In the last line, \(\sum_{k=0}^{\infty} \frac{k}{k!} = \sum_{k=1}^{\infty} \frac{1}{(k-1)!} = e\).)
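[Added check: these probabilities sum to \(1\): since \(\frac{k-1}{k!} = \frac{1}{(k-1)!} - \frac{1}{k!}\), the sum \(\sum_{k=2}^{\infty} \mathbb{P}(N=k)\) telescopes to \(\frac{1}{1!} = 1\).]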

1996 Paper 2 Q12
D: 1600.0 B: 1500.0

  1. Let \(X_{1}, X_{2}, \dots, X_{n}\) be independent random variables each of which is uniformly distributed on \([0,1]\). Let \(Y\) be the largest of \(X_{1}, X_{2}, \dots, X_{n}\). By using the fact that \(Y<\lambda\) if and only if \(X_{j}<\lambda\) for \(1\leqslant j\leqslant n\), find the probability density function of \(Y\). Show that the variance of \(Y\) is \[\frac{n}{(n+2)(n+1)^{2}}.\]
  2. The probability that a neon light switched on at time \(0\) will have failed by a time \(t>0\) is \(1-\mathrm{e}^{-t/\lambda}\) where \(\lambda>0\). I switch on \(n\) independent neon lights at time zero. Show that the expected time until the first failure is \(\lambda/n\).


Solution:

  1. \(\,\) \begin{align*} && F_Y(\lambda) &= \mathbb{P}(Y < \lambda) \\ &&&= \prod_i \mathbb{P}(X_i < \lambda) \\ &&&= \lambda^n \\ \Rightarrow && f_Y(\lambda) &= \begin{cases} n \lambda^{n-1} & \text{if } 0 \leq \lambda \leq 1 \\ 0 & \text{otherwise} \end{cases} \\ \\ && \E[Y] &= \int_0^1 \lambda f_Y(\lambda) \d \lambda \\ &&&= \int_0^1 n \lambda^n \d \lambda \\ &&&= \frac{n}{n+1} \\ && \E[Y^2] &= \int_0^1 \lambda^2 f_Y(\lambda) \d \lambda \\ &&&= \int_0^1 n \lambda^{n+1} \d \lambda \\ &&&= \frac{n}{n+2} \\ \Rightarrow && \var[Y] &= \E[Y^2]-(\E[Y])^2 \\ &&&= \frac{n}{n+2} - \frac{n^2}{(n+1)^2} \\ &&&= \frac{(n+1)^2n-n^2(n+2)}{(n+2)(n+1)^2} \\ &&&= \frac{n[(n^2+2n+1)-(n^2+2n)]}{(n+2)(n+1)^2} \\ &&&= \frac{n}{(n+2)(n+1)^2} \end{align*}
  2. Let \(Z\) be the time of the first failure. Using the same reasoning, we can see that \begin{align*} && 1-F_Z(t) &= \mathbb{P}(\text{all lights still on after }t) \\ &&&= \prod_i e^{-t/\lambda} \\ &&&= e^{-nt/\lambda} \\ \\ \Rightarrow && F_Z(t) &= 1-e^{-nt/\lambda} \end{align*} Therefore \(Z \sim \mathrm{Exp}(\frac{n}{\lambda})\), and the expected time until the first failure is \(\lambda/n\), as spelled out below.
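[Added step, for completeness: for a non-negative random variable, \(\displaystyle \E[Z] = \int_0^\infty \mathbb{P}(Z > t) \d t = \int_0^\infty e^{-nt/\lambda} \d t = \frac{\lambda}{n}\).]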

1996 Paper 3 Q13
D: 1700.0 B: 1516.0

Let \(X\) be a random variable which takes only the finite number of different possible real values \(x_{1},x_{2},\ldots,x_{n}.\) Define the expectation \(\mathbb{E}(X)\) and the variance \(\var(X)\) of \(X\). Show that, if \(a\) and \(b\) are real numbers, then \(\E(aX+b)=a\E(X)+b\) and express \(\var(aX+b)\) similarly in terms of \(\var(X)\). Let \(\lambda\) be a positive real number. By considering the contribution to \(\var(X)\) of those \(x_{i}\) for which \(\left|x_{i}-\E(X)\right|\geqslant\lambda,\) or otherwise, show that \[ \mathrm{P}\left(\left|X-\E(X)\right|\geqslant\lambda\right)\leqslant\frac{\var(X)}{\lambda^{2}}\,. \] Let \(k\) be a real number satisfying \(k\geqslant\lambda.\) If \(\left|x_{i}-\E(X)\right|\leqslant k\) for all \(i\), show that \[ \mathrm{P}\left(\left|X-\E(X)\right|\geqslant\lambda\right)\geqslant\frac{\var(X)-\lambda^{2}}{k^{2}-\lambda^{2}}\,. \]


Solution: Definition: \(\displaystyle \mathbb{E}(X) = \sum_{i=1}^n x_i \mathbb{P}(X = x_i)\) Definition: \(\displaystyle \mathrm{Var}(X) = \sum_{i=1}^n (x_i-\mathbb{E}(X))^2 \mathbb{P}(X = x_i)\) Claim: \(\mathbb{E}(aX+b) = a\mathbb{E}(X)+b\) Proof: \begin{align*} \mathbb{E}(aX+b) &= \sum_{i=1}^n (ax_i+b) \mathbb{P}(X = x_i) \\ &= a\sum_{i=1}^n x_i \mathbb{P}(X = x_i) + b\sum_{i=1}^n \mathbb{P}(X = x_i)\\ &= a \mathbb{E}(X) + b \end{align*} Claim: \(\mathrm{Var}(aX+b) = a^2 \mathrm{Var}(X)\) Proof: \(\mathrm{Var}(aX+b) = \mathbb{E}\left[\left(aX+b-\mathbb{E}(aX+b)\right)^2\right] = \mathbb{E}\left[a^2\left(X-\mathbb{E}(X)\right)^2\right] = a^2 \mathrm{Var}(X)\) Claim: \(\mathrm{P}\left(\left|X-\mathrm{E}(X)\right|\geqslant\lambda\right)\leqslant\frac{\mathrm{var}(X)}{\lambda^{2}}\) Proof: \begin{align*} \mathrm{Var}(X) &= \sum_{i=1}^n (x_i-\mathbb{E}(X))^2 \mathbb{P}(X = x_i) \\ &\geq \sum_{|x_i - \mathbb{E}(X)| \geq \lambda} (x_i-\mathbb{E}(X))^2 \mathbb{P}(X = x_i) \\ &\geq \sum_{|x_i - \mathbb{E}(X)| \geq \lambda} \lambda^2 \mathbb{P}(X = x_i) \\ &= \lambda^2 \sum_{|x_i - \mathbb{E}(X)| \geq \lambda} \mathbb{P}(X = x_i) \\ &= \lambda^2 \mathrm{P}\left(\left|X-\mathrm{E}(X)\right|\geqslant\lambda\right) \end{align*} Claim: \[ \mathrm{P}\left(\left|X-\mathrm{E}(X)\right|\geqslant\lambda\right)\geqslant\frac{\mathrm{var}(X)-\lambda^{2}}{k^{2}-\lambda^{2}}\,. \] Proof: \begin{align*} && \mathrm{Var}(X) &= \sum_{i=1}^n (x_i-\mathbb{E}(X))^2 \mathbb{P}(X = x_i) \\ &&&= \sum_{|x_i - \mathbb{E}(X)| \geq \lambda} (x_i-\mathbb{E}(X))^2 \mathbb{P}(X = x_i) + \sum_{|x_i - \mathbb{E}(X)| < \lambda} (x_i-\mathbb{E}(X))^2 \mathbb{P}(X = x_i) \\ &&& \leq \sum_{|x_i - \mathbb{E}(X)| \geq \lambda} k^2 \mathbb{P}(X = x_i) + \sum_{|x_i - \mathbb{E}(X)| < \lambda} \lambda^2 \mathbb{P}(X = x_i) \\ &&&= k^2 \mathbb{P}\left(\left|X-\mathrm{E}(X)\right|\geqslant\lambda\right) + \lambda^2 \mathbb{P}\left(\left|X-\mathrm{E}(X)\right| < \lambda\right) \\ &&&= k^2 \mathbb{P}\left(\left|X-\mathrm{E}(X)\right|\geqslant\lambda\right) + \lambda^2\left(1- \mathbb{P}\left(\left|X-\mathrm{E}(X)\right| \geqslant \lambda\right)\right) \\ &&&= (k^2 - \lambda^2) \mathbb{P}\left(\left|X-\mathrm{E}(X)\right|\geqslant\lambda\right) + \lambda^2 \\ \Rightarrow&& \frac{\mathrm{Var}(X)-\lambda^2}{k^2 - \lambda^2} &\leq \mathbb{P}\left(\left|X-\mathrm{E}(X)\right|\geqslant\lambda\right) \end{align*} [Note: This result is known as Chebyshev's inequality, and is an important starting point for understanding the behaviour of the tails of random variables]
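[Added remark: the lower bound can be attained. Take \(X = \pm 1\) each with probability \(\frac12\), so \(\mathbb{E}(X) = 0\), \(\mathrm{Var}(X) = 1\) and \(|x_i - \mathbb{E}(X)| = 1 = k\) for all \(i\); for any \(0 < \lambda < 1\) the bound gives \(\mathbb{P}(|X - \mathbb{E}(X)| \geqslant \lambda) \geqslant \frac{1-\lambda^2}{1-\lambda^2} = 1\), which holds with equality.]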

1995 Paper 2 Q14
D: 1600.0 B: 1500.0

Suppose \(X\) is a random variable with probability density \[ \mathrm{f}(x)=Ax^{2}\exp(-x^{2}/2) \] for \(-\infty < x < \infty.\) Find \(A\). You belong to a group of scientists who believe that the outcome of a certain experiment is a random variable with the probability density just given, while other scientists believe that the probability density is the same except with different mean (i.e. the probability density is \(\mathrm{f}(x-\mu)\) with \(\mu\neq0\)). In each of the following two cases decide whether the result given would shake your faith in your hypothesis, and justify your answer.

  1. A single trial produces the result 87.3.
  2. 1000 independent trials produce results having a mean value \(0.23.\)
[Great weight will be placed on clear statements of your reasons and none on the mere repetition of standard tests, however sophisticated, if unsupported by argument. There are several possible approaches to this question. For some of them it is useful to know that if \(Z\) is normal with mean 0 and variance 1 then \(\mathrm{E}(Z^{4})=3.\)]


Solution: Let \(Z \sim N(0,1)\), with a pdf of \(f(x) = \frac{1}{\sqrt{2\pi}} \exp(-x^2/2)\) \begin{align*} && 1 &= \int_{-\infty}^\infty Ax^2 \exp(-x^2/2) \d x \\ &&&= A\sqrt{2\pi} \int_{-\infty}^\infty x^2 \frac{1}{\sqrt{2\pi}} \exp(-x^2/2) \d x \\ &&&= A\sqrt{2\pi} \E[Z^2] = A\sqrt{2\pi} \\ \Rightarrow && A &= \frac{1}{\sqrt{2\pi}} \end{align*}

  1. The probability of seeing a result as extreme as \(87.3\) is \begin{align*} \mathbb{P}(X > 87.3) &= \frac{1}{\sqrt{2\pi}}\int_{87.3}^{\infty} x^2 \exp(-x^2/2) \d x \\ &= \left [ -\frac{1}{\sqrt{2\pi}}x \exp(-x^2/2)\right]_{87.3}^{\infty}+\int_{87.3}^{\infty}\frac{1}{\sqrt{2\pi}} \exp(-x^2/2) \d x \\ &\approx 0 +(1- \Phi(87.3)) \\ &\approx 0 \end{align*} It is astronomically unlikely that this data point came from our distribution rather than from one with a much larger mean, so our faith should be severely shaken.
  2. With \(1000\) independent trials, the Central Limit Theorem applies to the sample mean \(S\). Each observation has mean \(0\) (the density is symmetric) and variance \(\E[X^2] = \int_{-\infty}^\infty x^4 \frac{1}{\sqrt{2\pi}} \exp(-x^2/2) \d x = \E[Z^4] = 3\), so \(S\) is approximately \(N(0, 3/1000)\). The probability of the sample mean being at least \(0.23\) is \begin{align*} && \mathbb{P}(S > 0.23) &= \mathbb{P}\left (Z > \frac{0.23}{\sqrt{3/1000}} \right) \\ &&&= \mathbb{P}\left (Z > \frac{0.23}{\sqrt{30}/100} \right) \\ &&&\approx \mathbb{P}\left (Z > \frac{0.23}{0.055} \right) \\ &&& \approx 0 \end{align*} A deviation of more than four standard deviations is extremely unlikely, so again our faith should be shaken.

1994 Paper 2 Q14
D: 1600.0 B: 1502.2

When Septimus Moneybags throws darts at a dart board they are certain to end on the board (a disc of radius \(a\)) but, it must be admitted, otherwise are uniformly randomly distributed over the board.

  1. Show that the distance \(R\) that his shot lands from the centre of the board is a random variable with variance \(a^{2}/18.\)
  2. At a charity fete he can buy \(m\) throws for \(\pounds(12+m)\), but he must choose \(m\) before he starts to throw. If at least one of his throws lands within \(a/\sqrt{10}\) of the centre he wins back \(\pounds 12\). In order to show what a good sport he is, he is determined to play but, being a careful man, he wishes to choose \(m\) so as to minimise his expected loss. What value of \(m\) should he choose?


Solution:

  1. \(\,\) \begin{align*} && \mathbb{P}(R < d) &= \frac{\pi d^2}{\pi a^2} \\ &&&= \frac{d^2}{a^2} \\ \Rightarrow && f_R(d) &= \frac{2d}{a^2}\\ \\ && \E[R] &= \int_0^a x \cdot f_R(x) \d x \\ &&&= \int_0^a \frac{2x^2}{a^2} \d x \\ &&&= \frac{2a}{3} \\ \\ && \E[R^2] &= \int_0^a x^2 \cdot f_R(x) \d x \\ &&&= \int_0^a \frac{2x^3}{a^2} \d x \\ &&&= \frac{a^2}{2} \\ \Rightarrow && \var[R] &= \frac{a^2}{2} - \frac{4a^2}{9} \\ &&&= \frac{a^2}{18} \end{align*}
  2. Let \(p = \mathbb{P}(R < \frac{a}{\sqrt{10}}) = \frac{a^2/10}{a^2} = \frac{1}{10}\) be the probability of hitting the target region on each throw. He fails to win back the \(\pounds 12\) only if all \(m\) throws miss, which has probability \((1-p)^m = \left(\frac{9}{10}\right)^m\), so his expected loss is \[(12+m)\left(\tfrac{9}{10}\right)^m + m\left(1-\left(\tfrac{9}{10}\right)^m\right) = 12\left(\tfrac{9}{10}\right)^m + m.\] \begin{array}{c|c} m & \text{expected loss} \\ \hline 0 & 12 \\ 1 & 10.8 + 1 = 11.8 \\ 2 & 9.72 + 2 = 11.72 \\ 3 & 8.748 + 3 \approx 11.75 \\ \end{array} Increasing \(m\) by one changes the expected loss by \(1 - \frac{12}{10}\left(\frac{9}{10}\right)^m\), which is positive for every \(m \geq 2\) (since \(1.2 \times 0.81 < 1\) and \(\left(\frac{9}{10}\right)^m\) is decreasing), so the expected loss only grows beyond \(m=2\). He should therefore choose \(m = 2\).

1989 Paper 3 Q15
D: 1700.0 B: 1503.8

The continuous random variable \(X\) is uniformly distributed over the interval \([-c,c].\) Write down expressions for the probabilities that:

  1. \(n\) independently selected values of \(X\) are all greater than \(k\),
  2. \(n\) independently selected values of \(X\) are all less than \(k\),
where \(k\) lies in \([-c,c]\). A sample of \(2n+1\) values of \(X\) is selected at random and \(Z\) is the median of the sample. Show that \(Z\) is distributed over \([-c,c]\) with probability density function \[ \frac{(2n+1)!}{(n!)^{2}(2c)^{2n+1}}(c^{2}-z^{2})^{n}. \] Deduce the value of \({\displaystyle \int_{-c}^{c}(c^{2}-z^{2})^{n}\,\mathrm{d}z.}\) Evaluate \(\mathrm{E}(Z)\) and \(\mathrm{var}(Z).\)


Solution:

  1. \begin{align*} \mathbb{P}(n\text{ independent values of }X > k) &= \prod_{i=1}^n \mathbb{P}(X > k) \\ &= \left ( \frac{c-k}{2c}\right)^n \end{align*}
  2. \begin{align*} \mathbb{P}(n\text{ independent values of }X < k) &= \prod_{i=1}^n \mathbb{P}(X < k) \\ &= \left ( \frac{k+c}{2c}\right)^n \end{align*}
\begin{align*} &&\mathbb{P}(z - \delta < \text{median} < z+\delta) &= \mathbb{P}(n\text{ values } < z - \delta,\ n \text{ values} > z + \delta \text{ and one value in } (z-\delta, z+\delta)) \\ &&&= \binom{2n+1}{n,n,1} \left ( \frac{c-(z+\delta)}{2c}\right)^n\left ( \frac{(z-\delta)+c}{2c}\right)^n \frac{2 \delta}{2 c} \\ &&&= \frac{(2n+1)!}{(n!)^2 (2c)^{2n+1}}((c-(z+\delta))(c+(z-\delta)))^n \, 2\delta \\ \Rightarrow && \lim_{\delta \to 0} \frac{\mathbb{P}(z - \delta < \text{median} < z+\delta)}{2 \delta} &= \frac{(2n+1)!}{(n!)^2 (2c)^{2n+1}}((c-z)(c+z))^n \\ &&&= \frac{(2n+1)!}{(n!)^2 (2c)^{2n+1}}(c^2-z^2)^n \\ \end{align*} \begin{align*} && 1 &= \int_{-c}^c \frac{(2n+1)!}{(n!)^2 (2c)^{2n+1}}(c^2-z^2)^n \d z \\ \Rightarrow && \frac{(n!)^2 (2c)^{2n+1}}{(2n+1)!} &= \int_{-c}^c (c^2-z^2)^n \d z \end{align*} \begin{align*} \mathbb{E}(Z) &= \int_{-c}^c z \frac{(2n+1)!}{(n!)^2 (2c)^{2n+1}}(c^2-z^2)^n \d z \\ &=\frac{(2n+1)!}{(n!)^2 (2c)^{2n+1}} \int_{-c}^c z (c^2-z^2)^n \d z \\ &= 0 \end{align*} (the integrand is odd). \begin{align*} \mathrm{Var}(Z) &= \mathbb{E}(Z^2) - \mathbb{E}(Z)^2 \\ &= \mathbb{E}(Z^2) \\ &= \int_{-c}^c z^2 \frac{(2n+1)!}{(n!)^2 (2c)^{2n+1}}(c^2-z^2)^n \d z \\ &=\frac{(2n+1)!}{(n!)^2 (2c)^{2n+1}} \int_{-c}^c z^2 (c^2-z^2)^n \d z \\ &=\frac{(2n+1)!}{(n!)^2 (2c)^{2n+1}} \left ( \left [ -\frac{1}{2(n+1)}z(c^2-z^2)^{n+1} \right]_{-c}^c + \frac{1}{2(n+1)}\int_{-c}^c (c^2-z^2)^{n+1} \d z \right) \\ &= \frac{(2n+1)!}{(n!)^2 (2c)^{2n+1}} \frac{1}{2(n+1)} \frac{((n+1)!)^2 (2c)^{2n+3}}{(2n+3)!} \\ &= \frac{(n+1)^2(2c)^2}{2(n+1)(2n+2)(2n+3)} \\ &= \frac{c^2}{2n+3} \end{align*}
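[Added check: with \(n = 0\) the sample is a single value, whose median is just \(X \sim U[-c,c]\); the formula gives \(\mathrm{Var}(Z) = \frac{c^2}{3} = \frac{(2c)^2}{12}\), the familiar variance of a uniform distribution on an interval of length \(2c\).]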

1988 Paper 3 Q16
D: 1700.0 B: 1610.5

Balls are chosen at random without replacement from an urn originally containing \(m\) red balls and \(M-m\) green balls. Find the probability that exactly \(k\) red balls will be chosen in \(n\) choices \((0\leqslant k\leqslant m,0\leqslant n\leqslant M).\) The random variables \(X_{i}\) \((i=1,2,\ldots,n)\) are defined for \(n\leqslant M\) by \[ X_{i}=\begin{cases} 0 & \mbox{ if the \(i\)th ball chosen is green}\\ 1 & \mbox{ if the \(i\)th ball chosen is red. } \end{cases} \] Show that

  1. \(\mathrm{P}(X_{i}=1)=\dfrac{m}{M}.\)
  2. \(\mathrm{P}(X_{i}=1\mbox{ and }X_{j}=1)=\dfrac{m(m-1)}{M(M-1)}\), for \(i\neq j\).
Find the mean and variance of the random variable \(X\) defined by \[ X=\sum_{i=1}^{n}X_{i}. \]


Solution: There are \(\displaystyle \binom{m}{k} \binom{M-m}{n-k}\) ways to choose \(k\) red and \(n-k\) green balls, out of a total of \(\displaystyle \binom{M}{n}\) equally likely ways to choose \(n\) balls. Therefore the probability is: \[ \mathbb{P}(\text{exactly }k\text{ red balls in }n\text{ choices}) = \frac{\binom{m}{k} \binom{M-m}{n-k}}{ \binom{M}{n}}\]
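[Added check: these probabilities sum to \(1\) over \(k\), by the Vandermonde identity \(\sum_{k} \binom{m}{k}\binom{M-m}{n-k} = \binom{M}{n}\).]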

  1. Note that there is nothing special about the \(i\)th ball chosen. (We could list all possible sequences of draws and look at the \(i\)th ball, or apply to each sequence the permutation that moves the \(i\)th ball to the front; either way we see identical collections of sequences, so the \(i\)th draw is distributed exactly like the first.) Therefore \(\mathbb{P}(X_i = 1) = \mathbb{P}(X_1 = 1) = \frac{m}{M}\).
  2. Similarly we could apply a permutation to all sequences which takes the \(i\)th ball to the first ball and the \(j\)th ball to the second ball, therefore: \begin{align*} \mathbb{P}(X_i = 1, X_j = 1) &= \mathbb{P}(X_1 = 1, X_2 = 1) \\ &= \mathbb{P}(X_1 = 1) \cdot \mathbb{P}(X_2 = 1 | X_1 = 1) \\ &= \frac{m}{M} \cdot \frac{m-1}{M-1} \\ &= \frac{m(m-1)}{M(M-1)} \end{align*}
So: \begin{align*} \mathbb{E}(X) &= \mathbb{E}(\sum_{i=1}^{n}X_{i}) \\ &= \sum_{i=1}^{n}\mathbb{E}(X_{i}) \\ &= \sum_{i=1}^{n} 1\cdot\mathbb{P}(X_i = 1) \\ &= \sum_{i=1}^{n} \frac{m}{M} \\ &= \frac{mn}{M} \end{align*} and \begin{align*} \mathbb{E}(X^2) &= \mathbb{E}\left[\left(\sum_{i=1}^{n}X_{i} \right)^2 \right] \\ &= \mathbb{E}\left[\sum_{i=1}^n X_i^2 + 2 \sum_{i < j} X_i X_j \right] \\ &= \sum_{i=1}^n \mathbb{E}(X_i^2) + 2 \sum_{i < j} \mathbb{E}(X_i X_j) \\ &= \frac{nm}{M} + n(n-1) \frac{m(m-1)}{M(M-1)} \\ \textrm{Var}(X) &= \mathbb{E}(X^2) - (\mathbb{E}(X))^2 \\ &= \frac{nm}{M} + n(n-1) \frac{m(m-1)}{M(M-1)} - \frac{n^2m^2}{M^2} \\ &= \frac{nm}{M} \left (1-\frac{nm}{M}+(n-1)\frac{m-1}{M-1} \right) \\ &= \frac{nm}{M} \left ( \frac{M(M-1)-(M-1)nm+(n-1)(m-1)M}{M(M-1)} \right) \\ &= \frac{nm}{M} \frac{(M-m)(M-n)}{M(M-1)} \\ &= n \frac{m}{M} \frac{M-m}{M} \frac{M-n}{M-1} \end{align*} [Note: this is a very nice way of deriving the mean and variance of the hypergeometric distribution.]
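[Added remark: compared with the binomial \(\mathrm{B}(n, \frac{m}{M})\) variance \(n\frac{m}{M}\left(1-\frac{m}{M}\right)\) for sampling with replacement, the variance above carries the extra factor \(\frac{M-n}{M-1} \leq 1\), the finite-population correction for sampling without replacement.]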