The point \(P\) lies on the circumference of a circle of unit radius and centre \(O\). The angle, \(\theta\), between \(OP\) and the positive \(x\)-axis is a random variable, uniformly distributed on the interval \(0\le\theta<2\pi\).
The cartesian coordinates of \(P\) with respect to \(O\) are \((X,Y)\).
Find the probability density function for \(X\), and calculate \(\var (X)\).
Show that \(X\) and \(Y\) are uncorrelated and discuss briefly whether they are independent.
The points \(P_i\) (\(i=1\), \(2\), \(\ldots\) , \(n\)) are chosen independently on the circumference of the circle, as in part (i), and have cartesian coordinates \((X_i, Y_i)\).
The point \(\overline P\) has coordinates \((\overline X, \overline Y)\), where \(\overline X =\dfrac1n \sum\limits _{i=1}^n X_i\) and \(\overline Y =\dfrac1n \sum\limits _{i=1}^n Y_i\).
Show that \(\overline X\) and \(\overline Y\) are uncorrelated.
Show that, for large \(n\), \(\displaystyle \P\left(\vert \overline X \vert \le \sqrt{\frac2n}\right)\approx 0.95\,\).
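A quick Monte Carlo check of the last result (a sketch; the sample size \(n\) and trial count below are arbitrary choices): since \(X=\cos\theta\) has mean \(0\) and variance \(\frac12\), \(\overline X\) is approximately \(N(0, 1/(2n))\), and \(\sqrt{2/n}\) is two standard deviations.

```python
import math
import random

random.seed(0)

n = 200          # points per sample (arbitrary, large enough for the CLT)
trials = 5000    # Monte Carlo repetitions (arbitrary)

hits = 0
for _ in range(trials):
    # mean of n independent X_i = cos(theta_i), theta_i uniform on [0, 2*pi)
    xbar = sum(math.cos(random.uniform(0.0, 2.0 * math.pi)) for _ in range(n)) / n
    if abs(xbar) <= math.sqrt(2.0 / n):
        hits += 1

prob = hits / trials
print(prob)   # close to 0.95 (the two-sigma probability 0.9545)
```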
Two coins \(A\) and \(B\) are tossed together. \(A\) has
probability \(p\) of showing a head, and \(B\) has probability \(2p\), independent of \(A\),
of showing a head,
where \(0 < p < \frac12\).
The random variable \(X\) takes the value 1 if \(A\)
shows a head and it takes the value \(0\) if \(A\) shows a tail.
The random variable \(Y\) takes the value 1 if \(B\)
shows a head and it takes the value \(0\) if \(B\) shows a tail.
The random variable \(T\) is defined by
\[
T= \lambda X + {\textstyle\frac12} (1-\lambda)Y.
\]
Show that \(\E(T)=p\) and find an expression for \(\var(T)\) in terms of \(p\) and \(\lambda\).
Show that as \(\lambda\) varies, the minimum of \(\var(T)\) occurs when
\[
\lambda =\frac{1-2p}{3-4p}\;.
\]
The two coins are tossed \(n\) times, where \(n>30\), and \(\overline{T}\) is the mean value of \(T\).
Let \(b\) be a fixed positive number. Show that the maximum value of
\(\P\big(\vert \overline{T}-p\vert < b\big)\) as \(\lambda\) varies is approximately \(2\Phi(b/s)-1\),
where \(\Phi\) is the cumulative distribution function of a standard normal variate and
\[
s^2= \frac{p(1-p)(1-2p)}{(3-4p)n}\;.
\]
A random variable \(X\) is distributed uniformly on \([\, 0\, , \, a\,]\).
Show that the variance of \(X\) is \({1 \over 12} a^2\).
A sample, \(X_1\) and \(X_2\), of two independent values of the random variable is drawn, and the variance \(V\) of the sample is determined. Show that \(V = {1 \over 4} \left( X_1 -X_2 \right)^2\), and hence prove that \(2V\) is an unbiased estimator of the variance of \(X\).
Find an exact expression for the probability that the value of \(V\) is less than \({1 \over 12} a^2\) and estimate the value of this probability correct to one significant figure.
We need \(V < {1 \over 12} a^2\), that is \({1 \over 4}(X_1-X_2)^2 < {1 \over 12} a^2\), i.e. \(\vert X_1 - X_2 \vert < \frac{a}{\sqrt{3}}\).
Treating \((X_1, X_2)\) as a point uniformly distributed over the square \([0,a]^2\), we want the area of the band \(\vert x_1 - x_2 \vert < a/\sqrt{3}\). The two excluded corner triangles together have area \(a^2\left(1-\frac{1}{\sqrt{3}}\right)^2\), so the band has area \(a^2 - a^2\left(1- \frac{1}{\sqrt{3}}\right)^2 = a^2 \left( \frac{2}{\sqrt{3}} - \frac13 \right)\), i.e. the probability is \(\frac{2\sqrt{3}-1}{3} \approx 0.8\).
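A simulation confirming this value (a sketch; \(a\) and the trial count are arbitrary, and the probability does not depend on \(a\)):

```python
import math
import random

random.seed(1)

a = 1.0
trials = 100000

hits = 0
for _ in range(trials):
    x1, x2 = random.uniform(0, a), random.uniform(0, a)
    v = 0.25 * (x1 - x2)**2        # sample variance of the pair
    if v < a**2 / 12:
        hits += 1

estimate = hits / trials
exact = (2 * math.sqrt(3) - 1) / 3
print(estimate, exact)             # both about 0.82
```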
The random variables \(X_1\), \(X_2\), \(\ldots\) , \(X_{2n+1}\) are
independently and uniformly distributed on the interval
\(0 \le x \le 1\). The random variable \(Y\) is defined to be the
median of \(X_1\), \(X_2\), \(\ldots\) , \(X_{2n+1}\).
Given that the probability density function of \(Y\) is \(\mathrm{g}(y)\), where
\[
\mathrm{g}(y)=\begin{cases}
ky^{n}(1-y)^{n} & \mbox{ if }0\leqslant y\leqslant1\\
0 & \mbox{ otherwise}
\end{cases}
\]
use the result
$$
\int_0^1 {y^{r}}{{(1-y)}^{s}}\,\d y =
\frac{r!s!}{(r+s+1)!}
$$
to show that \(k={(2n+1)!}/{{(n!)}^2}\), and evaluate
\(\E(Y)\) and \({\rm Var}\,(Y)\).
Hence show that,
for any given positive number \(d\), the inequality
$$
{\P\left({\vert {Y - 1/2} \vert} < {d/{\sqrt {n}}} \right)} <
{\P\left({\vert {{\bar X} - 1/2} \vert} < {d/{\sqrt {n}}} \right)}
$$
holds provided \(n\) is large enough, where
\({\bar X}\) is the mean of \(X_1\), \(X_2\), \(\ldots\) , \(X_{2n+1}\).
[You may assume that \(Y\) and \(\bar X\) are normally distributed
for large \(n\).]
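A simulation comparing the two estimators of \(\frac12\) (a sketch; \(n\) and the trial count are arbitrary choices). The given integral yields \(\E(Y)=\frac12\) and \(\var(Y)=\frac{1}{4(2n+3)}\), while \(\var(\bar X)=\frac{1}{12(2n+1)}\), so for large \(n\) the median has roughly three times the variance of the mean:

```python
import random
import statistics

random.seed(2)

n = 10                 # each sample has 2n + 1 = 21 values (arbitrary)
trials = 20000

medians, means = [], []
for _ in range(trials):
    xs = [random.random() for _ in range(2 * n + 1)]
    medians.append(statistics.median(xs))
    means.append(statistics.fmean(xs))

var_median = statistics.pvariance(medians)
var_mean = statistics.pvariance(means)
print(var_median, var_mean)   # about 1/92 and 1/252: the mean is the sharper estimator
```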
A hostile naval power possesses a large, unknown number \(N\) of
submarines. Interception of radio signals yields a small number \(n\)
of their identification numbers \(X_i\) (\(i=1,2,...,n\)), which are taken
to be independent and uniformly distributed over the continuous range
from \(0\) to \(N\). Show that \(Z_1\) and \(Z_2\), defined by
$$
Z_1 = {n+1\over n} {\max}\{X_1,X_2,...,X_n\}
\hspace{0.3in} {\rm and} \hspace{0.3in}
Z_2 = {2\over n} \sum_{i=1}^n X_i \;,
$$
both have means equal to \(N\).
Calculate the variance of \(Z_1\) and of \(Z_2\). Which estimator
do you prefer, and why?
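A simulation of the two estimators (a sketch; \(N\), \(n\) and the trial count are arbitrary choices). The calculation gives \(\var(Z_1)=\frac{N^2}{n(n+2)}\) and \(\var(Z_2)=\frac{N^2}{3n}\), so for \(n \ge 2\) the estimator \(Z_1\) has the smaller variance:

```python
import random
import statistics

random.seed(3)

N = 1000.0     # true number of submarines (for the simulation only)
n = 5          # intercepted identification numbers (arbitrary)
trials = 50000

z1s, z2s = [], []
for _ in range(trials):
    xs = [random.uniform(0, N) for _ in range(n)]
    z1s.append((n + 1) / n * max(xs))
    z2s.append(2 / n * sum(xs))

mean1, mean2 = statistics.fmean(z1s), statistics.fmean(z2s)
var1, var2 = statistics.pvariance(z1s), statistics.pvariance(z2s)
print(mean1, mean2)   # both close to N = 1000: both estimators are unbiased
print(var1, var2)     # roughly N^2/(n(n+2)) and N^2/(3n): Z_1 is more precise
```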
An experiment produces a random number \(T\) uniformly distributed on \([0,1]\). Let \(X\) be the larger root of the equation
\[x^{2}+2x+T=0.\]
What is the probability that \(X>-1/3\)? Find \(\mathbb{E}(X)\) and show that \(\mathrm{Var}(X)=1/18\). The experiment is repeated independently 800 times generating the larger roots \(X_{1}, X_{2}, \dots, X_{800}\). If
\[Y=X_{1}+X_{2}+\dots+X_{800},\]
find an approximate value for \(K\) such that
\[\mathrm{P}(Y\leqslant K)=0.08.\]
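A numerical check of the final part (a sketch, using \(\E(X)=-\frac13\) and \(\var(X)=\frac{1}{18}\) together with the normal approximation to \(Y\)):

```python
from statistics import NormalDist

n = 800
mu = -n / 3            # E(Y) = 800 * E(X) = -800/3
var = n / 18           # Var(Y) = 800 * Var(X) = 800/18
K = NormalDist(mu, var**0.5).inv_cdf(0.08)
print(round(K, 1))     # about -276
```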
The random variable \(X\) is
uniformly distributed on \([0,1]\). A new random variable
\(Y\) is defined by the rule
\[
Y=\begin{cases}
1/4 & \mbox{ if }X\leqslant1/4,\\
X & \mbox{ if }1/4\leqslant X\leqslant3/4,\\
3/4 & \mbox{ if }X\geqslant3/4.
\end{cases}
\]
Find \({\mathrm E}(Y^{n})\) for all integers \(n\geqslant 1\).
Show that \({\mathrm E}(Y)={\mathrm E}(X)\) and that
\[{\mathrm E}(X^{2})-{\mathrm E}(Y^{2})=\frac{1}{24}.\]
By using the fact that \(4^{n}=(3+1)^{n}\), or otherwise,
show that \({\mathrm E}(X^{n}) > {\mathrm E}(Y^{n})\) for \(n\geqslant 2\).
Suppose that \(Y_{1}\), \(Y_{2}\), \dots are independent random variables
each having the same distribution as \(Y\).
Find, to a good approximation, \(K\) such that
\[{\rm P}(Y_{1}+Y_{2}+\cdots+Y_{240000} < K)=3/4.\]
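The last part numerically (a sketch): \(\E(Y)=\frac12\) and \(\var(Y)=\E(Y^2)-\frac14=\frac{7}{24}-\frac{6}{24}=\frac{1}{24}\), so the sum of \(240\,000\) copies is approximately \(N(120\,000,\, 10\,000)\) and \(K\) is its upper quartile.

```python
from statistics import NormalDist

n = 240000
mean = n / 2           # E(Y) = 1/2
var = n / 24           # Var(Y) = 1/24, so the sum has variance 10000 (sd 100)
K = NormalDist(mean, var**0.5).inv_cdf(0.75)
print(round(K))        # about 120067
```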
Suppose \(X\) is a random variable with probability density
\[
\mathrm{f}(x)=Ax^{2}\exp(-x^{2}/2)
\]
for \(-\infty < x < \infty.\) Find \(A\).
You belong to a group of scientists who believe that the outcome of a certain experiment is a random variable with the probability density just given, while other scientists believe that the probability density is the same except with different mean (i.e. the probability density is \(\mathrm{f}(x-\mu)\) with \(\mu\neq0\)). In each of the following two cases decide whether the result given would shake your faith in your hypothesis, and justify your answer.
A single trial produces the result 87.3.
1000 independent trials produce results having a mean value \(0.23.\)
[Great weight will be placed on clear statements of your reasons and none on the mere repetition of standard tests, however sophisticated, if unsupported by argument. There are several possible approaches to this question. For some of them it is useful to know that if \(Z\) is normal with mean 0 and variance 1 then \(\mathrm{E}(Z^{4})=3.\)]
Solution
Let \(Z \sim N(0,1)\), with pdf \(\phi(x) = \frac{1}{\sqrt{2\pi}} \exp(-x^2/2)\). Then
\begin{align*}
&& 1 &= \int_{-\infty}^\infty Ax^2 \exp(-x^2/2) \d x \\
&&&= A\sqrt{2\pi} \int_{-\infty}^\infty x^2 \frac{1}{\sqrt{2\pi}} \exp(-x^2/2) \d x \\
&&&= A\sqrt{2\pi} \E[Z^2] = A\sqrt{2\pi} \\
\Rightarrow && A &= \frac{1}{\sqrt{2\pi}}
\end{align*}
The probability of seeing a result as extreme as \(87.3\) is \begin{align*}
\mathbb{P}(X > 87.3) &= \frac{1}{\sqrt{2\pi}}\int_{87.3}^{\infty} x^2 \exp(-x^2/2) \d x \\
&= \left [ -\frac{1}{\sqrt{2\pi}}x \exp(-x^2/2)\right]_{87.3}^{\infty}+\int_{87.3}^{\infty}\frac{1}{\sqrt{2\pi}} \exp(-x^2/2) \d x \\
&\approx 0 +(1- \Phi(87.3)) \\
&\approx 0
\end{align*}
It is very unlikely this data point has come from our distribution rather than one with a higher mean, therefore our faith is very shaken.
With 1000 independent trials, by the central limit theorem the sample mean \(S\) is approximately normal. Each observation has mean \(0\) and variance \(\E[X^2] = \int_{-\infty}^\infty x^4 \frac{1}{\sqrt{2\pi}} \exp(-x^2/2) \d x = \E[Z^4] = 3\), so \(S\) is approximately \(N(0, 3/1000)\). The probability of the sample mean exceeding \(0.23\) is
\begin{align*}
&& \mathbb{P}(S > 0.23) &= \mathbb{P}\left (Z > \frac{0.23}{\sqrt{3/1000}} \right) \\
&&&= \mathbb{P}\left (Z > \frac{0.23}{\sqrt{30}/100} \right) \\
&&&\approx \mathbb{P}\left (Z > \frac{0.23}{0.055} \right) \\
&&& \approx 0
\end{align*}
Again, our faith should be shaken: a sample mean of \(0.23\) is over four standard deviations from zero.
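Putting numbers on that conclusion (a sketch of the same calculation):

```python
from statistics import NormalDist

# Under the group's hypothesis the sample mean of 1000 trials is
# approximately N(0, 3/1000); how surprising is a mean of 0.23?
sd = (3 / 1000)**0.5                    # about 0.055
z = 0.23 / sd                           # about 4.2 standard deviations
p_two_sided = 2 * (1 - NormalDist().cdf(z))
print(z, p_two_sided)                   # a tail probability of a few in 100000
```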
The average number of pedestrians killed annually in road accidents in Poldavia during the period 1974-1989 was 1080 and the average number killed annually in commercial flight accidents during the same period was 180. Discuss the following newspaper headlines which appeared in 1991. (The percentage figures in square brackets give a rough indication of the weight of marks attached to each discussion.)
[\(10\%\)] Six Times Safer To Fly Than To Walk. 1974-1989 Figures Prove It.
[\(10\%\)] Our Skies Are Safer. Only 125 People Killed In Air Accidents In 1990.
[\(30\%\)] Road Carnage Increasing. 7 People Killed On Tuesday.
[\(50\%\)] Alarming Rise In Pedestrian Casualties. 1350 Pedestrians Killed In Road Accidents During 1990.
We cannot conclude this: the figures are counts, not rates, and we do not know how many people were flying or walking (or for how long) each year.
This is difficult to assess without knowing the variance of the annual count. Air-accident deaths are likely to have a heavily skewed distribution (one big crash causes many deaths, infrequently), so a single year below the 1974-1989 average of 180 is weak evidence that the skies are safer, although the figure is substantially lower.
With 1080 deaths annually we should expect about 3 deaths per day. A day with \(7\) deaths might seem unlikely, but over the course of a year such a day is very likely to occur (perhaps the weather was bad). It is also probably a case of selective reporting: we are seeing this data point because it is notable, not because it is statistically significant.
This is certainly the most alarming: a roughly \(25\%\) increase is very unlikely by chance alone. (We would expect the annual count to be approximately \(\mathrm{Po}(1080)\), hence approximately \(N(1080, 1080)\), and \(1350\) is many standard deviations above the mean.) However, other factors could drive this: more walking, a larger population, a change in reporting, and so on.
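Making the \(N(1080,1080)\) check explicit (a sketch):

```python
from statistics import NormalDist

mean = 1080
sd = mean**0.5                 # Poisson: variance equals the mean
z = (1350 - mean) / sd         # about 8.2 standard deviations
p = NormalDist().cdf(-z)       # upper-tail probability by symmetry
print(z, p)                    # the tail probability is essentially zero
```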
A fair coin is thrown \(n\) times. On each throw, 1 point is scored for a head and 1 point is lost for a tail. Let \(S_{n}\) be the points total for the series of \(n\) throws, i.e. \(S_{n}=X_{1}+X_{2}+\cdots+X_{n},\)
where
\[
X_{j}=\begin{cases}
1 & \text{ if the }j \text{ th throw is a head}\\
-1 & \text{ if the }j\text{ th throw is a tail.}
\end{cases}
\]
If \(n=10\,000,\) find an approximate value for the probability that
\(S_{n}>100.\)
Find an approximate value for the least \(n\) for which \(\mathrm{P}(S_{n}>0.01n)<0.01.\)
Suppose that instead no points are scored for the first throw, but that on each successive throw, 2 points are scored if both it and the first throw are heads, two points are deducted if both are tails, and no points are scored or lost if the throws differ. Let \(Y_{k}\) be the score on the \(k\)th throw, where \(2\leqslant k\leqslant n.\)
Show that \(Y_{k}=X_{1}+X_{k}.\)
Calculate the mean and variance of each \(Y_{k}\) and determine whether it is true that
\[
\mathrm{P}(Y_{2}+Y_{3}+\cdots+Y_{n}>0.01(n-1))\rightarrow0\quad\mbox{ as }n\rightarrow\infty.
\]
Solution
Notice that \(\mathbb{E}(X_i) = 0, \mathbb{E}(X_i^2) = 1\) and so \(\mathbb{E}(S_n) =0, \textrm{Var}(S_n) = n\).
Then by the central limit theorem (or alternatively the normal approximation to the binomial),
\begin{align*}
&& \mathbb{P}(S_n > 100) &\underbrace{\approx}_{\text{CLT}} \mathbb{P} \left (Z > \frac{100}{\sqrt{10\, 000}} \right) \\
&&&= \mathbb{P}(Z > 1) \\
&&&= 1-\Phi(1) \\
&&&\approx 15.9\%
\end{align*}