12 problems found
A batch of \(N\) USB sticks is to be used on a network. Each stick has the same unknown probability \(p\) of being infected with a virus. Each stick is infected, or not, independently of the others. The network manager decides on an integer value of \(T\) with \(0 \leqslant T < N\). If \(T = 0\) no testing takes place and the \(N\) sticks are used on the network, but if \(T > 0\), the batch is subject to the following procedure.
Let \(X\) be a random variable with a Laplace distribution, so that its probability density function is given by \[ \f(x) = \frac12 \e^{-\vert x \vert }\;, \text{ \(-\infty < x < \infty \)}. \tag{\(*\)} \] Sketch \(\f(x)\). Show that its moment generating function \({\rm M}_X(\theta)\) is given by \({\rm M}_X(\theta)= (1-\theta^2)^{-1}\) and hence find the variance of \(X\). A frog is jumping up and down, attempting to land on the same spot each time. In fact, in each of \(n\) successive jumps he always lands on a fixed straight line but when he lands from the \(i\)th jump (\(i=1\,,2\,,\ldots\,,n\)) his displacement from the point from which he jumped is \(X_i\,\)cm, where \(X_i\) has the distribution \((*)\). His displacement from his starting point after \(n\) jumps is \(Y\,\)cm (so that \(Y=\sum\limits_{i=1}^n X_i\)). Each jump is independent of the others. Obtain the moment generating function for \(Y/ \sqrt {2n}\) and, by considering its logarithm, show that this moment generating function tends to \(\exp(\frac12\theta^2)\) as \(n\to\infty\). Given that \(\exp(\frac12\theta^2)\) is the moment generating function of the standard Normal random variable, estimate the least number of jumps such that there is a \(5\%\) chance that the frog lands 25 cm or more from his starting point.
Solution:
The number of printing errors on any page of a large book of \(N\) pages is modelled by a Poisson variate with parameter \(\lambda\) and is statistically independent of the number of printing errors on any other page. The number of pages in a random sample of \(n\) pages (where \(n\) is much smaller than \(N\) and \(n\ge2\)) which contain fewer than two errors is denoted by \(Y\). Show that \(\P(Y=k) = \binom n k p^kq^{n-k}\) where \(p=(1+\lambda)e^{-\lambda}\) and \(q=1-p\,\). Show also that, if \(\lambda\) is sufficiently small,
Solution: First notice that the probability a page contains fewer than two errors is \(\mathbb{P}(X < 2)\) where \(X \sim Po(\lambda)\), i.e. \(\mathbb{P}(X<2) = e^{-\lambda} + \lambda e^{-\lambda} = (1+\lambda)e^{-\lambda}\). Since the numbers of errors on different pages are independent, the number of pages \(Y\) with fewer than two errors out of our sample of \(n\) is \(Bin(n, p)\) where \(p\) is as above, i.e. \(\mathbb{P}(Y = k) = \binom{n}{k} p^kq^{n-k}\).
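The binomial model above can be checked numerically. The following is a minimal sketch, not part of the original solution; the values of `lam` and `n` are illustrative assumptions and the helper names are hypothetical.

```python
from math import comb, exp


def p_fewer_than_two(lam: float) -> float:
    """P(X < 2) for X ~ Poisson(lam), i.e. (1 + lam) * exp(-lam)."""
    return (1 + lam) * exp(-lam)


def pmf_Y(k: int, n: int, lam: float) -> float:
    """P(Y = k) for Y ~ Bin(n, p) with p = P(X < 2)."""
    p = p_fewer_than_two(lam)
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Illustrative values only (assumptions, not taken from the question).
lam, n = 0.3, 10
probs = [pmf_Y(k, n, lam) for k in range(n + 1)]
total = sum(probs)  # a valid pmf sums to 1 (up to rounding)
```

Summing the probabilities over \(k = 0, \ldots, n\) is a simple way to confirm that the formula defines a valid distribution.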
Solution: \begin{align*} \P(X \leq 0.8) &= \P(X_1 \leq 0.8,X_2 \leq 0.8,X_3 \leq 0.8) \\ &= 0.8^3 \\ &= 0.512 \end{align*} \begin{align*} && \P(X < c) &= c^3 \\ \Rightarrow && f_X(x) &= 3x^2 \\ \Rightarrow && \E[X] &= \int_0^1 x \cdot (3x^2) \, dx \\ && &= \left [ \frac{3}{4}x^4 \right]_0^1 \\ &&&= \frac{3}{4} \end{align*} \(X\) is distributed as the maximum of \(N\) numbers on \([0,a]\). \begin{align*} H_0 : & a= 1 \\ H_1 : & a < 1 \end{align*} Under \(H_0\), \begin{align*} &&\P(X < c) &= c^N \\ &&&= \frac1{20} \\ \Rightarrow && N &= -\frac{\log(20)}{\log(c)} \end{align*} where \(c = 0.8\), so we have \begin{align*} N &= \frac{\log(20)}{\log(5/4)} \\ &= \frac{\log(5)+\log(4)}{\log(5)-\log(4)} \\ &= \frac{ \frac{\log(5)}{\log(4)}+1}{\frac{\log(5)}{\log(4)} - 1} \end{align*} To estimate \(\frac{\log(5)}{\log(4)}\), use \begin{align*} && 2^{10} &\approx 10^{3} \\ && 10\log(2) &\approx 3 (\log(5) + \log(2)) \\ && 7\log(2) &\approx 3 \log(5) \\ && \frac{\log(5)}{2\log(2)} &\approx \frac{7}{6} \end{align*} so that \begin{align*} N &= \frac{ \frac{\log(5)}{\log(4)}+1}{\frac{\log(5)}{\log(4)} - 1} \\ &\approx \frac{\frac{7}{6} + 1}{\frac{7}{6} -1} \\ &= 13 \end{align*} Since \(2^{10} > 10^3\), in fact \(\frac{\log(5)}{\log(4)} < \frac76\), so the true value is slightly larger than \(13\) and \(N=14\) is the value we seek. Finally, \(\P(X < 0.8 | a= 0.8) = 1\) and \(\P(X < 0.8 | a= 0.9, N=14) = \frac{8^{14}}{9^{14}}\).
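The hand estimate of \(N\) can be confirmed directly. This is a small sketch assuming, as in the solution, a critical value \(c = 0.8\) and a \(5\%\) level; `least_N` is a hypothetical helper name.

```python
from math import ceil, log


def least_N(c: float, level: float) -> int:
    """Smallest integer N with c**N <= level, for 0 < c < 1."""
    return ceil(log(level) / log(c))


N = least_N(0.8, 1 / 20)

# P(X < 0.8 | a = 0.9, N): all N values fall below 0.8 when uniform on [0, 0.9].
prob_below_at_09 = (0.8 / 0.9) ** N
```

Here `N` comes out as 14, matching the argument above: \(0.8^{13}\) is still above \(\frac1{20}\) while \(0.8^{14}\) is below it.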
In this question, \(\Phi(z)\) is the cumulative distribution function of a standard normal random variable. A random variable \(X\) is known to have a Normal distribution with mean \(\mu\) and standard deviation either \(\sigma_0\) or \(\sigma_1\), where \(\sigma_0 < \sigma_1\,\). The mean, \(\overline{X}\), of a random sample of \(n\) values of \(X\) is to be used to test the hypothesis \(\mathrm{H}_0: \sigma = \sigma_0\) against the alternative \(\mathrm{H}_1: \sigma = \sigma_1\,\). Explain carefully why it is appropriate to use a two sided test of the form: accept \(\mathrm{H}_0\) if \(\mu - c < \overline{X} < \mu+c\,\), otherwise accept \(\mathrm{H}_1\). Given that the probability of accepting \(\mathrm{H}_1\) when \(\mathrm{H}_0\) is true is \(\alpha\), determine \(c\) in terms of \(n\), \(\sigma_0\) and \(z_{\alpha}\), where \(z_\alpha \) is defined by \(\displaystyle\Phi(z_{\alpha}) = 1 - \tfrac{1}{2}\alpha\). The probability of accepting \(\mathrm{H}_0\) when \(\mathrm{H}_1\) is true is denoted by \(\beta\). Show that \(\beta\) is independent of \(n\). Given that \(\Phi(1.960)\approx 0.975\) and that \(\Phi(0.063) \approx 0.525\,\), determine, approximately, the minimum value of \(\displaystyle \frac{\sigma_1}{\sigma_0}\) if \(\alpha\) and \(\beta\) are both to be less than \(0.05\,\).
Solution: If \(\sigma\) is smaller we should expect our sample to have a mean closer to the true mean. Therefore we should use a two sided test which accepts \(\mathrm{H}_0\) if the sample mean is very close to the true mean. Suppose \(\textrm{H}_0\) is true, i.e. \(\sigma = \sigma_0\), then note that \(\overline{X} \sim N(\mu, \frac{\sigma_0^2}{n})\) \begin{align*} && 1-\alpha &= \mathbb{P}(\mu - c < \overline{X} < \mu + c) \\ &&&= \mathbb{P}(\mu - c < \frac{\sigma_0}{\sqrt{n}} Z + \mu < \mu + c) \\ &&&= \mathbb{P}(- \frac{c\sqrt{n}}{\sigma_0} < Z<\frac{\sqrt{n}c}{\sigma_0}) \\ &&&= \mathbb{P}(Z<\frac{\sqrt{n}c}{\sigma_0}) -\mathbb{P}( Z<-\frac{\sqrt{n}c}{\sigma_0}) \\ &&&= \mathbb{P}(Z<\frac{\sqrt{n}c}{\sigma_0}) -(1-\mathbb{P}( Z<\frac{\sqrt{n}c}{\sigma_0})) \\ &&&= 2\mathbb{P}(Z<\frac{\sqrt{n}c}{\sigma_0})-1 \\ \Rightarrow && \Phi(\frac{\sqrt{n}c}{\sigma_0})&=1 - \tfrac12 \alpha \\ \Rightarrow && \frac{\sqrt{n}c}{\sigma_0} &= z_{\alpha} \\ && c &= \frac{\sigma_0 z_{\alpha}}{\sqrt{n}} \end{align*} Under \(\mathrm{H}_1\), \(\sigma = \sigma_1\) so \begin{align*} && \beta &= \mathbb{P}(\mu - c < \overline{X} < \mu + c) \\ &&&= \mathbb{P}(-\frac{c\sqrt{n}}{\sigma_1} < Z < \frac{\sqrt{n}c}{\sigma_1}) \\ &&&= \mathbb{P}(-\frac{\sigma_0}{\sigma_1} z_{\alpha}< Z < \frac{\sigma_0}{\sigma_1} z_{\alpha}) \\ &&&= 2\Phi(\frac{\sigma_0}{\sigma_1} z_{\alpha})-1 \end{align*} which does not depend on \(n\). Suppose both \(\alpha<0.05\) and \(\beta<0.05\), then \(z_{\alpha} > 1.96\) and \(\Phi(\frac{\sigma_0}{\sigma_1}z_{\alpha})<0.525\), so \(\frac{\sigma_0}{\sigma_1}1.96 < \frac{\sigma_0}{\sigma_1}z_{\alpha} < 0.063 \Rightarrow \frac{\sigma_1}{\sigma_0} > \frac{1.96}{0.063} = 31.1 \) so the ratio of standard deviations needs to be larger than \(31.1\).
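The bound on \(\frac{\sigma_1}{\sigma_0}\) can be cross-checked without the rounded table values. The sketch below assumes Python's standard-library `statistics.NormalDist`; the function names `beta` and `boundary_ratio` are hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def beta(ratio: float, alpha: float) -> float:
    """Type II error 2*Phi(z_alpha / ratio) - 1, where ratio = sigma_1 / sigma_0."""
    z_alpha = Z.inv_cdf(1 - alpha / 2)
    return 2 * Z.cdf(z_alpha / ratio) - 1


def boundary_ratio(alpha: float, beta_target: float) -> float:
    """Value of sigma_1 / sigma_0 at which beta equals beta_target exactly."""
    z_alpha = Z.inv_cdf(1 - alpha / 2)
    return z_alpha / Z.inv_cdf((1 + beta_target) / 2)


r = boundary_ratio(0.05, 0.05)
```

With unrounded quantiles the boundary comes out a little above 31, consistent with the approximate value obtained from \(\Phi(1.960)\) and \(\Phi(0.063)\).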
The probability of throwing a 6 with a biased die is \(p\,\). It is known that \(p\) is equal to one or other of the numbers \(A\) and \(B\) where \(0 < A < B < 1 \,\). Accordingly the following statistical test of the hypothesis \(H_0: \,p=B\) against the alternative hypothesis \(H_1: \,p=A\) is performed. The die is thrown repeatedly until a 6 is obtained. Then if \(X\) is the total number of throws, \(H_0\) is accepted if \(X \le M\,\), where \(M\) is a given positive integer; otherwise \(H_1\) is accepted. Let \({\alpha}\) be the probability that \(H_1\) is accepted if \(H_0\) is true, and let \({\beta}\) be the probability that \(H_0\) is accepted if \(H_1\) is true. Show that \({\beta} = 1- {\alpha}^K,\) where \(K\) is independent of \(M\) and is to be determined in terms of \(A\) and \(B\,\). Sketch the graph of \({\beta}\) against \({\alpha}\,\).
Solution: \(X \sim Geo(p)\). \(\alpha = \mathbb{P}(X > M | p = B) = (1-B)^{M}\) \(\beta = \mathbb{P}(X \leq M | p = A) = 1 - \mathbb{P}(X > M | p = A) = 1 - (1-A)^{M}\) \begin{align*} \ln \alpha &= M \ln(1-B) \\ \ln (1-\beta) &= M \ln(1-A) \\ \frac{\ln \alpha}{\ln (1-\beta)} &= \frac{\ln(1-B)}{\ln(1-A)} \\ \ln(1-\beta) &= \ln \alpha \frac{\ln (1-A)}{\ln(1-B)} \\ \beta &= 1- \alpha^{ \frac{\ln (1-A)}{\ln(1-B)} } \end{align*} and \(K = \frac{\ln (1-A)}{\ln(1-B)} \) Since \(0 < A < B < 1\) we must have that \(0 < 1 - B < 1-A < 1\) and \(\ln(1-B) < \ln(1-A) < 0\) so \(0 < K < 1\)
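The relation \(\beta = 1 - \alpha^K\) can be checked numerically for several values of \(M\). This is a minimal sketch; the values `A = 0.1` and `B = 0.3` are illustrative assumptions, not from the question.

```python
from math import log


def error_probs(A: float, B: float, M: int) -> tuple[float, float]:
    """Return (alpha, beta) for the test 'accept H0 if X <= M'."""
    alpha = (1 - B) ** M       # P(X > M | p = B)
    beta = 1 - (1 - A) ** M    # P(X <= M | p = A)
    return alpha, beta


def K(A: float, B: float) -> float:
    """Exponent K = ln(1 - A) / ln(1 - B), which does not involve M."""
    return log(1 - A) / log(1 - B)


A, B = 0.1, 0.3  # illustrative values with 0 < A < B < 1
# Largest discrepancy between beta and 1 - alpha**K over a range of M.
max_gap = max(
    abs(b - (1 - a ** K(A, B)))
    for a, b in (error_probs(A, B, M) for M in range(1, 30))
)
```

Because \(0 < K < 1\), the curve \(\beta = 1 - \alpha^K\) falls from \(\beta = 1\) at \(\alpha = 0\) to \(\beta = 0\) at \(\alpha = 1\), dropping steeply near \(\alpha = 0\).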
Suppose \(X\) is a random variable with probability density \[ \mathrm{f}(x)=Ax^{2}\exp(-x^{2}/2) \] for \(-\infty < x < \infty.\) Find \(A\). You belong to a group of scientists who believe that the outcome of a certain experiment is a random variable with the probability density just given, while other scientists believe that the probability density is the same except with different mean (i.e. the probability density is \(\mathrm{f}(x-\mu)\) with \(\mu\neq0\)). In each of the following two cases decide whether the result given would shake your faith in your hypothesis, and justify your answer.
Solution: Let \(Z \sim N(0,1)\), with a pdf of \(f(x) = \frac{1}{\sqrt{2\pi}} \exp(-x^2/2)\) \begin{align*} && 1 &= \int_{-\infty}^\infty Ax^2 \exp(-x^2/2) \d x \\ &&&= A\sqrt{2\pi} \int_{-\infty}^\infty x^2 \frac{1}{\sqrt{2\pi}} \exp(-x^2/2) \d x \\ &&&= A\sqrt{2\pi} \E[Z^2] = A\sqrt{2\pi} \\ \Rightarrow && A &= \frac{1}{\sqrt{2\pi}} \end{align*}
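As a rough numerical check of the normalising constant, the integral can be approximated directly. The sketch below uses a simple composite midpoint rule on \([-12, 12]\) (an assumption: the tails beyond this range are negligible); `integrate` is a hypothetical helper.

```python
from math import exp, pi, sqrt


def integrate(f, a: float, b: float, steps: int = 20_000) -> float:
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


# Integral of x^2 exp(-x^2 / 2); the tails beyond |x| = 12 are negligible.
integral = integrate(lambda x: x * x * exp(-x * x / 2), -12.0, 12.0)

A_numeric = 1 / integral
A_exact = 1 / sqrt(2 * pi)
```

The two values of \(A\) agree to several decimal places, which supports \(A = \frac{1}{\sqrt{2\pi}}\).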
The average number of pedestrians killed annually in road accidents in Poldavia during the period 1974-1989 was 1080 and the average number killed annually in commercial flight accidents during the same period was 180. Discuss the following newspaper headlines which appeared in 1991. (The percentage figures in square brackets give a rough indication of the weight of marks attached to each discussion.)
Solution:
A bus is supposed to stop outside my house every hour on the hour. From long observation I know that a bus will always arrive some time between 10 minutes before and ten minutes after the hour. The probability it arrives at a given instant increases linearly (from zero at 10 minutes before the hour) up to a maximum value at the hour, and then decreases linearly at the same rate after the hour. Obtain the probability density function of \(T\), the time in minutes after the scheduled time at which a bus arrives. If I get up when my alarm clock goes off, I arrive at the bus stop at 7.55am. However, with probability 0.5, I doze for 3 minutes before it rings again. In that case with probability 0.8 I get up then and reach the bus stop at 7.58am, or, with probability 0.2, I sleep a little longer, not reaching the stop until 8.02am. What is the probability that I catch a bus by 8.10am? I buy a louder alarm clock which ensures that I reach the stop at exactly the same time each morning. This clock keeps perfect time, but may be set to an incorrect time. If it is correct, the alarm goes off so that I should reach the stop at 7.55am. After 100 mornings I find that I have had to wait for a bus until after 9am (according to the new clock) on 5 occasions. Is this evidence that the new clock is incorrectly set? {[}The time of arrival of different buses are independent of each other.{]}
Solution: The probability density function will look like a triangle with base \(20\) minutes and therefore height \(\frac{1}{10}\) per minute, i.e.: \begin{align*} f_T(t) &= \begin{cases} \frac{1}{100}(t+10) & \text{if } -10 \leq t \leq 0 \\ \frac{1}{100}(10-t) & \text{if } 0 \leq t \leq 10 \\ 0 & \text{otherwise} \end{cases} \end{align*} I catch a bus by 8:10 exactly when the bus arrives after I reach the stop, so \begin{align*} \mathbb{P}(\text{catch bus}) &=0.5 \mathbb{P}(\text{bus arrives after 7:55})+0.4 \mathbb{P}(\text{bus arrives after 7:58}) + 0.1 \mathbb{P}(\text{bus arrives after 8:02}) \\ &= \frac12 \cdot \left (1 - \frac18 \right) + \frac{2}{5} \cdot \left ( 1 - \frac{4^2}{5^2} \cdot \frac{1}{2} \right) + \frac{1}{10} \cdot \frac{4^2}{5^2} \cdot \frac12 \\ &= \frac{1\,483}{2\,000} \\ &\approx 74\% \end{align*} With the new clock, if it is set correctly I reach the stop at 7:55 every morning. I avoid waiting until after 9:00 if I catch the 8:00 bus, or if I miss it and the next bus arrives by 9:00: \begin{align*} \mathbb{P}(\text{bus by 9:00}) &= \mathbb{P}(\text{bus arrives after 7:55}) + \mathbb{P}(\text{bus arrives before 7:55}) \, \mathbb{P}(\text{next bus arrives by 9:00}) \\ &= \frac78 + \frac18 \cdot \frac12 \\ &= \frac{15}{16} \end{align*} So with a correctly set clock I should expect to wait until after 9:00 on \(100 \cdot \frac1{16} = 6.25\) mornings out of \(100\), so \(5\) seems about right. (Using a binomial calculation, the probability of \(5\) or fewer such mornings is ~\(40\%\), which isn't suspicious.)
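Both bus calculations can be reproduced with exact fractions. This is a sketch under the triangular density given in the solution; times are measured in minutes relative to the hour, and `cdf_T` is a hypothetical helper name.

```python
from fractions import Fraction
from math import comb


def cdf_T(t) -> Fraction:
    """P(T <= t) for the triangular density on [-10, 10] (t in minutes)."""
    t = Fraction(t)
    if t <= -10:
        return Fraction(0)
    if t >= 10:
        return Fraction(1)
    if t <= 0:
        return (t + 10) ** 2 / 200
    return 1 - (10 - t) ** 2 / 200


# Arrival time at the stop (minutes after 8:00) -> probability of that arrival.
arrivals = {-5: Fraction(1, 2), -2: Fraction(2, 5), 2: Fraction(1, 10)}
p_catch = sum(w * (1 - cdf_T(t)) for t, w in arrivals.items())

# New clock: miss the 8:00 bus, and the next bus arrives after 9:00.
p_late = cdf_T(-5) * (1 - cdf_T(0))

# P(at most 5 late mornings out of 100) if the clock is set correctly.
tail = sum(
    comb(100, k) * p_late**k * (1 - p_late) ** (100 - k) for k in range(6)
)
```

Here `p_catch` is \(\frac{1483}{2000}\), `p_late` is \(\frac1{16}\), and `float(tail)` is roughly \(0.4\).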
The prevailing winds blow in a constant southerly direction from an enchanted castle. Each year, according to an ancient tradition, a princess releases 96 magic seeds from the castle, which are carried south by the wind before falling to rest. South of the castle lies one league of grassy parkland, then one league of lake, then one league of farmland, and finally the sea. If a seed falls on land it will immediately grow into a fever tree. (Fever trees do not grow in water). Seeds are blown independently of each other. The random variable \(L\) is the distance in leagues south of the castle at which a seed falls to rest (either on land or water). It is known that the probability density function \(\mathrm{f}\) of \(L\) is given by \[ \mathrm{f}(x)=\begin{cases} \frac{1}{2}-\frac{1}{8}x & \mbox{ for }0\leqslant x\leqslant4,\\ 0 & \mbox{ otherwise.} \end{cases} \] What is the mean number of fever trees which begin to grow each year?
Solution: \begin{align*} \mathbb{P}(\text{fever tree grows}) &= \mathbb{P}(0 \leq L \leq 1) + \mathbb{P}(2 \leq L \leq 3) \\ &= \int_0^1 \frac12 -\frac18 x \d x + \int_2^3 \frac12 - \frac18 x \d x \\ &= \left [\frac12 x - \frac1{16}x^2 \right]_0^1+ \left [\frac12 x - \frac1{16}x^2 \right]_2^3 \\ &= \frac12 - \frac1{16}+\frac32-\frac9{16} - 1 + \frac{4}{16} \\ &= \frac58 \end{align*} The expected number of fever trees is just \(96 \cdot \frac58 = 60\).
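The same calculation can be done with exact arithmetic. This is a small sketch using the cumulative distribution function \(\frac12 x - \frac1{16}x^2\) obtained by integrating the given density; `cdf_L` and `land` are hypothetical names.

```python
from fractions import Fraction


def cdf_L(x) -> Fraction:
    """P(L <= x) = x/2 - x^2/16, valid for 0 <= x <= 4."""
    x = Fraction(x)
    return x / 2 - x * x / 16


# Stretches of land (in leagues south of the castle): parkland and farmland.
land = [(0, 1), (2, 3)]
p_tree = sum(cdf_L(b) - cdf_L(a) for a, b in land)

# Each of the 96 seeds independently becomes a tree with probability p_tree.
mean_trees = 96 * p_tree
```

This gives `p_tree` equal to \(\frac58\) and `mean_trees` equal to \(60\).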
It is believed that the population of Ruritania can be described as follows:
Solution:
In Fridge football, each team scores two points for a goal and one point for a foul committed by the opposing team. In each game, for each team, the probability that the team scores \(n\) goals is \(\left(3-\left|2-n\right|\right)/9\) for \(0\leqslant n\leqslant4\) and zero otherwise, while the number of fouls committed against it will with equal probability be one of the numbers from \(0\) to \(9\) inclusive. The numbers of goals and fouls of each team are mutually independent. What is the probability that in some game a particular team gains more than half its points from fouls? In response to criticisms that the game is boring and violent, the ruling body increases the number of penalty points awarded for a foul, in the hope that this will cause large numbers of fouls to be less probable. During the season following the rule change, 150 games are played and on 12 occasions (out of 300) a team committed 9 fouls. Is this good evidence of a change in the probability distribution of the number of fouls? Justify your answer.
Solution: \begin{array}{c|c|c|c} k & \P(k \text{ goals}) & \P(\geq 2k+1 \text{ fouls}) & \P(k \text{ goals and } \geq 2k+1 \text{ fouls}) \\ \hline 0 & \frac{3-|2|}{9} = \frac19 & \frac{9}{10} & \frac{9}{90}\\ 1 & \frac{3-|2-1|}{9} = \frac29 & \frac{7}{10} & \frac{14}{90} \\ 2 & \frac{3-|2-2|}{9} = \frac39 & \frac{5}{10} & \frac{15}{90} \\ 3 & \frac{3-|2-3|}{9} = \frac29 & \frac{3}{10} & \frac{6}{90} \\ 4 & \frac{3-|2-4|}{9} = \frac19 & \frac{1}{10} & \frac{1}{90} \\ \hline &&& \frac{9+14+15+6+1}{90} = \frac12 \end{array} The probability a team scores more than half its points from fouls is \(\frac12\). Letting \(X\) be the number of times a team committed \(9\) fouls, then \(X \sim B(300, p)\). Consider two hypotheses: \(H_0: p = \frac1{10}\) and \(H_1: p < \frac1{10}\). The observed value is \(X = 12\), so under \(H_0\) we are interested in \(\P(X \leq 12)\). Since \(300 \cdot \frac{1}{10} = 30 > 5\) it is appropriate to use a normal approximation, \(N(30, 27)\). Therefore, \begin{align*} && \P(X \leq 12) &\approx \P(3\sqrt{3}Z + 30 \leq 12.5) \\ &&&= \P( Z \leq \frac{12.5-30}{3\sqrt{3}}) \\ &&&= \P(Z \leq \frac{-17.5}{3\sqrt{3}}) \\ &&&< \P(Z \leq -3) \end{align*} which is very small (about \(0.001\)). Therefore there is good evidence to believe there has been a change in the distribution of the number of fouls.
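The table can be verified by enumerating every (goals, fouls) pair, and the tail probability can be computed exactly instead of by normal approximation. This is a sketch only: it takes the observed count of 12 occasions out of 300 from the question, and `p_goals` and `binom_cdf` are hypothetical helper names.

```python
from fractions import Fraction
from math import comb


def p_goals(n: int) -> Fraction:
    """P(team scores n goals) = (3 - |2 - n|) / 9 for 0 <= n <= 4."""
    return Fraction(3 - abs(2 - n), 9)


P_FOULS = Fraction(1, 10)  # each foul count 0..9 is equally likely

# More than half the points come from fouls exactly when fouls > 2 * goals.
p_majority_fouls = sum(
    p_goals(g) * P_FOULS for g in range(5) for f in range(10) if f > 2 * g
)


def binom_cdf(k: int, n: int, p: Fraction) -> Fraction:
    """Exact P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))


# P(X <= 12) under H0: p = 1/10, with n = 300 team-games.
tail_12 = binom_cdf(12, 300, Fraction(1, 10))
```

The enumeration gives exactly \(\frac12\), and `float(tail_12)` is well below \(0.001\), in line with the conclusion that the observed count is strong evidence of a change.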