4  Exercises 4

Exercise 4.1 For a normal population with known variance \(\sigma^2\), answer the following questions:

  1. What is the confidence level for the interval \(\overline{x} - 2.14 \sigma/\sqrt{n} \leq \mu \leq \overline{x} + 2.14 \sigma/\sqrt{n}\)?
  2. What is the confidence level for the interval \(\overline{x} - 1.85 \sigma/\sqrt{n} \leq \mu \leq \overline{x} + 1.85 \sigma/\sqrt{n}\)?
  3. What is the confidence level for the one-sided interval \(\mu \leq \overline{x} + 1.96 \sigma/\sqrt{n}\)?
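
The link between a multiplier \(z\) and a confidence level can be checked numerically. The sketch below is only an illustration, written in Python with the standard library; the function names are hypothetical, and the value \(z = 1.645\) is chosen on purpose so that it is not one of the multipliers in the questions above.

```python
from statistics import NormalDist

def two_sided_level(z):
    """Confidence level of x_bar - z*sigma/sqrt(n) <= mu <= x_bar + z*sigma/sqrt(n)."""
    # P(-z <= Z <= z) for a standard normal Z
    return 2 * NormalDist().cdf(z) - 1

def one_sided_level(z):
    """Confidence level of the one-sided interval mu <= x_bar + z*sigma/sqrt(n)."""
    # P(Z >= -z) = P(Z <= z) by symmetry
    return NormalDist().cdf(z)

# Illustrative multiplier (an assumption, not taken from the exercise)
z = 1.645
print(round(two_sided_level(z), 3))  # two-sided level
print(round(one_sided_level(z), 3))  # one-sided level
```

The same multiplier gives a higher level for a one-sided interval than for a two-sided one, because only one tail is excluded.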

Exercise 4.2 A confidence interval estimate is desired for the gain in a circuit on a semiconductor device¹. Assume that gain is normally distributed with standard deviation \(\sigma = 20\).

  1. Describe the random experiment, the universe and the random variable \(X\).

  2. Let \(n\) be the sample size and \(\alpha\) the error risk. Give the expression of the confidence interval for the mean gain \(m\) as a function of \(n\), \(\sigma\) and \(\alpha\).

  3. How should the length of the confidence interval vary with respect to sample size and confidence level?

  4. Confirm your answer by giving the confidence interval of \(m\) for the following cases:

    • CI at \(95\%\) where \(n=10\) and \(\hat{m} = 1000\).
    • CI at \(95\%\) where \(n=25\) and \(\hat{m} = 1000\).
    • CI at \(99\%\) where \(n=10\) and \(\hat{m} = 1000\).
    • CI at \(99\%\) where \(n=25\) and \(\hat{m} = 1000\).
  5. What must be the sample size \(n\) if the length of the confidence interval at \(95\%\) is \(4\)?
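
Once the paper answers are written, they can be checked with a short script. The following is a minimal sketch assuming the usual known-variance interval \(\hat{m} \pm z\,\sigma/\sqrt{n}\), with \(z\) taken here as the upper quantile of order \(1-\alpha/2\) of the standard normal distribution. The function names and the numerical values (\(\hat{m}=50\), \(\sigma=5\), \(n=16\)) are hypothetical and deliberately differ from the exercise.

```python
import math
from statistics import NormalDist

def z_interval(mean_hat, sigma, n, alpha):
    """Two-sided CI of level 1 - alpha for a normal mean with known sigma."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper quantile, positive
    half_width = z * sigma / math.sqrt(n)
    return mean_hat - half_width, mean_hat + half_width

def sample_size_for_length(sigma, length, alpha):
    """Smallest n whose CI length 2*z*sigma/sqrt(n) does not exceed `length`."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil((2 * z * sigma / length) ** 2)

# Illustrative values only (assumptions, not the data of the exercise)
low, high = z_interval(mean_hat=50, sigma=5, n=16, alpha=0.05)
print(round(low, 3), round(high, 3))
print(sample_size_for_length(sigma=5, length=2, alpha=0.05))
```

Calling `z_interval` with a larger `n` or a larger `alpha` shortens the interval, which is the behaviour question 3 asks about.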

Exercise 4.3 In the French population, the percentage of individuals whose blood is Rh-negative is \(15\%\). In a representative sample of \(200\) French people, we observe that \(44\) are Rh-negative. Give a confidence interval at \(99\%\) for the proportion of Rh-negative individuals in the French population.

Exercise 4.4 The Bureau of Meteorology of the Australian Government provided the mean annual rainfall (in millimeters) in Australia for 1983–2002 as follows:²

499.2 499.3
555.2 340.6
398.8 522.8
391.9 469.9
453.4 527.2
459.8 565.5
483.7 584.1
417.6 727.3
469.2 558.6
452.4 338.6

Assume that the mean annual rainfall follows a normal distribution with unknown parameters.³ Construct a 95% confidence interval for the mean annual rainfall.
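
When the variance is unknown, the interval is usually built from the Student distribution: \(\overline{x} \pm t\, s/\sqrt{n}\), where \(t\) is the quantile of order \(1-\alpha/2\) of the \(t(n-1)\) distribution. The sketch below assumes that form. Because the Python standard library has no Student quantile function, the critical value is passed in as an argument read from a table. The function name and the toy data are hypothetical and are not the rainfall values.

```python
import math
from statistics import mean, stdev

def t_interval(data, t_crit):
    """Two-sided CI for a normal mean with unknown variance.

    t_crit is the quantile of order 1 - alpha/2 of t(n - 1),
    supplied by the caller (for example from a table).
    """
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (divisor n - 1)
    half_width = t_crit * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Toy data (an assumption for illustration); 2.776 is the tabulated
# quantile of order 0.975 of the t distribution with 4 degrees of freedom.
low, high = t_interval([1, 2, 3, 4, 5], t_crit=2.776)
print(round(low, 3), round(high, 3))
```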

Exercise 4.5 The percentage of titanium in an alloy used in aerospace castings is measured in 51 randomly selected parts.

The sample standard deviation is \(s = 0.37\). Construct a 95% two-sided confidence interval for \(\sigma\).
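
As a reminder of the standard construction, and assuming the titanium percentages are normally distributed (an assumption the exercise does not state explicitly), the pivot \((n-1)S^2/\sigma^2\) follows a \(\chi^2(n-1)\) distribution. Writing \(\chi^2_{q}(n-1)\) for the quantile of order \(q\) of that distribution, a two-sided interval of level \(1-\alpha\) takes the form

\[\sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}(n-1)}} \leq \sigma \leq \sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2}(n-1)}}\]

The larger quantile appears in the lower bound because \(\sigma\) sits in the denominator of the pivot.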

Lab

Exercise 4.6 (CI for the parameter \(\lambda\) of the exponential distribution) Let \(X_1,\ldots,X_n\) be a random sample from the exponential distribution with parameter \(\lambda\). The MLE of \(\lambda\) is \(T = n/(X_1+\ldots+X_n)\). If \(X\) is exponential with parameter \(\lambda\), the random variable \(Y=2\lambda X\) follows an exponential distribution with parameter \(1/2\), which is also a \(\chi^2(2)\) distribution (proof). Hence the variables \(Y_i = 2\lambda X_i\), \(i=1,\ldots,n\), are iid with a \(\chi^2(2)\) distribution.

Let \(S=2\lambda \sum_{i=1}^n X_i = \sum_{i=1}^n Y_i\) (a different quantity from the estimator \(T\)). Then \(S\) follows the \(\chi^2(2n)\) distribution.
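
The two distributional claims above can be verified in a few lines. This is a short sketch, assuming \(\lambda\) is the rate parameter, so that \(P(X > x) = e^{-\lambda x}\) for \(x \geq 0\). For \(y \geq 0\),

\[P(Y > y) = P\Big(X > \frac{y}{2\lambda}\Big) = e^{-\lambda \cdot \frac{y}{2\lambda}} = e^{-y/2}\]

which is the survival function of an exponential distribution with parameter \(1/2\). Its density \(\frac{1}{2}e^{-y/2}\) coincides with the \(\chi^2(2)\) density. Since the degrees of freedom of independent \(\chi^2\) variables add up, the sum \(\sum_{i=1}^n Y_i\) of \(n\) independent \(\chi^2(2)\) variables follows a \(\chi^2(2n)\) distribution.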

  1. Construct a CI for \(\lambda\) (on paper).
  2. A theoretical model suggests that the duration of phone calls follows an exponential distribution with parameter \(\lambda\). A random sample of \(n = 10\) call durations (in minutes):
2.84 2.37 7.52 2.76 3.83 1.32 8.43 2.25 1.63 0.27

Compute a CI for \(\lambda\) and a CI for the mean call duration \(1/\lambda\) based on this sample.

Simulation

  1. Now choose a value of \(\lambda\). Draw \(100\) samples of size \(20\) from an exponential distribution of parameter \(\lambda\).
  2. Calculate the \(100\) values taken by the estimator \(T\) on these samples.
  3. For \(\alpha=0.1\), then \(0.05\), then \(0.01\), calculate the values taken by the \(100\) two-sided confidence intervals of level \(1-\alpha\) for \(\lambda\).
  4. Calculate in percent the number of intervals that contain the value of \(\lambda\).
  5. Plot the intervals as superimposed horizontal segments, and the true value of the parameter \(\lambda\) as a vertical red line. (In R you can use matplot() and abline(); you can also use different colors for the CIs that contain \(\lambda\) and those that do not.)
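
Steps 1 to 4 of the simulation can be sketched as follows. This is one possible outline, not a reference solution. It is written in Python with the standard library only, and it assumes that the interval is obtained by inverting the \(\chi^2(2n)\) pivot \(2\lambda\sum X_i\), giving bounds \(\chi^2_{\alpha/2}(2n)/(2\sum x_i)\) and \(\chi^2_{1-\alpha/2}(2n)/(2\sum x_i)\). All function names and the value \(\lambda = 2\) are hypothetical. The \(\chi^2\) quantiles are computed with a closed-form CDF that is valid only for even degrees of freedom, which is always the case here.

```python
import math
import random

def chi2_cdf_even(x, dof):
    """CDF of chi-square(dof) for EVEN dof, via the Erlang (Poisson-sum) identity."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, dof // 2):
        term *= half / k          # (x/2)^k / k!
        total += term
    return 1.0 - math.exp(-half) * total

def chi2_quantile_even(q, dof, tol=1e-10):
    """Quantile of order q of chi-square(dof), even dof, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf_even(hi, dof) < q:   # grow the bracket until it contains q
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if chi2_cdf_even(mid, dof) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lambda_interval(sample, alpha):
    """Two-sided CI of level 1 - alpha for the rate of an exponential sample."""
    n, total = len(sample), sum(sample)
    lower = chi2_quantile_even(alpha / 2, 2 * n) / (2 * total)
    upper = chi2_quantile_even(1 - alpha / 2, 2 * n) / (2 * total)
    return lower, upper

def coverage_percent(lam, n, n_samples, alpha, seed=0):
    """Percentage of simulated intervals that contain the true rate lam."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        sample = [rng.expovariate(lam) for _ in range(n)]  # lam is the rate
        lower, upper = lambda_interval(sample, alpha)
        hits += lower <= lam <= upper
    return 100.0 * hits / n_samples

# Illustrative run: lam = 2.0 is an arbitrary choice
for alpha in (0.1, 0.05, 0.01):
    print(alpha, coverage_percent(lam=2.0, n=20, n_samples=100, alpha=alpha))
```

The function `lambda_interval` can also be applied to an observed sample such as the call durations. Since \(1/\lambda\) is a decreasing function of \(\lambda\), the reciprocals of the two bounds, in reverse order, bound the mean duration.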

Exercise 4.7 (CI for a proportion and effect of confidence) Let \(X_1,\ldots,X_n\) be a random sample from the Bernoulli distribution with parameter \(p\). The method-of-moments estimator (MME) and the MLE of \(p\) are both \(\hat{p}_n = \overline{X}_n=\frac{1}{n} \sum_{i=1}^n X_i\). If \(np > 5\) and \(n(1-p)>5\), the approximate \(1-\alpha\) confidence interval for \(p\) is

\[CI_{1-\alpha}(p)= \Big[\hat{p}_n + z_{\alpha/2} \sqrt{\frac{\hat{p}_n(1-\hat{p}_n)}{n}}, \hat{p}_n - z_{\alpha/2} \sqrt{\frac{\hat{p}_n(1-\hat{p}_n)}{n}}\Big]\]

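where \(z_{\alpha/2}\) denotes the quantile of order \(\alpha/2\) of the standard Normal distribution. This quantile is negative for \(\alpha < 1\), which is why the lower bound carries the plus sign.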

  1. Choose a value of \(p\), strictly between \(0\) and \(1\). Draw \(100\) samples of size \(100\) from the Bernoulli distribution of parameter \(p\).
  2. For \(\alpha=0.1\), then \(0.05\), then \(0.01\), calculate the values taken by the \(100\) two-sided confidence intervals of level \(1-\alpha\) for \(p\).
  3. Calculate as a percentage the number of intervals that contain the value of \(p\). Interpret.
  4. Plot the intervals as overlapping horizontal segments, and the true value of parameter \(p\) as a vertical red line.
  5. Repeat the previous simulation, drawing \(100\) samples of size \(20\) instead of \(100\) and choosing \(p\) between \(0\) and \(0.2\). For \(\alpha=0.05\), calculate the number of intervals that do not contain the value of \(p\). Interpret.
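
The simulation steps above can be outlined in code. The sketch below is a minimal illustration in Python (standard library only), using the interval formula of this exercise with \(z_{\alpha/2}\) as the lower, negative quantile. The function names and the value \(p = 0.3\) are hypothetical choices, and the plot of step 4 is left out.

```python
import math
import random
from statistics import NormalDist

def wald_interval(p_hat, n, alpha):
    """Approximate CI of level 1 - alpha for a proportion."""
    z = NormalDist().inv_cdf(alpha / 2)   # quantile of order alpha/2, negative
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat + z * se, p_hat - z * se  # (lower bound, upper bound)

def coverage_percent(p, n, n_samples, alpha, seed=0):
    """Percentage of simulated intervals that contain the true p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        # One Bernoulli(p) sample of size n, summarised by its success count
        successes = sum(rng.random() < p for _ in range(n))
        lower, upper = wald_interval(successes / n, n, alpha)
        hits += lower <= p <= upper
    return 100.0 * hits / n_samples

# Illustrative run: p = 0.3 is an arbitrary choice
for alpha in (0.1, 0.05, 0.01):
    print(alpha, coverage_percent(p=0.3, n=100, n_samples=100, alpha=alpha))
```

Note that when a sample contains no success, \(\hat{p}_n = 0\) and the interval collapses to the single point \(0\). This is worth keeping in mind when running the last step with \(n = 20\) and a small \(p\).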

  1. In electronics, gain is a measure of the ability of a two-port circuit (often an amplifier) to increase the power or amplitude of a signal from the input to the output port (source: Wikipedia).

  2. Link

  3. This assumption, i.e. the normality of the data, can be checked with the normal probability plot, also known as the “Droite de Henry” or QQ-plot. It can be drawn in R using the function qqnorm().