6 Exercises 6

Exercise 6.1 The manager of an orange juice canning company wants to compare the performance of two canning lines at her plant. Since line \(1\) is relatively new, she expects it to produce, on average, a greater number of cases per day than line \(2\), which is older. Normal distribution is considered a good model for this variable. Ten production days are randomly selected for each chain. According to these data, \(\overline{x}_1=824.9\) cases per day and \(\overline{x}_2=818.6\) cases per day. We know from experience of operating this type of equipment that \(\sigma_1^2=40\) and \(\sigma_2^2=50\). Can the manager favor the 1st chain at \(\alpha = 5\%\) error threshold?

Exercise 6.2 A company producing wind-generated electricity wanted to compare the efficiency of two types of wind turbine: a two-bladed turbine (E2p) and a three-bladed turbine (E3p). To do this, we installed a wind turbine of each type on the same wind farm, and recorded the power output of each turbine (in kW) every 10 minutes.

In order to compare the output of the wind turbines, the statistical engineer randomly selected the following data from the database, independently for each turbine \(9\) power (in kW) from the database:

E2p	5	18	19	11	6	19	20	22	17
E3p	2	22	28	12	6	18	29	21	24

Define the two random variables under study.
Give a point estimate of the average power of each wind turbine.
Give a point estimate of the variance of the power of each wind turbine.
Give a \(95\%\) confidence interval for the mean power of each wind turbine.
Can we assume that the powers of the two turbines have the same variance?
Can we state, with a risk of error of \(1\%\), that the average power of the 3-bladed wind turbine is greater than the average power of the 2-bladed wind turbine?
Based on this study, can you recommend a particular type of wind turbine to the company?

Exercise 6.3 A manufacturer uses machines from two different manufacturers for his production. After six months’ use, he notes that out of \(80\) machines of type \(A\), \(50\) have never broken down, whereas for type \(B\) the proportion is \(40\) out of \(60\). Can these two types of machine be considered equivalent at the error threshold \(\alpha = 5\%\)?

Lab

We propose to study the weight of female octopuses. We will construct confidence intervals for the mean and variance of this variable, using the data file octopus.csv.

Retrieve the file poulpe.csv and load it into your session (R or Python).
Use the str() function to describe the structure of the imported data.

We wish to test the equality of the unknown theoretical averages of the female (\(\mu_1\)) and male (\(\mu_2\)) octopus weights, with a first order error set at \(5\%\).

Compare the two sub-populations graphically. (You can display two whisker boxes corresponding to weights according to octopus sex).
Estimate basic statistics (mean, standard deviation, etc.) by sub-population.
Perform a hypothesis test comparing two means using the t.test() function.

Tip

\(p\)-value

In practice, rather than calculating the critical region as a function of \(\alpha\), we prefer to give a critical threshold called \(p\)-value, which is the largest value of \(\alpha\) that would lead us not to reject \(H_0\). This information enables the reader to conclude that \(H_0\) should be accepted for any first-species risk \(\alpha \leq p\text{-value}\), and rejected for any \(\alpha > p\text{-value}\).