# Main References & Credits

Dalal, Siddhartha R., Edward B. Fowlkes, and Bruce Hoadley. 1989. “Risk Analysis of the Space Shuttle: Pre-Challenger Prediction of Failure.” Journal of the American Statistical Association 84 (408): 945–57. https://doi.org/10.1080/01621459.1989.10478858.
Fisher, R. A. 1936. “The Use of Multiple Measurements in Taxonomic Problems.” Annals of Eugenics 7 (2): 179–88. https://doi.org/10.1111/j.1469-1809.1936.tb02137.x.
Presidential Commission on the Space Shuttle Challenger Accident. 1986. Report of the Presidential Commission on the Space Shuttle Challenger Accident (Vols. 1 & 2). Washington, DC. http://history.nasa.gov/rogersrep/genindex.htm.

1. Also known as SSE: Sum of Squared Errors.↩︎

2. They always exist and are unique, provided the $$X_i$$ are not all equal. They can be obtained by solving $$\frac{\partial}{\partial \beta_0}\text{RSS}(\beta_0,\beta_1)=0$$ and $$\frac{\partial}{\partial \beta_1}\text{RSS}(\beta_0,\beta_1)=0$$.↩︎

3. If $$\beta_1 = 0$$ this means $$COV(X,Y)=0$$. Remember that a null covariance does not necessarily mean that $$X$$ and $$Y$$ are independent. It only means that there is no linear relationship between them: they may be independent, or they may be related in some other, nonlinear way.↩︎

4. Recall that SSR is different from RSS (Residual Sum of Squares).↩︎

5. Recall that SSE and RSS (for $$(\hat \beta_0,\hat \beta_1)$$) are just different names for referring to the same quantity: $$\text{SSE}=\sum_{i=1}^n\left(Y_i-\hat Y_i\right)^2=\sum_{i=1}^n\left(Y_i-\hat \beta_0-\hat \beta_1X_i\right)^2=\mathrm{RSS}\left(\hat \beta_0,\hat \beta_1\right)$$.↩︎

6. The $$F_{n,m}$$ distribution arises as the quotient of two independent random variables $$\chi^2_n$$ and $$\chi^2_m$$, each divided by its degrees of freedom: $$\frac{\chi^2_n/n}{\chi^2_m/m}$$.↩︎

7. It is important to make sure that $$\hat{\beta}$$ minimises the RSS.↩︎

8. Recall that ESS is the explained sum of squares, ESS = TSS - RSS.↩︎

9. This is more complex; it is included here only to clarify the anova output.↩︎

10. Recall that $$R^2 = 1- \frac{\text{RSS}}{\text{TSS}}$$↩︎

11. It is defined as $$R_{adj}^2 = 1- \frac{\text{RSS}/(n-p-1)}{\text{TSS}/(n-1)} = 1- \frac{\text{RSS}}{\text{TSS}}\times\frac{n-1}{n-p-1}$$↩︎

12. In the formula, $$\log$$ denotes the natural logarithm $$\ln$$.↩︎

13. Old Faithful is a hydrothermal geyser in Yellowstone National Park in the state of Wyoming, U.S.A., and a popular tourist attraction. Its name stems from the supposed regularity of its eruptions. The data set comprises 272 observations, each of which represents a single eruption and contains two variables: the duration of the eruption in minutes, and the time until the next eruption, also in minutes.↩︎

14. Source: the famous MOOC Statistical Learning↩︎

15. Source: Trevor Hastie’s website↩︎

16. Source: Marvin Wright’s talk from Why R? 2019↩︎

17. An Introduction to Recursive Partitioning Using the rpart Routines - Details of the rpart package.↩︎

18. rpart.plot Package - Detailed manual on plotting with rpart using the rpart.plot package.↩︎

19. For classification, a suggested value is mtry = $$\sqrt{p}$$, where $$p$$ is the number of predictors.↩︎

20. Generalized boosted models package.↩︎

21. For classification, the suggested mtry for a random forest is $$\sqrt{p}$$.↩︎

22. Old Faithful is a hydrothermal geyser in Yellowstone National Park in the state of Wyoming, U.S.A., and a popular tourist attraction. Its name stems from the supposed regularity of its eruptions. The data set comprises 272 observations, each of which represents a single eruption and contains two variables: the duration of the eruption in minutes, and the time until the next eruption, also in minutes.↩︎

23. Made by Joseph J. Allaire (https://github.com/jjallaire).↩︎