class: center, middle, inverse, title-slide

# Regression and correlation

---
layout: true

<div class="my-footer">
<span>
by Dr. Lucy D'Agostino McGowan
</span>
</div>

---

# Partitioning variability

![](img/06/partitioning.png)

---

# Why?

* `\(\Huge y - \bar{y} = (\hat{y} - \bar{y}) + (y - \hat{y})\)`

--

* `\(\Large \sum(y - \bar{y})^2 = \sum(\hat{y} - \bar{y})^2 + \sum(y - \hat{y})^2\)`  
  (for a least squares fit with an intercept, the cross term `\(\sum(\hat{y} - \bar{y})(y - \hat{y})\)` is 0)

--

* ## SSTotal = SSModel + SSE

---
class: middle

# Coefficient of determination

Often referred to as `\(\color{blue}r^2\)`, it is the fraction of the variability in the response that is explained by the model.

---

## Coefficient of determination

* `\(\Large r^2 = \frac{\textrm{Variability explained by the model}}{\textrm{Total variability in } y}\)`

--

* `\(\Large r^2 = \frac{\textrm{SSModel}}{\textrm{SSTotal}}\)`

--

* `\(\Large r^2 = \frac{\sum(\hat{y} - \bar{y})^2}{\sum(y-\bar{y})^2}\)`

--

* `\(\Large r^2 = \frac{\textrm{SSTotal} - \textrm{SSE}}{\textrm{SSTotal}}\)`

--

* `\(\Large r^2 = 1 - \frac{\textrm{SSE}}{\textrm{SSTotal}}\)`

---

## Let's do it in R!

```r
lm(Weight ~ WingLength, data = Sparrows) %>%
  glance()
```

```
## # A tibble: 1 x 11
##   r.squared adj.r.squared sigma statistic  p.value    df logLik   AIC   BIC
##       <dbl>         <dbl> <dbl>     <dbl>    <dbl> <int>  <dbl> <dbl> <dbl>
*## 1     0.614         0.611  1.40      181. 2.62e-25     2  -203.  411.  419.
## # … with 2 more variables: deviance <dbl>, df.residual <int>
```

About 61% of the variation in `Weight` is explained by `WingLength` (`\(r^2 = 0.614\)`).

---
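
## Checking the partition by hand

A minimal sketch, not part of the original analysis: it computes SSTotal, SSModel, and SSE directly and compares both forms of `\(r^2\)` with the value R reports. To stay self-contained it assumes the built-in `cars` data as a stand-in for `Sparrows`, and the names `ss_total`, `ss_model`, and `ss_error` are illustrative only.

```r
# Stand-in data: built-in `cars` (an assumption; the slides use Sparrows)
fit <- lm(dist ~ speed, data = cars)

y     <- cars$dist    # observed response
y_hat <- fitted(fit)  # fitted values from the model
y_bar <- mean(y)      # mean of the response

ss_total <- sum((y - y_bar)^2)      # SSTotal: total variability in y
ss_model <- sum((y_hat - y_bar)^2)  # SSModel: variability explained by the model
ss_error <- sum((y - y_hat)^2)      # SSE: residual variability

# SSTotal = SSModel + SSE, up to floating-point error
all.equal(ss_total, ss_model + ss_error)

# The two forms of r^2 agree with each other and with summary()
r2_model <- ss_model / ss_total
r2_error <- 1 - ss_error / ss_total
all.equal(r2_model, r2_error)
all.equal(r2_model, summary(fit)$r.squared)
```

Each `all.equal()` call should return `TRUE`. Swapping in `Weight ~ WingLength` and `Sparrows` should give the same kind of agreement with the `r.squared` column from `glance()`.

---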