response | predictor(s) | model |
---|---|---|
quantitative | one quantitative | simple linear regression |
quantitative | two or more (of either kind) | multiple linear regression |
binary | one (of either kind) | simple logistic regression |
binary | two or more (of either kind) | multiple logistic regression |
variables | linear predictor | ordinary regression response | logistic regression response |
---|---|---|---|
one: \(x\) | \(\beta_0 + \beta_1 x\) | \(y\) | \(\textrm{logit}(\pi)=\log\left(\frac{\pi}{1-\pi}\right)\) |
several: \(x_1,x_2,\dots,x_k\) | \(\beta_0 + \beta_1x_1 + \dots + \beta_kx_k\) | \(y\) | \(\textrm{logit}(\pi)=\log\left(\frac{\pi}{1-\pi}\right)\) |
Form | Model |
---|---|
Logit form | \(\log\left(\frac{\pi}{1-\pi}\right) = \beta_0 + \beta_1x_1 + \beta_2x_2 + \dots + \beta_kx_k\) |
Probability form | \(\pi = \frac{e^{\beta_0 + \beta_1x_1 + \beta_2x_2 + \dots + \beta_kx_k}}{1+e^{\beta_0 + \beta_1x_1 + \beta_2x_2 + \dots + \beta_kx_k}}\) |
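The two forms are equivalent: applying the inverse logit to the linear predictor returns the probability. A minimal base-R sketch (the coefficient and predictor values here are made up purely for illustration):

b0 <- -22.4; b1 <- 0.165; b2 <- 4.68   # illustrative coefficients
eta <- b0 + b1 * 36 + b2 * 3.8         # logit form: the log odds
exp(eta) / (1 + exp(eta))              # probability form, approx. 0.79
plogis(eta)                            # same value via R's built-in inverse logit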
library(Stat2Data)   # MedGPA data
library(broom)       # tidy(), glance()
library(dplyr)       # %>%

data(MedGPA)
glm(Acceptance ~ MCAT + GPA, data = MedGPA, family = "binomial") %>%
  tidy(conf.int = TRUE)
## # A tibble: 3 x 7
##   term        estimate std.error statistic  p.value conf.low conf.high
##   <chr>          <dbl>     <dbl>     <dbl>    <dbl>    <dbl>     <dbl>
## 1 (Intercept)  -22.4       6.45      -3.47 0.000527 -36.9     -11.2
## 2 MCAT           0.165     0.103      1.59 0.111     -0.0260    0.383
## 3 GPA            4.68       1.64      2.85 0.00439    1.74      8.27
What does this do?
library(knitr)       # kable()

glm(Acceptance ~ MCAT + GPA, data = MedGPA, family = "binomial") %>%
  tidy(conf.int = TRUE) %>%
  kable()
term | estimate | std.error | statistic | p.value | conf.low | conf.high |
---|---|---|---|---|---|---|
(Intercept) | -22.373 | 6.454 | -3.47 | 0.001 | -36.894 | -11.235 |
MCAT | 0.165 | 0.103 | 1.59 | 0.111 | -0.026 | 0.383 |
GPA | 4.676 | 1.642 | 2.85 | 0.004 | 1.739 | 8.272 |
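In equation form, the fitted model from this output is

\(\log\left(\frac{\hat\pi}{1-\hat\pi}\right) = -22.373 + 0.165 \cdot \textrm{MCAT} + 4.676 \cdot \textrm{GPA}\)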
What are the assumptions of multiple logistic regression?
How do you determine whether the conditions are met?
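One common check for the linearity (in the logit) condition is an empirical logit plot: bin a quantitative predictor, compute the observed log odds within each bin, and look for a roughly linear trend. A sketch, assuming MedGPA is loaded; the five bins and the 0.5 adjustment are conventional choices, not part of the original analysis:

library(dplyr)
library(ggplot2)

MedGPA %>%
  mutate(bin = cut_number(GPA, 5)) %>%              # five equal-count bins of GPA
  group_by(bin) %>%
  summarize(gpa_mid   = mean(GPA),
            emp_logit = log((sum(Acceptance) + 0.5) /
                            (n() - sum(Acceptance) + 0.5))) %>%
  ggplot(aes(x = gpa_mid, y = emp_logit)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)            # is the trend roughly linear?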
If I have two nested models, how do you think I can determine if the full model is significantly better than the reduced?
glm(Acceptance ~ GPA, data = MedGPA, family = binomial) %>% glance()
## # A tibble: 1 x 7
##   null.deviance df.null logLik   AIC   BIC deviance df.residual
##           <dbl>   <int>  <dbl> <dbl> <dbl>    <dbl>       <int>
## 1          75.8      54  -28.4  60.8  64.9     56.8          53
glm(Acceptance ~ GPA + MCAT, data = MedGPA, family = binomial) %>% glance()
## # A tibble: 1 x 7
##   null.deviance df.null logLik   AIC   BIC deviance df.residual
##           <dbl>   <int>  <dbl> <dbl> <dbl>    <dbl>       <int>
## 1          75.8      54  -27.0  60.0  66.0     54.0          52
56.83901 - 54.01419   # drop in deviance: reduced minus full
## [1] 2.82
pchisq(56.83901 - 54.01419, df = 1, lower.tail = FALSE)   # df = 1 predictor added
## [1] 0.0928
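The same drop-in-deviance (likelihood ratio) test can be done in one call with base R's anova() method for glm fits; a sketch, with reduced and full as assumed names for the two models above:

reduced <- glm(Acceptance ~ GPA,        data = MedGPA, family = binomial)
full    <- glm(Acceptance ~ GPA + MCAT, data = MedGPA, family = binomial)
anova(reduced, full, test = "Chisq")    # same G statistic (~2.82) and p-value (~0.093)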
glm(Acceptance ~ GPA, data = MedGPA, family = "binomial") %>% glance()
## # A tibble: 1 x 7
##   null.deviance df.null logLik   AIC   BIC deviance df.residual
##           <dbl>   <int>  <dbl> <dbl> <dbl>    <dbl>       <int>
## 1          75.8      54  -28.4  60.8  64.9     56.8          53
glm(Acceptance ~ GPA + MCAT + Apps, data = MedGPA, family = "binomial") %>% glance()
## # A tibble: 1 x 7
##   null.deviance df.null logLik   AIC   BIC deviance df.residual
##           <dbl>   <int>  <dbl> <dbl> <dbl>    <dbl>       <int>
## 1          75.8      54  -26.8  61.7  69.7     53.7          51
Here two predictors (MCAT and Apps) were added to the reduced model, so the drop in deviance is compared to a \(\chi^2\) distribution with df = 2.

pchisq(56.83901 - 53.68239, df = 2, lower.tail = FALSE)
## [1] 0.206
How do you interpret these \(\beta\) coefficients?
glm(Acceptance ~ MCAT + GPA, data = MedGPA, family = "binomial") %>% tidy(conf.int = TRUE)
## # A tibble: 3 x 7
##   term        estimate std.error statistic  p.value conf.low conf.high
##   <chr>          <dbl>     <dbl>     <dbl>    <dbl>    <dbl>     <dbl>
## 1 (Intercept)  -22.4       6.45      -3.47 0.000527 -36.9     -11.2
## 2 MCAT           0.165     0.103      1.59 0.111     -0.0260    0.383
## 3 GPA            4.68       1.64      2.85 0.00439    1.74      8.27
The coefficient for \(x\) is \(\hat\beta\) (95% CI: \(LB_{\hat\beta}, UB_{\hat\beta}\)). A one-unit increase in \(x\) yields an expected change of \(\hat\beta\) in the log odds of \(y\), holding all other variables constant.
The odds ratio for \(x\) is \(e^{\hat\beta}\) (95% CI: \(e^{LB_{\hat\beta}}, e^{UB_{\hat\beta}}\)). A one-unit increase in \(x\) yields an \(e^{\hat\beta}\)-fold expected change in the odds of \(y\), holding all other variables constant.
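broom can also report the odds-ratio scale directly; a sketch, assuming tidy()'s exponentiate argument, which exponentiates the estimates and confidence limits for glm fits:

glm(Acceptance ~ MCAT + GPA, data = MedGPA, family = "binomial") %>%
  tidy(conf.int = TRUE, exponentiate = TRUE)   # estimates become odds ratios

For example, \(e^{4.68} \approx 108\): each additional GPA point multiplies the odds of acceptance by about 108, holding MCAT constant.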
 | Ordinary regression | Logistic regression |
---|---|---|
test or interval for \(\beta\) | \(t = \frac{\hat\beta}{SE_{\hat\beta}}\) (t-distribution) | \(z = \frac{\hat\beta}{SE_{\hat\beta}}\) (z-distribution) |
test for nested models | \(F = \frac{\Delta SSModel / p}{SSE_{full} / (n - k - 1)}\) (F-distribution) | \(G = \Delta(-2\log\mathcal{L})\) (\(\chi^2\)-distribution) |
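As a check on the last row of the table, G can be computed directly from the fitted log-likelihoods; a sketch, reusing the reduced and full fits assumed above:

reduced <- glm(Acceptance ~ GPA,        data = MedGPA, family = binomial)
full    <- glm(Acceptance ~ GPA + MCAT, data = MedGPA, family = binomial)
G <- as.numeric(2 * (logLik(full) - logLik(reduced)))   # equals the drop in deviance, ~2.82
pchisq(G, df = 1, lower.tail = FALSE)                   # ~0.093, matching the earlier test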