class: center, middle, inverse, title-slide

# Logistic regression inference

---

layout: true

<div class="my-footer">
<span>
by Dr. Lucy D'Agostino McGowan
</span>
</div>

---

# <i class="fas fa-laptop"></i> `Putt Length`

- Go to RStudio Cloud and open `Putt Length`

---

## Inference

* Even if your check of conditions convinces you that the Bernoulli (_spinner_) model is not appropriate, you can still use logistic regression for **description**, _and sometimes for prediction_

--
* If the outcomes **are** random and independent, you can also do inference! 
  * test hypotheses
  * construct confidence intervals
  
---

## Hypothesis test

* **null hypothesis** `\(H_0: \beta_1 = 0\)` 
* **alternative hypothesis** `\(H_A: \beta_1 \ne 0\)`

---

## Logistic regression test statistic

<br><br>

.center[
`\(\Huge z = \frac{\hat\beta_1}{SE_{\hat\beta_1}}\)`
]
---

## Logistic regression test statistic

.question[
How is this different from the test statistic for _linear regression_?
]
.center[
`\(\Huge z = \frac{\hat\beta_1}{SE_{\hat\beta_1}}\)`
]
---

## Logistic regression test statistic

.question[
How is this different from the test statistic for _linear regression_?
]
.center[
`\(\Huge \color{red}z = \frac{\hat\beta_1}{SE_{\hat\beta_1}}\)`
]

--

* The `\(z\)` denotes that this is a `\(z\)`-statistic

--
* What does this mean? **Instead of using a `\(t\)` distribution, we use a normal distribution to calculate the confidence intervals and p-values**
---

## Logistic regression confidence interval

.question[
What do you think goes in this blank to calculate a confidence interval (instead of `\(t^*\)` as it was for _linear regression_)?
]
.center[
`\(\Huge \hat\beta_1 \pm [\_^*] SE_{\hat\beta_1}\)`
]
---

## Logistic regression confidence interval

.question[
What do you think goes in this blank to calculate a confidence interval (instead of `\(t^*\)` as it was for _linear regression_)?
]
.center[
`\(\Huge \hat\beta_1 \pm [\color{red}z^*] SE_{\hat\beta_1}\)`
]

--

* `\(z^∗\)` is found using the normal distribution and the desired level of confidence

--
.small[

```r
qnorm(0.975)
```

```
## [1] 1.96
```
]

---

## Logistic regression confidence interval

.question[
Where are my degrees of freedom when calculating `\(z^*\)`?
]
.center[
`\(\Huge \hat\beta_1 \pm [\color{red}z^*] SE_{\hat\beta_1}\)`
]

* `\(z^∗\)` is found using the normal distribution and the desired level of confidence
.small[

```r
qnorm(0.975)
```

```
## [1] 1.96
```
]

--
* The normal distribution doesn't need to know your sample size _but it does rely on reasonably large sample_

---

## Logistic regression confidence interval

* `\(\hat\beta_1\)` measures the change in log(odds) for every unit change in the predictor. What if I wanted a confidence interval for the **odds ratio**?

.center[
`\(\Huge \hat\beta_1 \pm [\color{red}z^*] SE_{\hat\beta_1}\)`
]

---

## Logistic regression confidence interval

.question[
How do you convert log(odds) to odds?
]

* `\(\hat\beta_1\)` measures the change in log(odds) for every unit change in the predictor. What if I wanted a confidence interval for the **odds ratio**?

.center[
`\(\Huge \hat\beta_1 \pm [\color{red}z^*] SE_{\hat\beta_1}\)`
]

---

## Logistic regression confidence interval

.question[
How do you convert log(odds) to odds?
]

* `\(\hat\beta_1\)` measures the change in log(odds) for every unit change in the predictor. What if I wanted a confidence interval for the **odds ratio**?

.center[
`\(\Huge e^{\hat\beta_1 \pm [\color{red}z^*] SE_{\hat\beta_1}}\)`
]
---

## Let's try it in R!

.question[
We are interested in the relationship between Backpack weight and Back problems.
]
.small[

```r
data("Backpack")
glm(BackProblems ~ BackpackWeight, data = Backpack, family = "binomial") %>%
* tidy(exponentiate = TRUE, conf.int = TRUE)
```

```
## # A tibble: 2 x 7
##   term           estimate std.error statistic p.value conf.low conf.high
##   <chr>             <dbl>     <dbl>     <dbl>   <dbl>    <dbl>     <dbl>
## 1 (Intercept)       0.281    0.496      -2.56  0.0105    0.102     0.725
## 2 BackpackWeight    1.04     0.0370      1.18  0.239     0.971     1.13
```
]

--

* How do you interpret the Odds ratio?

--
  * A one unit increase in backpack weight yields a 1.04-fold increase in the odds of back problems

---

## Let's try it in R!

.small[

```r
data("Backpack")
glm(BackProblems ~ BackpackWeight, data = Backpack, family = "binomial") %>%
* tidy(exponentiate = TRUE, conf.int = TRUE)
```

```
## # A tibble: 2 x 7
##   term           estimate std.error statistic p.value conf.low conf.high
##   <chr>             <dbl>     <dbl>     <dbl>   <dbl>    <dbl>     <dbl>
## 1 (Intercept)       0.281    0.496      -2.56  0.0105    0.102     0.725
## 2 BackpackWeight    1.04     0.0370      1.18  0.239     0.971     1.13
```
]

* How do you interpret the Odds ratio?
  * A one unit increase in backpack weight yields a 1.04-fold increase in the odds of back problems
* What is my null hypothesis?

--
  * `\(H_0:\beta_1 = 0\)`
  * `\(H_A: \beta_1 \neq 0\)`

--
* What is the result of this hypothesis test at the `\(\alpha = 0.05\)` level?

---

# <i class="fas fa-laptop"></i> `Putt Length`

- Go to RStudio Cloud and open `Putt Length`

---

## Log Likelihood

* "goodness of fit" measure
* **higher** log likelihood is better
* Both AIC and BIC are calculated using the **log likelihood**
  * `\(\Large f(k) - 2 \log\mathcal{L}\)`

---

## Log Likelihood

* "goodness of fit" measure
* **higher** log likelihood is better
* Both AIC and BIC are calculated using the **log likelihood**
  * `\(\Large f(k) \color{red}{- 2 \log\mathcal{L}}\)`

--
* `\(\color{red}{- 2 \log\mathcal{L}}\)` - this is called the **deviance**

--
* Similar to the _nested F-test_ in linear regression, in logistic regression we can compare `\(-2\log\mathcal{L}\)` for models with and without certain predictors

--
* `\(-2\log\mathcal{L}\)` follows a `\(\chi^2\)` distribution with `\(n - k - 1\)` degrees of freedom.

--
* The difference `\((-2\log\mathcal{L}_1)-(-2\log\mathcal{L}_2)\)` follows a `\(\chi^2\)` distribution with `\(p\)` degrees of freedom (where `\(p\)` is the difference in the number of predictors between Model 1 and Model 2)

---

## Likelihood ratio test

* For example, if we wanted to test the following hypothesis:
  * `\(H_0: \beta_1 = 0\)`
  * `\(H_A: \beta_1 \neq 0\)`

--
* We could compute the difference between the deviance for a model with `\(\beta_1\)` and without `\(\beta_1\)`.
  * Model 1: `\(log(odds) \approx \beta_0\)`
  * Model 2: `\(log(odds) \approx \beta_0 + \beta_1x\)`
  
---

## Likelihood ratio test

.question[
Are these models nested?
]

* For example, if we wanted to test the following hypothesis:
  * `\(H_0: \beta_1 = 0\)`
  * `\(H_A: \beta_1 \neq 0\)`
* We could compute the difference between the deviance for a model with `\(\beta_1\)` and without `\(\beta_1\)`.
  * Model 1: `\(log(odds) \approx \beta_0\)`
  * Model 2: `\(log(odds) \approx \beta_0 + \beta_1x\)`

---

## Likelihood ratio test

.question[
What are the degrees of freedom for the deviance for Model 1?
]
* For example, if we wanted to test the following hypothesis:
  * `\(H_0: \beta_1 = 0\)`
  * `\(H_A: \beta_1 \neq 0\)`
* We could compute the difference between the deviance for a model with `\(\beta_1\)` and without `\(\beta_1\)`.
  * Model 1: `\(log(odds) \approx \beta_0\)` 
  * Model 2: `\(log(odds) \approx \beta_0 + \beta_1x\)` 
  
---
## Likelihood ratio test

.question[
What are the degrees of freedom for the deviance for Model 1?
]
* For example, if we wanted to test the following hypothesis:
  * `\(H_0: \beta_1 = 0\)`
  * `\(H_A: \beta_1 \neq 0\)`
* We could compute the difference between the deviance for a model with `\(\beta_1\)` and without `\(\beta_1\)`.
  * Model 1: `\(log(odds) \approx \beta_0\)` ➡️ `\(-2\log\mathcal{L}_1\)`, df = `\(n-1\)`
  * Model 2: `\(log(odds) \approx \beta_0 + \beta_1x\)`

---

## Likelihood ratio test

.question[
What are the degrees of freedom for the deviance for Model 2?
]
* For example, if we wanted to test the following hypothesis:
  * `\(H_0: \beta_1 = 0\)`
  * `\(H_A: \beta_1 \neq 0\)`
* We could compute the difference between the deviance for a model with `\(\beta_1\)` and without `\(\beta_1\)`.
  * Model 1: `\(log(odds) \approx \beta_0\)` ➡️ `\(-2\log\mathcal{L}_1\)`, df = `\(n-1\)`
  * Model 2: `\(log(odds) \approx \beta_0 + \beta_1x\)` 
  
---
## Likelihood ratio test

.question[
What are the degrees of freedom for the deviance for Model 2?
]
* For example, if we wanted to test the following hypothesis:
  * `\(H_0: \beta_1 = 0\)`
  * `\(H_A: \beta_1 \neq 0\)`
* We could compute the difference between the deviance for a model with `\(\beta_1\)` and without `\(\beta_1\)`.
  * Model 1: `\(log(odds) \approx \beta_0\)` ➡️ `\(-2\log\mathcal{L}_1\)`, df = `\(n-1\)`
  * Model 2: `\(log(odds) \approx \beta_0 + \beta_1x\)` ➡️ `\(-2\log\mathcal{L}_2\)`, df = `\(n-2\)`

---

## Likelihood ratio test

* We are interested in the "drop in deviance", the deviance in Model 1 minus the deviance in Model 2

--

.center[
`\(\Large(-2\log\mathcal{L}_1) - (-2\log\mathcal{L}_2)\)`
]

---

## Likelihood ratio test

.question[
What do you think the degrees of freedom are for this difference?
]

* We are interested in the "drop in deviance", the deviance in Model 1 minus the deviance in Model 2

.center[
`\(\Large(-2\log\mathcal{L}_1) - (-2\log\mathcal{L}_2)\)`
]

--

* df: `\((n-1) - (n-2) = 1\)`

---

## Likelihood ratio test

.question[
What is the null hypothesis again?
]

* We are interested in the "drop in deviance", the deviance in Model 1 minus the deviance in Model 2

.center[
`\(\Large(-2\log\mathcal{L}_1) - (-2\log\mathcal{L}_2)\)` 👈 test statistic
]

* df: `\((n-1) - (n-2) = 1\)`

---

## Likelihood ratio test

.question[
How do you think we compute a p-value for this test?
]

* We are interested in the "drop in deviance", the deviance in Model 1 minus the deviance in Model 2

.center[
`\(\Large(-2\log\mathcal{L}_1) - (-2\log\mathcal{L}_2)\)` 👈 test statistic
]

* df: `\((n-1) - (n-2) = 1\)`

--

```r
pchisq(L_0 - L, df = 1, lower.tail = FALSE)
```

---

## Let's try it in R!

.small[

```r
data(MedGPA)
glm(Acceptance ~ GPA, data = MedGPA, family = "binomial") %>%
* glance()
```

```
## # A tibble: 1 x 7
##   null.deviance df.null logLik   AIC   BIC deviance df.residual
##           <dbl>   <int>  <dbl> <dbl> <dbl>    <dbl>       <int>
## 1          75.8      54  -28.4  60.8  64.9     56.8          53
```
]

---

## Let's try it in R!

.question[
What is the "drop in deviance"?
]

.small[

```r
data(MedGPA)
glm(Acceptance ~ GPA, data = MedGPA, family = "binomial") %>%
  glance()
```

```
## # A tibble: 1 x 7
##   null.deviance df.null logLik   AIC   BIC deviance df.residual
##           <dbl>   <int>  <dbl> <dbl> <dbl>    <dbl>       <int>
## 1          75.8      54  -28.4  60.8  64.9     56.8          53
```
]

--

*  75.8 - 56.8 = 19

---

## Let's try it in R!

.question[
What are the degrees of freedom for this difference?
]

.small[

```r
data(MedGPA)
glm(Acceptance ~ GPA, data = MedGPA, family = "binomial") %>%
  glance()
```

```
## # A tibble: 1 x 7
##   null.deviance df.null logLik   AIC   BIC deviance df.residual
##           <dbl>   <int>  <dbl> <dbl> <dbl>    <dbl>       <int>
## 1          75.8      54  -28.4  60.8  64.9     56.8          53
```
]

*  75.8 - 56.8 = 19

--
* df: 1

---

## Let's try it in R!

.question[
What is the result of the hypothesis test? How do you interpret this?
]

.small[

```r
data(MedGPA)
glm(Acceptance ~ GPA, data = MedGPA, family = "binomial") %>%
  glance()
```

```
## # A tibble: 1 x 7
##   null.deviance df.null logLik   AIC   BIC deviance df.residual
##           <dbl>   <int>  <dbl> <dbl> <dbl>    <dbl>       <int>
## 1          75.8      54  -28.4  60.8  64.9     56.8          53
```
]

*  75.8 - 56.8 = 19
* df: 1

```r
pchisq(19, 1, lower.tail = FALSE)
```

```
## [1] 1.31e-05
```

---

# <i class="fas fa-laptop"></i> `Putt Length`

- Go to RStudio Cloud and open `Putt Length`
- Fit a logistic regression predicting whether the shot was `Made` from the `Length` of the Putt
- Calculate the "drop in deviance" for comparing the model with and without `Length`
- Calculate the p-value for this Likelihood ratio test
- Interpret this result

Notes for current slide

Notes for next slide

Logistic regression inference1 / 33

by Dr. Lucy D'Agostino McGowan

 Putt LengthGo to RStudio Cloud and open Putt Length
2 / 33

by Dr. Lucy D'Agostino McGowan

InferenceEven if your check of conditions convinces you that the Bernoulli (spinner) model is not appropriate, you can still use logistic regression for description, and sometimes for prediction
3 / 33

by Dr. Lucy D'Agostino McGowan

InferenceEven if your check of conditions convinces you that the Bernoulli (spinner) model is not appropriate, you can still use logistic regression for description, and sometimes for prediction
If the outcomes are random and independent, you can also do inference! test hypotheses
construct confidence intervals

3 / 33

by Dr. Lucy D'Agostino McGowan

Hypothesis testnull hypothesis H0:β1=0H0:β1=0 
alternative hypothesis HA:β1≠0HA:β1≠0
4 / 33

Logistic regression test statistic

$z = \frac{{\hat{β}}_{1}}{S E_{{\hat{β}}_{1}}}$

5 / 33

Logistic regression test statistic

How is this different from the test statistic for linear regression?

$z = \frac{{\hat{β}}_{1}}{S E_{{\hat{β}}_{1}}}$

6 / 33

Logistic regression test statistic

How is this different from the test statistic for linear regression?

$z = \frac{{\hat{β}}_{1}}{S E_{{\hat{β}}_{1}}}$

7 / 33

Logistic regression test statistic

How is this different from the test statistic for linear regression?

$z = \frac{{\hat{β}}_{1}}{S E_{{\hat{β}}_{1}}}$

The $z$ denotes that this is a $z$ -statistic

7 / 33

Logistic regression test statistic

How is this different from the test statistic for linear regression?

$z = \frac{{\hat{β}}_{1}}{S E_{{\hat{β}}_{1}}}$

The $z$ denotes that this is a $z$ -statistic
What does this mean? Instead of using a $t$ distribution, we use a normal distribution to calculate the confidence intervals and p-values

7 / 33

Logistic regression confidence interval

What do you think goes in this blank to calculate a confidence interval (instead of $t^{*}$ as it was for linear regression)?

${\hat{β}}_{1} \pm [_^{*}] S E_{{\hat{β}}_{1}}$

8 / 33

Logistic regression confidence interval

What do you think goes in this blank to calculate a confidence interval (instead of $t^{*}$ as it was for linear regression)?

${\hat{β}}_{1} \pm [z^{*}] S E_{{\hat{β}}_{1}}$

9 / 33

Logistic regression confidence interval

What do you think goes in this blank to calculate a confidence interval (instead of $t^{*}$ as it was for linear regression)?

${\hat{β}}_{1} \pm [z^{*}] S E_{{\hat{β}}_{1}}$

$z^{*}$ is found using the normal distribution and the desired level of confidence

9 / 33

Logistic regression confidence interval

What do you think goes in this blank to calculate a confidence interval (instead of $t^{*}$ as it was for linear regression)?

${\hat{β}}_{1} \pm [z^{*}] S E_{{\hat{β}}_{1}}$

$z^{*}$ is found using the normal distribution and the desired level of confidence
```
qnorm(0.975)
```
```
## [1] 1.96
```

9 / 33

Logistic regression confidence interval

Where are my degrees of freedom when calculating $z^{*}$ ?

${\hat{β}}_{1} \pm [z^{*}] S E_{{\hat{β}}_{1}}$

$z^{*}$ is found using the normal distribution and the desired level of confidence
```
qnorm(0.975)
```
```
## [1] 1.96
```

10 / 33

Logistic regression confidence interval

Where are my degrees of freedom when calculating $z^{*}$ ?

${\hat{β}}_{1} \pm [z^{*}] S E_{{\hat{β}}_{1}}$

$z^{*}$ is found using the normal distribution and the desired level of confidence
```
qnorm(0.975)
```
```
## [1] 1.96
```
The normal distribution doesn't need to know your sample size but it does rely on reasonably large sample

10 / 33

Logistic regression confidence interval

${\hat{β}}_{1}$ measures the change in log(odds) for every unit change in the predictor. What if I wanted a confidence interval for the odds ratio?

${\hat{β}}_{1} \pm [z^{*}] S E_{{\hat{β}}_{1}}$

11 / 33

Logistic regression confidence interval

How do you convert log(odds) to odds?

${\hat{β}}_{1}$ measures the change in log(odds) for every unit change in the predictor. What if I wanted a confidence interval for the odds ratio?

${\hat{β}}_{1} \pm [z^{*}] S E_{{\hat{β}}_{1}}$

12 / 33

Logistic regression confidence interval

How do you convert log(odds) to odds?

${\hat{β}}_{1}$ measures the change in log(odds) for every unit change in the predictor. What if I wanted a confidence interval for the odds ratio?

$e^{{\hat{β}}_{1} \pm [z^{*}] S E_{{\hat{β}}_{1}}}$

13 / 33

Let's try it in R!

We are interested in the relationship between Backpack weight and Back problems.

data("Backpack")
glm(BackProblems ~ BackpackWeight, data = Backpack, family = "binomial") %>%
  tidy(exponentiate = TRUE, conf.int = TRUE)

## # A tibble: 2 x 7
##   term           estimate std.error statistic p.value conf.low conf.high
##   <chr>             <dbl>     <dbl>     <dbl>   <dbl>    <dbl>     <dbl>
## 1 (Intercept)       0.281    0.496      -2.56  0.0105    0.102     0.725
## 2 BackpackWeight    1.04     0.0370      1.18  0.239     0.971     1.13

14 / 33

Let's try it in R!

We are interested in the relationship between Backpack weight and Back problems.

data("Backpack")
glm(BackProblems ~ BackpackWeight, data = Backpack, family = "binomial") %>%
  tidy(exponentiate = TRUE, conf.int = TRUE)

## # A tibble: 2 x 7
##   term           estimate std.error statistic p.value conf.low conf.high
##   <chr>             <dbl>     <dbl>     <dbl>   <dbl>    <dbl>     <dbl>
## 1 (Intercept)       0.281    0.496      -2.56  0.0105    0.102     0.725
## 2 BackpackWeight    1.04     0.0370      1.18  0.239     0.971     1.13

How do you interpret the Odds ratio?

14 / 33

Let's try it in R!

We are interested in the relationship between Backpack weight and Back problems.

data("Backpack")
glm(BackProblems ~ BackpackWeight, data = Backpack, family = "binomial") %>%
  tidy(exponentiate = TRUE, conf.int = TRUE)

## # A tibble: 2 x 7
##   term           estimate std.error statistic p.value conf.low conf.high
##   <chr>             <dbl>     <dbl>     <dbl>   <dbl>    <dbl>     <dbl>
## 1 (Intercept)       0.281    0.496      -2.56  0.0105    0.102     0.725
## 2 BackpackWeight    1.04     0.0370      1.18  0.239     0.971     1.13

How do you interpret the Odds ratio?
- A one unit increase in backpack weight yields a 1.04-fold increase in the odds of back problems

14 / 33

by Dr. Lucy D'Agostino McGowan

Let's try it in R!data("Backpack")
glm(BackProblems ~ BackpackWeight, data = Backpack, family = "binomial") %>%
  tidy(exponentiate = TRUE, conf.int = TRUE)

## # A tibble: 2 x 7
##   term           estimate std.error statistic p.value conf.low conf.high
##   <chr>             <dbl>     <dbl>     <dbl>   <dbl>    <dbl>     <dbl>
## 1 (Intercept)       0.281    0.496      -2.56  0.0105    0.102     0.725
## 2 BackpackWeight    1.04     0.0370      1.18  0.239     0.971     1.13
How do you interpret the Odds ratio?A one unit increase in backpack weight yields a 1.04-fold increase in the odds of back problems

What is my null hypothesis?
15 / 33

by Dr. Lucy D'Agostino McGowan

Let's try it in R!data("Backpack")
glm(BackProblems ~ BackpackWeight, data = Backpack, family = "binomial") %>%
  tidy(exponentiate = TRUE, conf.int = TRUE)

## # A tibble: 2 x 7
##   term           estimate std.error statistic p.value conf.low conf.high
##   <chr>             <dbl>     <dbl>     <dbl>   <dbl>    <dbl>     <dbl>
## 1 (Intercept)       0.281    0.496      -2.56  0.0105    0.102     0.725
## 2 BackpackWeight    1.04     0.0370      1.18  0.239     0.971     1.13
How do you interpret the Odds ratio?A one unit increase in backpack weight yields a 1.04-fold increase in the odds of back problems

What is my null hypothesis?H0:β1=0H0:β1=0
HA:β1≠0HA:β1≠0

15 / 33

by Dr. Lucy D'Agostino McGowan

Let's try it in R!data("Backpack")
glm(BackProblems ~ BackpackWeight, data = Backpack, family = "binomial") %>%
  tidy(exponentiate = TRUE, conf.int = TRUE)

## # A tibble: 2 x 7
##   term           estimate std.error statistic p.value conf.low conf.high
##   <chr>             <dbl>     <dbl>     <dbl>   <dbl>    <dbl>     <dbl>
## 1 (Intercept)       0.281    0.496      -2.56  0.0105    0.102     0.725
## 2 BackpackWeight    1.04     0.0370      1.18  0.239     0.971     1.13
How do you interpret the Odds ratio?A one unit increase in backpack weight yields a 1.04-fold increase in the odds of back problems

What is my null hypothesis?H0:β1=0H0:β1=0
HA:β1≠0HA:β1≠0

What is the result of this hypothesis test at the α=0.05α=0.05 level?
15 / 33

by Dr. Lucy D'Agostino McGowan

 Putt LengthGo to RStudio Cloud and open Putt Length
16 / 33

by Dr. Lucy D'Agostino McGowan

Log Likelihood"goodness of fit" measure
higher log likelihood is better
Both AIC and BIC are calculated using the log likelihoodf(k)−2logLf(k)−2log⁡L

17 / 33

by Dr. Lucy D'Agostino McGowan

Log Likelihood"goodness of fit" measure
higher log likelihood is better
Both AIC and BIC are calculated using the log likelihoodf(k)−2logLf(k)−2log⁡L

18 / 33

by Dr. Lucy D'Agostino McGowan

Log Likelihood"goodness of fit" measure
higher log likelihood is better
Both AIC and BIC are calculated using the log likelihoodf(k)−2logLf(k)−2log⁡L

−2logL−2log⁡L - this is called the deviance
18 / 33

by Dr. Lucy D'Agostino McGowan

Log Likelihood"goodness of fit" measure
higher log likelihood is better
Both AIC and BIC are calculated using the log likelihoodf(k)−2logLf(k)−2log⁡L

−2logL−2log⁡L - this is called the deviance
Similar to the nested F-test in linear regression, in logistic regression we can compare −2logL−2log⁡L for models with and without certain predictors
18 / 33

by Dr. Lucy D'Agostino McGowan

Log Likelihood"goodness of fit" measure
higher log likelihood is better
Both AIC and BIC are calculated using the log likelihoodf(k)−2logLf(k)−2log⁡L

−2logL−2log⁡L - this is called the deviance
Similar to the nested F-test in linear regression, in logistic regression we can compare −2logL−2log⁡L for models with and without certain predictors
−2logL−2log⁡L follows a χ2χ2 distribution with n−k−1n−k−1 degrees of freedom.
18 / 33

by Dr. Lucy D'Agostino McGowan

Log Likelihood"goodness of fit" measure
higher log likelihood is better
Both AIC and BIC are calculated using the log likelihoodf(k)−2logLf(k)−2log⁡L

−2logL−2log⁡L - this is called the deviance
Similar to the nested F-test in linear regression, in logistic regression we can compare −2logL−2log⁡L for models with and without certain predictors
−2logL−2log⁡L follows a χ2χ2 distribution with n−k−1n−k−1 degrees of freedom.
The difference (−2logL1)−(−2logL2)(−2log⁡L1)−(−2log⁡L2) follows a χ2χ2 distribution with pp degrees of freedom (where pp is the difference in the number of predictors between Model 1 and Model 2)
18 / 33

by Dr. Lucy D'Agostino McGowan

Likelihood ratio testFor example, if we wanted to test the following hypothesis:H0:β1=0H0:β1=0
HA:β1≠0HA:β1≠0

19 / 33

by Dr. Lucy D'Agostino McGowan

Likelihood ratio testFor example, if we wanted to test the following hypothesis:H0:β1=0H0:β1=0
HA:β1≠0HA:β1≠0

We could compute the difference between the deviance for a model with β1β1 and without β1β1.Model 1: log(odds)≈β0log(odds)≈β0
Model 2: log(odds)≈β0+β1xlog(odds)≈β0+β1x

19 / 33

Likelihood ratio test

Are these models nested?

For example, if we wanted to test the following hypothesis:
- $H_{0} : β_{1} = 0$
- $H_{A} : β_{1} \neq 0$
We could compute the difference between the deviance for a model with $β_{1}$ and without $β_{1}$ .
- Model 1: $l o g (o d d s) \approx β_{0}$
- Model 2: $l o g (o d d s) \approx β_{0} + β_{1} x$

20 / 33

Likelihood ratio test

What are the degrees of freedom for the deviance for Model 1?

For example, if we wanted to test the following hypothesis:
- $H_{0} : β_{1} = 0$
- $H_{A} : β_{1} \neq 0$
We could compute the difference between the deviance for a model with $β_{1}$ and without $β_{1}$ .
- Model 1: $l o g (o d d s) \approx β_{0}$
- Model 2: $l o g (o d d s) \approx β_{0} + β_{1} x$

21 / 33

Likelihood ratio test

What are the degrees of freedom for the deviance for Model 1?

For example, if we wanted to test the following hypothesis:
- $H_{0} : β_{1} = 0$
- $H_{A} : β_{1} \neq 0$
We could compute the difference between the deviance for a model with $β_{1}$ and without $β_{1}$ .
- Model 1: $l o g (o d d s) \approx β_{0}$ ➡️ $- 2 \log L_{1}$ , df = $n - 1$
- Model 2: $l o g (o d d s) \approx β_{0} + β_{1} x$

22 / 33

Likelihood ratio test

What are the degrees of freedom for the deviance for Model 2?

For example, if we wanted to test the following hypothesis:
- $H_{0} : β_{1} = 0$
- $H_{A} : β_{1} \neq 0$
We could compute the difference between the deviance for a model with $β_{1}$ and without $β_{1}$ .
- Model 1: $l o g (o d d s) \approx β_{0}$ ➡️ $- 2 \log L_{1}$ , df = $n - 1$
- Model 2: $l o g (o d d s) \approx β_{0} + β_{1} x$

23 / 33

Likelihood ratio test

What are the degrees of freedom for the deviance for Model 2?

For example, if we wanted to test the following hypothesis:
- $H_{0} : β_{1} = 0$
- $H_{A} : β_{1} \neq 0$
We could compute the difference between the deviance for a model with $β_{1}$ and without $β_{1}$ .
- Model 1: $l o g (o d d s) \approx β_{0}$ ➡️ $- 2 \log L_{1}$ , df = $n - 1$
- Model 2: $l o g (o d d s) \approx β_{0} + β_{1} x$ ➡️ $- 2 \log L_{2}$ , df = $n - 2$

24 / 33

by Dr. Lucy D'Agostino McGowan

Likelihood ratio testWe are interested in the "drop in deviance", the deviance in Model 1 minus the deviance in Model 2
25 / 33

Likelihood ratio test

We are interested in the "drop in deviance", the deviance in Model 1 minus the deviance in Model 2

$(- 2 \log L_{1}) - (- 2 \log L_{2})$

25 / 33

Likelihood ratio test

What do you think the degrees of freedom are for this difference?

We are interested in the "drop in deviance", the deviance in Model 1 minus the deviance in Model 2

$(- 2 \log L_{1}) - (- 2 \log L_{2})$

26 / 33

Likelihood ratio test

What do you think the degrees of freedom are for this difference?

We are interested in the "drop in deviance", the deviance in Model 1 minus the deviance in Model 2

$(- 2 \log L_{1}) - (- 2 \log L_{2})$

df: $(n - 1) - (n - 2) = 1$

26 / 33

Likelihood ratio test

What is the null hypothesis again?

We are interested in the "drop in deviance", the deviance in Model 1 minus the deviance in Model 2

$(- 2 \log L_{1}) - (- 2 \log L_{2})$ 👈 test statistic

df: $(n - 1) - (n - 2) = 1$

27 / 33

Likelihood ratio test

How do you think we compute a p-value for this test?

We are interested in the "drop in deviance", the deviance in Model 1 minus the deviance in Model 2

$(- 2 \log L_{1}) - (- 2 \log L_{2})$ 👈 test statistic

df: $(n - 1) - (n - 2) = 1$

28 / 33

Likelihood ratio test

How do you think we compute a p-value for this test?

We are interested in the "drop in deviance", the deviance in Model 1 minus the deviance in Model 2

$(- 2 \log L_{1}) - (- 2 \log L_{2})$ 👈 test statistic

df: $(n - 1) - (n - 2) = 1$

pchisq(L_0 - L, df = 1, lower.tail = FALSE)

28 / 33

by Dr. Lucy D'Agostino McGowan

Let's try it in R!data(MedGPA)
glm(Acceptance ~ GPA, data = MedGPA, family = "binomial") %>%
  glance()

## # A tibble: 1 x 7
##   null.deviance df.null logLik   AIC   BIC deviance df.residual
##           <dbl>   <int>  <dbl> <dbl> <dbl>    <dbl>       <int>
## 1          75.8      54  -28.4  60.8  64.9     56.8          53
29 / 33

Let's try it in R!

What is the "drop in deviance"?

data(MedGPA)
glm(Acceptance ~ GPA, data = MedGPA, family = "binomial") %>%
  glance()

## # A tibble: 1 x 7
##   null.deviance df.null logLik   AIC   BIC deviance df.residual
##           <dbl>   <int>  <dbl> <dbl> <dbl>    <dbl>       <int>
## 1          75.8      54  -28.4  60.8  64.9     56.8          53

30 / 33

Let's try it in R!

What is the "drop in deviance"?

data(MedGPA)
glm(Acceptance ~ GPA, data = MedGPA, family = "binomial") %>%
  glance()

## # A tibble: 1 x 7
##   null.deviance df.null logLik   AIC   BIC deviance df.residual
##           <dbl>   <int>  <dbl> <dbl> <dbl>    <dbl>       <int>
## 1          75.8      54  -28.4  60.8  64.9     56.8          53

75.8 - 56.8 = 19

30 / 33

Let's try it in R!

What are the degrees of freedom for this difference?

data(MedGPA)
glm(Acceptance ~ GPA, data = MedGPA, family = "binomial") %>%
  glance()

## # A tibble: 1 x 7
##   null.deviance df.null logLik   AIC   BIC deviance df.residual
##           <dbl>   <int>  <dbl> <dbl> <dbl>    <dbl>       <int>
## 1          75.8      54  -28.4  60.8  64.9     56.8          53

75.8 - 56.8 = 19

31 / 33

Let's try it in R!

What are the degrees of freedom for this difference?

data(MedGPA)
glm(Acceptance ~ GPA, data = MedGPA, family = "binomial") %>%
  glance()

## # A tibble: 1 x 7
##   null.deviance df.null logLik   AIC   BIC deviance df.residual
##           <dbl>   <int>  <dbl> <dbl> <dbl>    <dbl>       <int>
## 1          75.8      54  -28.4  60.8  64.9     56.8          53

75.8 - 56.8 = 19
df: 1

31 / 33

Let's try it in R!

What is the result of the hypothesis test? How do you interpret this?

data(MedGPA)
glm(Acceptance ~ GPA, data = MedGPA, family = "binomial") %>%
  glance()

## # A tibble: 1 x 7
##   null.deviance df.null logLik   AIC   BIC deviance df.residual
##           <dbl>   <int>  <dbl> <dbl> <dbl>    <dbl>       <int>
## 1          75.8      54  -28.4  60.8  64.9     56.8          53

75.8 - 56.8 = 19
df: 1

pchisq(19, 1, lower.tail = FALSE)

## [1] 1.31e-05

32 / 33

by Dr. Lucy D'Agostino McGowan

 Putt LengthGo to RStudio Cloud and open Putt Length
Fit a logistic regression predicting whether the shot was Made from the Length of the Putt
Calculate the "drop in deviance" for comparing the model with and without Length
Calculate the p-value for this Likelihood ratio test
Interpret this result
33 / 33

by Dr. Lucy D'Agostino McGowan

 Putt LengthGo to RStudio Cloud and open Putt Length
2 / 33

Paused

Help

Keyboard shortcuts

↑, ←, Pg Up, k	Go to previous slide
↓, →, Pg Dn, Space, j	Go to next slide
Home	Go to first slide
End	Go to last slide
Number + Return	Go to specific slide
b / m / f	Toggle blackout / mirrored / fullscreen mode
c	Clone slideshow
p	Toggle presenter mode
t	Restart the presentation timer
?, h	Toggle this help

Esc	Back to slideshow