Logistic regression inference

1 / 33

Putt Length

  • Go to RStudio Cloud and open Putt Length
2 / 33

Inference

  • Even if your check of conditions convinces you that the Bernoulli (spinner) model is not appropriate, you can still use logistic regression for description, and sometimes for prediction
  • If the outcomes are random and independent, you can also do inference!
    • test hypotheses
    • construct confidence intervals
3 / 33

Hypothesis test

  • null hypothesis $H_0: \beta_1 = 0$
  • alternative hypothesis $H_A: \beta_1 \neq 0$
4 / 33

Logistic regression test statistic

How is this different from the test statistic for linear regression?

$$z = \frac{\hat{\beta}_1}{SE_{\hat{\beta}_1}}$$

  • The z denotes that this is a z-statistic
  • What does this mean? Instead of using a t distribution, we use a normal distribution to calculate the confidence intervals and p-values
7 / 33
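
To see where this statistic comes from in R, here is a minimal sketch that rebuilds it by hand; the model and data (mtcars, which is not part of this deck) are stand-ins:

fit <- glm(vs ~ wt, data = mtcars, family = "binomial")  # stand-in model

est <- coef(summary(fit))["wt", "Estimate"]    # beta-hat_1
se  <- coef(summary(fit))["wt", "Std. Error"]  # SE of beta-hat_1
est / se                                       # matches the "z value" that summary() reports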

Logistic regression confidence interval

What do you think goes in this blank to calculate a confidence interval (instead of t as it was for linear regression)?

$$\hat{\beta}_1 \pm [z] \cdot SE_{\hat{\beta}_1}$$

  • z is found using the normal distribution and the desired level of confidence
    qnorm(0.975)
    ## [1] 1.96
9 / 33

Logistic regression confidence interval

Where are my degrees of freedom when calculating z?

$$\hat{\beta}_1 \pm [z] \cdot SE_{\hat{\beta}_1}$$

  • z is found using the normal distribution and the desired level of confidence
    qnorm(0.975)
    ## [1] 1.96
  • The normal distribution doesn't need to know your sample size, but it does rely on a reasonably large sample
10 / 33
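
To connect the formula, the qnorm() call, and a fitted model, here is a minimal sketch of the interval built by hand; mtcars is a stand-in dataset, not from this deck. (Note that tidy(conf.int = TRUE) on a glm typically returns profile-likelihood intervals via confint(), so it will be close to, but not identical to, this Wald interval.)

fit <- glm(vs ~ wt, data = mtcars, family = "binomial")  # stand-in model

est <- coef(summary(fit))["wt", "Estimate"]    # beta-hat_1
se  <- coef(summary(fit))["wt", "Std. Error"]  # SE of beta-hat_1
z   <- qnorm(0.975)                            # no degrees of freedom needed

est + c(-1, 1) * z * se  # 95% confidence interval for beta_1 (log-odds scale)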

Logistic regression confidence interval

How do you convert log(odds) to odds?

  • $\hat{\beta}_1$ measures the change in log(odds) for every unit change in the predictor. What if I wanted a confidence interval for the odds ratio?

$$e^{\hat{\beta}_1 \pm [z] \cdot SE_{\hat{\beta}_1}}$$

13 / 33
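
Because exp() is an increasing function, exponentiating both endpoints of the log-odds interval yields an interval for the odds ratio. A short sketch, continuing the hypothetical mtcars stand-in from above:

fit <- glm(vs ~ wt, data = mtcars, family = "binomial")  # stand-in model
est <- coef(summary(fit))["wt", "Estimate"]
se  <- coef(summary(fit))["wt", "Std. Error"]

exp(est + c(-1, 1) * qnorm(0.975) * se)  # 95% CI for the odds ratio e^beta_1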

Let's try it in R!

We are interested in the relationship between Backpack weight and Back problems.

data("Backpack")
glm(BackProblems ~ BackpackWeight, data = Backpack, family = "binomial") %>%
tidy(exponentiate = TRUE, conf.int = TRUE)
## # A tibble: 2 x 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 0.281 0.496 -2.56 0.0105 0.102 0.725
## 2 BackpackWeight 1.04 0.0370 1.18 0.239 0.971 1.13
  • How do you interpret the Odds ratio?
    • A one unit increase in backpack weight yields a 1.04-fold increase in the odds of back problems
14 / 33

Let's try it in R!

data("Backpack")
glm(BackProblems ~ BackpackWeight, data = Backpack, family = "binomial") %>%
tidy(exponentiate = TRUE, conf.int = TRUE)
## # A tibble: 2 x 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 0.281 0.496 -2.56 0.0105 0.102 0.725
## 2 BackpackWeight 1.04 0.0370 1.18 0.239 0.971 1.13
  • How do you interpret the Odds ratio?
    • A one unit increase in backpack weight yields a 1.04-fold increase in the odds of back problems
  • What is my null hypothesis?
    • H0:β1=0
    • HA:β10
  • What is the result of this hypothesis test at the α=0.05 level?
15 / 33
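
As a check on the statistic and p.value columns above, a short sketch that rebuilds them by hand; note the test lives on the log-odds scale even when the estimate is exponentiated:

fit <- glm(BackProblems ~ BackpackWeight, data = Backpack, family = "binomial")

z <- coef(summary(fit))["BackpackWeight", "z value"]  # ~1.18, the statistic column
2 * pnorm(abs(z), lower.tail = FALSE)                 # ~0.239, the two-sided p-value

Since 0.239 > 0.05, we fail to reject $H_0$ at the $\alpha = 0.05$ level.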

Putt Length

  • Go to RStudio Cloud and open Putt Length
16 / 33

Log Likelihood

  • "goodness of fit" measure
  • higher log likelihood is better
  • Both AIC and BIC are calculated using the log likelihood
    • $f(k) - 2\log L$
  • $-2\log L$ is called the deviance
  • Similar to the nested F-test in linear regression, in logistic regression we can compare $-2\log L$ for models with and without certain predictors
  • $-2\log L$ follows a $\chi^2$ distribution with $n - k - 1$ degrees of freedom.
  • The difference $(-2\log L_1) - (-2\log L_2)$ follows a $\chi^2$ distribution with $p$ degrees of freedom (where $p$ is the difference in the number of predictors between Model 1 and Model 2)
18 / 33
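
To make the $f(k) - 2\log L$ connection concrete: AIC uses $f(k) = 2k$ and BIC uses $f(k) = k\log(n)$, where $k$ counts the estimated parameters. A quick sketch on the hypothetical mtcars stand-in:

fit <- glm(vs ~ wt, data = mtcars, family = "binomial")  # stand-in model
ll  <- as.numeric(logLik(fit))  # log likelihood
k   <- 2                        # parameters estimated: intercept and slope
n   <- nrow(mtcars)

-2 * ll               # the deviance (equals deviance(fit) for 0/1 outcomes)
-2 * ll + 2 * k       # AIC: f(k) = 2k; matches AIC(fit)
-2 * ll + log(n) * k  # BIC: f(k) = k log(n); matches BIC(fit)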

Likelihood ratio test

  • For example, if we wanted to test the following hypothesis:
    • $H_0: \beta_1 = 0$
    • $H_A: \beta_1 \neq 0$
  • We could compute the difference between the deviance for a model with $\beta_1$ and without $\beta_1$.
    • Model 1: $\log(\text{odds}) = \beta_0$
    • Model 2: $\log(\text{odds}) = \beta_0 + \beta_1 x$
19 / 33

Likelihood ratio test

Are these models nested?

  • For example, if we wanted to test the following hypothesis:
    • $H_0: \beta_1 = 0$
    • $H_A: \beta_1 \neq 0$
  • We could compute the difference between the deviance for a model with $\beta_1$ and without $\beta_1$.
    • Model 1: $\log(\text{odds}) = \beta_0$
    • Model 2: $\log(\text{odds}) = \beta_0 + \beta_1 x$
20 / 33

Likelihood ratio test

What are the degrees of freedom for the deviance for Model 1?

  • For example, if we wanted to test the following hypothesis:
    • $H_0: \beta_1 = 0$
    • $H_A: \beta_1 \neq 0$
  • We could compute the difference between the deviance for a model with $\beta_1$ and without $\beta_1$.
    • Model 1: $\log(\text{odds}) = \beta_0$ ➡️ $-2\log L_1$, df = $n - 1$
    • Model 2: $\log(\text{odds}) = \beta_0 + \beta_1 x$
22 / 33

Likelihood ratio test

What are the degrees of freedom for the deviance for Model 2?

  • For example, if we wanted to test the following hypothesis:
    • $H_0: \beta_1 = 0$
    • $H_A: \beta_1 \neq 0$
  • We could compute the difference between the deviance for a model with $\beta_1$ and without $\beta_1$.
    • Model 1: $\log(\text{odds}) = \beta_0$ ➡️ $-2\log L_1$, df = $n - 1$
    • Model 2: $\log(\text{odds}) = \beta_0 + \beta_1 x$ ➡️ $-2\log L_2$, df = $n - 2$
24 / 33

Likelihood ratio test

  • We are interested in the "drop in deviance", the deviance in Model 1 minus the deviance in Model 2

$$(-2\log L_1) - (-2\log L_2)$$

25 / 33

Likelihood ratio test

What do you think the degrees of freedom are for this difference?

  • We are interested in the "drop in deviance", the deviance in Model 1 minus the deviance in Model 2

$$(-2\log L_1) - (-2\log L_2)$$

  • df: $(n-1) - (n-2) = 1$
26 / 33

Likelihood ratio test

What is the null hypothesis again?

  • We are interested in the "drop in deviance", the deviance in Model 1 minus the deviance in Model 2

$(-2\log L_1) - (-2\log L_2)$ 👈 test statistic

  • df: $(n-1) - (n-2) = 1$
27 / 33

Likelihood ratio test

How do you think we compute a p-value for this test?

  • We are interested in the "drop in deviance", the deviance in Model 1 minus the deviance in Model 2

$(-2\log L_1) - (-2\log L_2)$ 👈 test statistic

  • df: $(n-1) - (n-2) = 1$
# L_0 and L stand for the deviances (-2 log L) without and with the predictor
pchisq(L_0 - L, df = 1, lower.tail = FALSE)
28 / 33
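
For a concrete version of the whole test, a minimal sketch on the mtcars stand-in that fits the two nested models, takes the drop in deviance, and gets the p-value; anova() with test = "Chisq" runs the same comparison in one call:

m1 <- glm(vs ~ 1,  data = mtcars, family = "binomial")  # Model 1: intercept only
m2 <- glm(vs ~ wt, data = mtcars, family = "binomial")  # Model 2: adds the predictor

drop <- deviance(m1) - deviance(m2)       # drop in deviance
pchisq(drop, df = 1, lower.tail = FALSE)  # likelihood ratio test p-value

anova(m1, m2, test = "Chisq")             # the same test in one call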

Let's try it in R!

data(MedGPA)
glm(Acceptance ~ GPA, data = MedGPA, family = "binomial") %>%
  glance()
## # A tibble: 1 x 7
##   null.deviance df.null logLik   AIC   BIC deviance df.residual
##           <dbl>   <int>  <dbl> <dbl> <dbl>    <dbl>       <int>
## 1          75.8      54  -28.4  60.8  64.9     56.8          53
29 / 33

Let's try it in R!

What is the "drop in deviance"?

data(MedGPA)
glm(Acceptance ~ GPA, data = MedGPA, family = "binomial") %>%
  glance()
## # A tibble: 1 x 7
##   null.deviance df.null logLik   AIC   BIC deviance df.residual
##           <dbl>   <int>  <dbl> <dbl> <dbl>    <dbl>       <int>
## 1          75.8      54  -28.4  60.8  64.9     56.8          53
  • 75.8 - 56.8 = 19
30 / 33

Let's try it in R!

What are the degrees of freedom for this difference?

data(MedGPA)
glm(Acceptance ~ GPA, data = MedGPA, family = "binomial") %>%
  glance()
## # A tibble: 1 x 7
##   null.deviance df.null logLik   AIC   BIC deviance df.residual
##           <dbl>   <int>  <dbl> <dbl> <dbl>    <dbl>       <int>
## 1          75.8      54  -28.4  60.8  64.9     56.8          53
  • 75.8 - 56.8 = 19
  • df: 1
31 / 33

Let's try it in R!

What is the result of the hypothesis test? How do you interpret this?

data(MedGPA)
glm(Acceptance ~ GPA, data = MedGPA, family = "binomial") %>%
  glance()
## # A tibble: 1 x 7
##   null.deviance df.null logLik   AIC   BIC deviance df.residual
##           <dbl>   <int>  <dbl> <dbl> <dbl>    <dbl>       <int>
## 1          75.8      54  -28.4  60.8  64.9     56.8          53
  • 75.8 - 56.8 = 19
  • df: 1
pchisq(19, 1, lower.tail = FALSE)
## [1] 1.31e-05
32 / 33
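
The same numbers can be pulled out of glance() programmatically instead of typed by hand; a small sketch:

g <- glm(Acceptance ~ GPA, data = MedGPA, family = "binomial") %>%
  glance()

drop <- g$null.deviance - g$deviance  # 75.8 - 56.8 = 19
df   <- g$df.null - g$df.residual     # 54 - 53 = 1
pchisq(drop, df = df, lower.tail = FALSE)

Either way, the p-value of about 1.3e-05 is far below 0.05, so the model with GPA fits significantly better than the intercept-only model.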

Putt Length

  • Go to RStudio Cloud and open Putt Length
  • Fit a logistic regression predicting whether the shot was Made from the Length of the Putt
  • Calculate the "drop in deviance" for comparing the model with and without Length
  • Calculate the p-value for this Likelihood ratio test
  • Interpret this result
33 / 33
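
A sketch of what those steps might look like in code, assuming the project's data resembles the Putts1 dataset from Stat2Data (an assumption; the actual RStudio Cloud project may differ):

library(Stat2Data)  # assumed source of a putting dataset
library(dplyr)
library(broom)

data("Putts1")  # hypothetical stand-in: binary Made, numeric Length
fit <- glm(Made ~ Length, data = Putts1, family = "binomial")

g <- glance(fit)
drop <- g$null.deviance - g$deviance      # drop in deviance with vs. without Length
pchisq(drop, df = 1, lower.tail = FALSE)  # likelihood ratio test p-value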
