class: center, middle, inverse, title-slide

# Model Comparisons

---
layout: true

<div class="my-footer">
<span>
by Dr. Lucy D'Agostino McGowan
</span>
</div>

---

# <i class="fas fa-laptop"></i> `First Year GPA`

- Go to RStudio Cloud and open `First Year GPA`

---

## 🛠toolkit for comparing models

--

### 👉 F-test

--

### 👉 `\(\Large R^2\)`

---

## 🛠F-test for Multiple Linear Regression

* Comparing the full model to the intercept-only model

--

* `\(\Large H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0\)`

--

* `\(\Large H_A: \textrm{at least one } \beta_i \neq 0\)`

---

## 🛠F-test for Multiple Linear Regression

* `\(\Large F = \frac{MSModel}{MSE}\)`

--

* df for the Model?

--

* k

--

* df for the errors?

--

* n - k - 1

---

## 🛠Nested F-test for Multiple Linear Regression

* What does "nested" mean?
  * You have a "small" model and a "large" model, where the "small" model is completely contained in the "large" model

--

* The F-test we have learned so far is one example of this, comparing:
  * `\(y = \beta_0 + \epsilon\)` (**small**)
  * `\(y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \epsilon\)` (**large**)

--

* The full (**large**) model has `\(k\)` predictors, the reduced (**small**) model has `\(k - p\)` predictors

---

## 🛠Nested F-test for Multiple Linear Regression

* The full (**large**) model has `\(k\)` predictors, the reduced (**small**) model has `\(k - p\)` predictors

--

* What is `\(H_0\)`?

--

* `\(H_0:\)` `\(\beta_i = 0\)` for all `\(p\)` predictors being dropped from the full model

--

* What is `\(H_A\)`?

--

* `\(H_A:\)` `\(\beta_i \neq 0\)` for at least one of the `\(p\)` predictors dropped from the full model

--

* Does the full model do a (statistically significantly) better job of explaining the variability in the response than the reduced model?

---

## 🛠Nested F-test for Multiple Linear Regression

* The full (**large**) model has `\(k\)` predictors, the reduced (**small**) model has `\(k - p\)` predictors

* `\(F = \frac{(SSModel_{Full} - SSModel_{Reduced})/p}{SSE_{Full}/(n-k-1)}\)`

---

## 🛠Nested F-test for Multiple Linear Regression

* Which of these are nested models?

(1) `\(y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon\)`

(2) `\(y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 * x_2 + \epsilon\)`

(3) `\(y = \beta_0 + \beta_1 x_3 + \epsilon\)`

(4) `\(y = \beta_0 + \beta_2 x_2 + \epsilon\)`

(5) `\(y = \beta_0 + \beta_1 x_4 + \epsilon\)`

--

* (4) is nested in (1), which is nested in (2)

---

## 🛠Nested F-test for Multiple Linear Regression

* Comparing these two models, what is `\(p\)`?

(1) `\(y = \beta_0 + \beta_2 x_2 + \epsilon\)`

(2) `\(y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 * x_2 + \epsilon\)`

--

* `\(p = 2\)`

--

* What is `\(k\)`?

--

* `\(k = 3\)`

---

## 🛠Nested F-test for Multiple Linear Regression

* Goal: Trying to predict the weight of fish based on their length and width

.small[

```r
data("Perch")

model1 <- lm(
  Weight ~ Length + Width + Length * Width,
  data = Perch
)

model2 <- lm(
  Weight ~ Length + Width + I(Length ^ 2) + I(Width ^ 2) + Length * Width,
  data = Perch
)
```
]

--

* What is the equation for `model1`?

--

* What is the equation for `model2`?
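--

* As a reference, one way to write them out, numbering the coefficients in the order the terms enter each model (this numbering matches the `\(H_0\)` on the next slide):
  * `model1`: `\(Weight = \beta_0 + \beta_1 Length + \beta_2 Width + \beta_3 Length \times Width + \epsilon\)`
  * `model2`: `\(Weight = \beta_0 + \beta_1 Length + \beta_2 Width + \beta_3 Length^2 + \beta_4 Width^2 + \beta_5 Length \times Width + \epsilon\)`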
---

## 🛠Nested F-test for Multiple Linear Regression

.small[

```r
data("Perch")

model1 <- lm(
  Weight ~ Length + Width + Length * Width,
  data = Perch
)

model2 <- lm(
  Weight ~ Length + Width + I(Length ^ 2) + I(Width ^ 2) + Length * Width,
  data = Perch
)
```
]

--

* If we want to do a _nested F-test_, what is `\(H_0\)`?

--

* `\(H_0: \beta_3 = \beta_4 = 0\)`

--

* What is `\(H_A\)`?

--

* `\(H_A: \beta_3 \neq 0\)` or `\(\beta_4 \neq 0\)`

--

* What are the degrees of freedom of this test? (n = 56)

--

* 2, 50

---

## 🛠Nested F-test for Multiple Linear Regression

.small[

```r
anova(model1)
```

```
## Analysis of Variance Table
## 
## Response: Weight
##              Df  Sum Sq Mean Sq F value  Pr(>F)
## Length        1 6118739 6118739  3126.6 < 2e-16
## Width         1  110593  110593    56.5 7.4e-10
## Length:Width  1  314997  314997   161.0 < 2e-16
## Residuals    52  101765    1957
```
]

.small[

```r
(SSModel1 <- 6118739 + 110593 + 314997)
```

```
## [1] 6544329
```
]

---

## 🛠Nested F-test for Multiple Linear Regression

.small[

```r
anova(model2)
```

```
## Analysis of Variance Table
## 
## Response: Weight
##              Df  Sum Sq Mean Sq F value  Pr(>F)
## Length        1 6118739 6118739 3289.64 < 2e-16
## Width         1  110593  110593   59.46 4.7e-10
## I(Length^2)   1  314899  314899  169.30 < 2e-16
## I(Width^2)    1    5381    5381    2.89   0.095
## Length:Width  1    3482    3482    1.87   0.177
## Residuals    50   93000    1860
```
]

.small[

```r
(SSModel1 <- 6118739 + 110593 + 314997)
```

```
## [1] 6544329
```

```r
(SSModel2 <- 6118739 + 110593 + 314899 + 5381 + 3482)
```

```
## [1] 6553094
```
]

---

## 🛠Nested F-test for Multiple Linear Regression

* `\(F = \frac{(SSModel_{Full} - SSModel_{Reduced})/p}{SSE_{Full}/(n-k-1)}\)`

--

* `\(SSModel_{Full} - SSModel_{Reduced}\)`:

```r
SSModel2 - SSModel1
```

```
## [1] 8765
```

--

* What is `\(p\)`?

---

## 🛠Nested F-test for Multiple Linear Regression

* `\(F = \frac{(SSModel_{Full} - SSModel_{Reduced})/p}{SSE_{Full}/(n-k-1)}\)`

--

* `\((SSModel_{Full} - SSModel_{Reduced})/p\)`:

```r
(SSModel2 - SSModel1) / 2
```

```
## [1] 4382
```

---

## 🛠Nested F-test for Multiple Linear Regression

* `\(F = \frac{(SSModel_{Full} - SSModel_{Reduced})/p}{SSE_{Full}/(n-k-1)}\)`

* `\(SSE_{Full}/(n-k-1)\)`

.small[

```r
anova(model2)
```

```
## Analysis of Variance Table
## 
## Response: Weight
##              Df  Sum Sq Mean Sq F value  Pr(>F)
## Length        1 6118739 6118739 3289.64 < 2e-16
## Width         1  110593  110593   59.46 4.7e-10
## I(Length^2)   1  314899  314899  169.30 < 2e-16
## I(Width^2)    1    5381    5381    2.89   0.095
## Length:Width  1    3482    3482    1.87   0.177
*## Residuals    50   93000    1860
```
]

---

## 🛠Nested F-test for Multiple Linear Regression

* `\(F = \frac{(SSModel_{Full} - SSModel_{Reduced})/p}{SSE_{Full}/(n-k-1)}\)`

```r
((SSModel2 - SSModel1) / 2) / 1860
```

```
## [1] 2.36
```

--

* What are the degrees of freedom for this test?

--

* 2, 50

--

```r
pf(2.356183, 2, 50, lower.tail = FALSE)
```

```
## [1] 0.105
```

---

## 🛠Nested F-test for Multiple Linear Regression

_An easier way_

.small[

```r
anova(model1, model2)
```

```
## Analysis of Variance Table
## 
## Model 1: Weight ~ Length + Width + Length * Width
## Model 2: Weight ~ Length + Width + I(Length^2) + I(Width^2) + Length *
##     Width
##   Res.Df    RSS Df Sum of Sq    F Pr(>F)
## 1     52 101765                         
*## 2     50  93000  2      8765 2.36   0.11
```
]

---

## 🛠`\(R^2\)` for Multiple Linear Regression

--

* `\(\Large R^2 = \frac{SSModel}{SSTotal}\)`

--

* `\(\Large R^2 = 1 - \frac{SSE}{SSTotal}\)`

--

* As is, `\(R^2\)` can never decrease when you add a predictor (and it will almost always increase). Therefore, we have `\(R^2_{adj}\)`, which adds a small "penalty" for each predictor

--

* `\(\Large R^2_{adj} = 1 - \frac{SSE/(n-k-1)}{SSTotal / (n-1)}\)`

--

* `\(\Large \frac{SSTotal}{n-1} = \frac{\sum(y - \bar{y})^2}{n-1}\)` What is this?

--

* Sample variance! `\(S_Y^2\)`

--

* `\(\Large R^2_{adj} = 1 - \frac{\hat\sigma^2_\epsilon}{S_Y^2}\)`
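---

## 🛠`\(R^2_{adj}\)` for Multiple Linear Regression

As a quick check, a minimal sketch computing `\(R^2_{adj}\)` for `model1` by hand (the sums of squares below are copied from the `anova(model1)` output a few slides back):

```r
n <- 56  # number of perch
k <- 3   # predictors in model1: Length, Width, Length:Width
SSE <- 101765                # Residuals Sum Sq from anova(model1)
SSTotal <- 6544329 + 101765  # SSModel1 + SSE

# penalize SSE by the error df, relative to the sample variance of Weight
1 - (SSE / (n - k - 1)) / (SSTotal / (n - 1))
```

--

* This should come out to about 0.984, matching the `adj.r.squared` column from `glance(model1)` coming up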
---

## 🛠`\(R^2_{adj}\)` for Multiple Linear Regression

* `\(\Large R^2_{adj} = 1 - \frac{SSE/(n-k-1)}{SSTotal / (n-1)}\)`

* The denominator stays the same for all models fit to the same response variable and data

* The numerator can actually increase when a new predictor is added to a model, if the decrease in the SSE is not sufficient to offset the decrease in the error degrees of freedom

* So `\(R^2_{adj}\)` can 👇 when a weak predictor is added to a model

---

## 🛠`\(R^2_{adj}\)` for Multiple Linear Regression

.small[

```r
glance(model1)
```

```
## # A tibble: 1 x 11
##   r.squared adj.r.squared sigma statistic  p.value    df logLik   AIC   BIC
##       <dbl>         <dbl> <dbl>     <dbl>    <dbl> <int>  <dbl> <dbl> <dbl>
## 1     0.985         0.984  44.2     1115. 3.75e-47     4  -290.  589.  599.
## # … with 2 more variables: deviance <dbl>, df.residual <int>
```

```r
glance(model2)
```

```
## # A tibble: 1 x 11
##   r.squared adj.r.squared sigma statistic  p.value    df logLik   AIC   BIC
##       <dbl>         <dbl> <dbl>     <dbl>    <dbl> <int>  <dbl> <dbl> <dbl>
## 1     0.986         0.985  43.1      705. 4.41e-45     6  -287.  588.  602.
## # … with 2 more variables: deviance <dbl>, df.residual <int>
```
]

--

* So far we know what the first 6 columns are

---

## Model Comparison criteria

* We are looking for reasonable ways to balance "goodness of fit" (how well the model fits the data) with "parsimony"

--

* `\(R^2_{adj}\)` gets at this by adding a **penalty** for adding variables

--

* AIC and BIC are two more methods that balance goodness of fit and parsimony

---

## Log Likelihood

* Both AIC and BIC are calculated using the **log likelihood**

`\(\Large\log(\mathcal{L}) = -\frac{n}{2}\left[\log(2\pi) + \log(SSE/n) + 1\right]\)`

--

* `\(\log = \log_e\)`, `log()` in `R`

--

.small[

```r
glance(model1)
```

```
## # A tibble: 1 x 11
##   r.squared adj.r.squared sigma statistic  p.value    df logLik   AIC   BIC
##       <dbl>         <dbl> <dbl>     <dbl>    <dbl> <int>  <dbl> <dbl> <dbl>
## 1     0.985         0.984  44.2     1115. 3.75e-47     4  -290.  589.  599.
## # … with 2 more variables: deviance <dbl>, df.residual <int>
```

```r
-56 / 2 * (log(2 * pi) + log(101765 / 56) + 1)
```

```
## [1] -290
```
]

--

* "goodness of fit" measure

* **higher** log likelihood is better

---

## Log Likelihood

**What I want you to remember**

* Both AIC and BIC are calculated using the **log likelihood**

`\(\Large\log(\mathcal{L}) = -\frac{n}{2}\left[\log(SSE/n)\right] + \textrm{some constant}\)`

* `\(\log = \log_e\)`, `log()` in `R`

* "goodness of fit" measure

* **higher** log likelihood is better

---

## AIC

* Akaike's Information Criterion

* `\(AIC = 2(k+1) - 2\log(\mathcal{L})\)`

* `\(k\)` is the number of predictors in the model

* **lower** AIC values are better

--

.small[

```r
glance(model1)
```

```
## # A tibble: 1 x 11
##   r.squared adj.r.squared sigma statistic  p.value    df logLik   AIC   BIC
##       <dbl>         <dbl> <dbl>     <dbl>    <dbl> <int>  <dbl> <dbl> <dbl>
## 1     0.985         0.984  44.2     1115. 3.75e-47     4  -290.  589.  599.
## # … with 2 more variables: deviance <dbl>, df.residual <int>
```

```r
glance(model2)
```

```
## # A tibble: 1 x 11
##   r.squared adj.r.squared sigma statistic  p.value    df logLik   AIC   BIC
##       <dbl>         <dbl> <dbl>     <dbl>    <dbl> <int>  <dbl> <dbl> <dbl>
## 1     0.986         0.985  43.1      705. 4.41e-45     6  -287.  588.  602.
## # … with 2 more variables: deviance <dbl>, df.residual <int>
```
]
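---

## AIC

A minimal sketch checking the `AIC` column by hand. Note that R's `AIC()` (which `glance()` reports) also counts the estimated error variance as a parameter, so its penalty uses `\(k + 2\)` rather than the `\(k + 1\)` in the formula on the previous slide:

```r
k <- 3                            # predictors in model1
ll <- as.numeric(logLik(model1))  # log likelihood, about -290

2 * (k + 2) - 2 * ll
```

--

* This should be about 589, matching the `AIC` column from `glance(model1)`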
---

## BIC

* Bayesian Information Criterion

* `\(BIC = \log(n)(k+1) - 2\log(\mathcal{L})\)`

* `\(k\)` is the number of predictors in the model

* **lower** BIC values are better

--

.small[

```r
glance(model1)
```

```
## # A tibble: 1 x 11
##   r.squared adj.r.squared sigma statistic  p.value    df logLik   AIC   BIC
##       <dbl>         <dbl> <dbl>     <dbl>    <dbl> <int>  <dbl> <dbl> <dbl>
## 1     0.985         0.984  44.2     1115. 3.75e-47     4  -290.  589.  599.
## # … with 2 more variables: deviance <dbl>, df.residual <int>
```

```r
glance(model2)
```

```
## # A tibble: 1 x 11
##   r.squared adj.r.squared sigma statistic  p.value    df logLik   AIC   BIC
##       <dbl>         <dbl> <dbl>     <dbl>    <dbl> <int>  <dbl> <dbl> <dbl>
## 1     0.986         0.985  43.1      705. 4.41e-45     6  -287.  588.  602.
## # … with 2 more variables: deviance <dbl>, df.residual <int>
```
]

---

## AIC and BIC can disagree!

.small[

```r
glance(model1)
```

```
## # A tibble: 1 x 11
##   r.squared adj.r.squared sigma statistic  p.value    df logLik   AIC   BIC
##       <dbl>         <dbl> <dbl>     <dbl>    <dbl> <int>  <dbl> <dbl> <dbl>
## 1     0.985         0.984  44.2     1115. 3.75e-47     4  -290.  589.  599.
## # … with 2 more variables: deviance <dbl>, df.residual <int>
```

```r
glance(model2)
```

```
## # A tibble: 1 x 11
##   r.squared adj.r.squared sigma statistic  p.value    df logLik   AIC   BIC
##       <dbl>         <dbl> <dbl>     <dbl>    <dbl> <int>  <dbl> <dbl> <dbl>
## 1     0.986         0.985  43.1      705. 4.41e-45     6  -287.  588.  602.
## # … with 2 more variables: deviance <dbl>, df.residual <int>
```
]

* The penalty term is larger in BIC than in AIC (here, `\(\log(56) \approx 4 > 2\)`)

--

* What to do? Both are valid; **pre-specify** which you are going to use in the **methods** section of your analysis, before running your models

---

## 🛠toolkit for comparing models

### 👉 F-test

### 👉 `\(\Large R^2\)`

### 👉 `\(AIC\)`

### 👉 `\(BIC\)`

---

# <i class="fas fa-laptop"></i> `First Year GPA`

- Go to RStudio Cloud and open `First Year GPA`

---
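## AIC and BIC, directly

If you just want the criteria without the rest of the `glance()` output, base R's `AIC()` and `BIC()` accept several fitted models at once; a minimal sketch:

```r
AIC(model1, model2)  # data frame with df and AIC for each model
BIC(model1, model2)  # data frame with df and BIC for each model
```

* The values should match the `AIC` and `BIC` columns from `glance()` above; as before, **lower** is better for both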