$$y - \bar{y} = (\hat{y} - \bar{y}) + (y - \hat{y})$$
$$\sum (y - \bar{y})^2 = \sum (\hat{y} - \bar{y})^2 + \sum (y - \hat{y})^2$$
$$F = \frac{MSModel}{MSE}$$
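For reference, the pieces of the F statistic can be written out together. The labels SSTotal, SSModel, and SSE, and the degrees of freedom (1 for the model and n - 2 for the residuals in simple linear regression with n observations), follow the usual convention. They are assumed here, since the slides only name MSModel and MSE.

```latex
\begin{aligned}
\underbrace{\sum (y - \bar{y})^2}_{SSTotal}
  &= \underbrace{\sum (\hat{y} - \bar{y})^2}_{SSModel}
   + \underbrace{\sum (y - \hat{y})^2}_{SSE} \\
MSModel &= \frac{SSModel}{1}, \qquad MSE = \frac{SSE}{n - 2} \\
F &= \frac{MSModel}{MSE}
\end{aligned}
```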
Under the null hypothesis
We can see all of these pieces using the anova() function:
lm(Weight ~ WingLength, data = Sparrows) %>% anova()
## Analysis of Variance Table
## 
## Response: Weight
##             Df Sum Sq Mean Sq F value    Pr(>F)
## WingLength   1 355.05  355.05  181.25 < 2.2e-16
## Residuals  114 223.31    1.96
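As a rough check, the F statistic can be rebuilt by hand from the sums of squares and degrees of freedom printed in the table. This sketch uses only the rounded values shown above, so the results agree with R's output only up to rounding. The variable names are illustrative and not from the slides.

```r
# Values copied from the anova() table (rounded as printed)
ss_model <- 355.05   # Sum Sq for WingLength
ss_error <- 223.31   # Sum Sq for Residuals
df_model <- 1        # Df for WingLength
df_error <- 114      # Df for Residuals

# Mean squares are sums of squares divided by their degrees of freedom
ms_model <- ss_model / df_model
ms_error <- ss_error / df_error

# F statistic: ratio of the two mean squares
f_stat <- ms_model / ms_error

# Two related quantities built from the same pieces
r_squared <- ss_model / (ss_model + ss_error)  # share of total variation explained
sigma_hat <- sqrt(ms_error)                    # residual standard error

print(c(F = f_stat, r.squared = r_squared, sigma = sigma_hat))
```

The three values should line up with the `F value` column above and with the `r.squared` and `sigma` columns that glance() reports for the same model.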
lm(Weight ~ WingLength, data = Sparrows) %>% glance()
## # A tibble: 1 x 11
##   r.squared adj.r.squared sigma statistic  p.value    df logLik   AIC   BIC
##       <dbl>         <dbl> <dbl>     <dbl>    <dbl> <int>  <dbl> <dbl> <dbl>
## 1     0.614         0.611  1.40      181. 2.62e-25     2  -203.  411.  419.
## # … with 2 more variables: deviance <dbl>, df.residual <int>
The probability of getting a statistic as extreme or more extreme than the observed test statistic given the null hypothesis is true
To calculate the p-value under the t-distribution we used pt(). What do you think we use to calculate the p-value under the F-distribution?
We use pf(). It takes the arguments q, df1, and df2. What do you think df1 and df2 are?
pf(181.2535, 1, 114, lower.tail = FALSE)
## [1] 2.621946e-25
Why don't we multiply this p-value by 2 when we use pf()?
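One way to explore this question numerically: the F statistic is a squared quantity, so large positive and large negative t values both land in the upper tail of the F-distribution. The sketch below compares the upper-tail pf() p-value with a doubled upper-tail pt() p-value, taking the t statistic as the square root of the F statistic. The variable names are illustrative.

```r
f_stat   <- 181.2535   # F statistic used in the pf() call above
df_error <- 114        # residual degrees of freedom

# With one model degree of freedom, the matching t statistic is sqrt(F)
t_stat <- sqrt(f_stat)

# Upper tail only for F: both tails of t are already folded into it
p_from_f <- pf(f_stat, df1 = 1, df2 = df_error, lower.tail = FALSE)

# Two-sided t test: double the upper-tail probability
p_from_t <- 2 * pt(t_stat, df = df_error, lower.tail = FALSE)

print(c(p_from_f = p_from_f, p_from_t = p_from_t))
```

The two p-values should agree, which suggests the doubling is already built into the single upper tail of the F-distribution.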
lm(Weight ~ WingLength, data = Sparrows) %>% glance()
## # A tibble: 1 x 11
##   r.squared adj.r.squared sigma statistic  p.value    df logLik   AIC   BIC
##       <dbl>         <dbl> <dbl>     <dbl>    <dbl> <int>  <dbl> <dbl> <dbl>
## 1     0.614         0.611  1.40      181. 2.62e-25     2  -203.  411.  419.
## # … with 2 more variables: deviance <dbl>, df.residual <int>
lm(Weight ~ WingLength, data = Sparrows) %>% tidy()
## # A tibble: 2 x 5
##   term        estimate std.error statistic  p.value
##   <chr>          <dbl>     <dbl>     <dbl>    <dbl>
## 1 (Intercept)    1.37     0.957       1.43 1.56e- 1
## 2 WingLength     0.467    0.0347     13.5  2.62e-25
What is the F-test testing?
How are the test statistics related between the F and the t?
13.5^2
## [1] 182.25
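The value 182.25 is close to, but not exactly, the F statistic of 181.25. A plausible reason is that 13.5 is the t statistic rounded for display. The sketch below works backwards from the F statistic used in the pf() call to show how much the rounding matters. The variable names are illustrative.

```r
f_stat <- 181.2535   # F statistic used in the pf() call

# With one predictor, F = t^2, so the unrounded t statistic is sqrt(F)
t_unrounded <- sqrt(f_stat)
print(t_unrounded)

# Rounded to one decimal place, it matches the displayed t statistic
print(round(t_unrounded, 1))

# Squaring the unrounded value recovers the F statistic
print(t_unrounded^2)
```

Squaring the rounded 13.5 overshoots by about one unit, while squaring the unrounded value returns the F statistic.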