Confounding and Variable Transformations

Adjusting for confounders

What is the relationship between average SAT scores and average teacher salaries?

First 5 of 50 states:

| state | expend | ratio | salary | frac | verbal | math | sat |
|---|---|---|---|---|---|---|---|
| Alabama | 4.405 | 17.2 | 31.144 | 8 | 491 | 538 | 1029 |
| Alaska | 8.963 | 17.6 | 47.951 | 47 | 445 | 489 | 934 |
| Arizona | 4.778 | 19.3 | 32.175 | 27 | 448 | 496 | 944 |
| Arkansas | 4.459 | 17.1 | 28.934 | 6 | 482 | 523 | 1005 |
| California | 4.992 | 24 | 41.078 | 45 | 417 | 485 | 902 |

Are we doing inference or prediction?

I fit a linear model for $\widehat{\text{sat}} = \hat{β}_{0} + \hat{β}_{1}\,\text{salary}$

## # A tibble: 2 x 5
##   term        estimate std.error statistic  p.value
##   <chr>          <dbl>     <dbl>     <dbl>    <dbl>
## 1 (Intercept)  1159.       57.7      20.1  5.13e-25
## 2 salary         -5.54      1.63     -3.39 1.39e- 3

How do we interpret this result?

There is a third variable: the fraction of students in each state who took the SAT. It is grouped as "Low", "Medium", and "High".

## # A tibble: 4 x 5
##   term          estimate std.error statistic  p.value
##   <chr>            <dbl>     <dbl>     <dbl>    <dbl>
## 1 (Intercept)     852.      38.9       21.9  5.56e-26
## 2 salary            1.09     0.988      1.10 2.76e- 1
## 3 frac_groupLOW   150.      12.8       11.7  2.09e-15
## 4 frac_groupMED    38.6     14.1        2.75 8.59e- 3

What is the referent category?
How do you interpret the $\hat{β}$ for frac_groupLOW?
How do you interpret the $\hat{β}$ for salary now?
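
R uses the first factor level as the referent category, and levels are sorted alphabetically by default, which is why there is no `frac_groupHIGH` row in the output above. A minimal base R sketch of that behaviour; the vector `frac_group` below is made-up illustration data, not the SAT data:

```r
# Hypothetical group labels (illustration only, not the SAT data)
frac_group <- factor(c("LOW", "MED", "HIGH", "LOW", "HIGH"))

# Levels are sorted alphabetically, so "HIGH" comes first and
# becomes the referent category when the factor is used in lm()
levels(frac_group)

# relevel() changes the referent category without altering the data
frac_group_low_ref <- relevel(frac_group, ref = "LOW")
levels(frac_group_low_ref)
```

With `HIGH` as the referent, the `frac_groupLOW` estimate compares Low states to High states at the same salary.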

$\hat{β}$ interpretation in multiple linear regression

The coefficient for $x$ is $\hat{β}$ (95% CI: $LB_{\hat{β}}, UB_{\hat{β}}$). A one-unit increase in $x$ yields an expected increase in $y$ of $\hat{β}$, holding all other variables constant.

The coefficient for average salary is 1.09 (95% CI: -0.90, 3.08). A one-unit increase in average salary yields an expected increase in average SAT score of 1.09, holding the fraction of students that took the SAT constant.
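
The interval quoted here can be reproduced from the estimate and standard error in the model output. This sketch assumes the usual t-based interval with 50 - 4 = 46 residual degrees of freedom (50 states, 4 estimated coefficients); the variable names are illustrative.

```r
estimate  <- 1.09    # salary coefficient from the adjusted model output
std_error <- 0.988   # its standard error
df_resid  <- 50 - 4  # assumed: 50 states minus 4 estimated coefficients

# 95% CI: estimate +/- t critical value * standard error
t_crit <- qt(0.975, df = df_resid)
ci <- estimate + c(-1, 1) * t_crit * std_error
round(ci, 2)
```

The interval contains 0, which is consistent with the p-value of 0.276 for salary in the adjusted model.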

Adjusting for confounders

What is it called when the direction of a relationship reverses like this?
Notice that the lines here are parallel, so this is the effect we see when holding the group constant.
😱 What if the lines aren't parallel?
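
A reversal like this is commonly called Simpson's paradox. The sketch below uses made-up data (not the SAT data), built so that the slope is exactly 1 within each group while the group intercepts fall as `x` rises; all names are illustrative.

```r
# Made-up data: three groups, each with slope 1 but a lower intercept
toy <- data.frame(
  x = 1:15,
  g = factor(rep(c("A", "B", "C"), each = 5))
)
toy$y <- toy$x + rep(c(100, 80, 60), each = 5)

# Ignoring the group, the fitted slope is negative
unadjusted <- lm(y ~ x, data = toy)
coef(unadjusted)["x"]

# Adjusting for the group recovers the within-group slope of 1
# (one common slope, so the three fitted lines are parallel)
adjusted <- lm(y ~ x + g, data = toy)
coef(adjusted)["x"]
```

The model `y ~ x + g` forces a single slope for every group, which is exactly the parallel-lines assumption mentioned above.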

Interactions

Data looking at the growth rate for kids

First 10 of 198 rows:

| Height | Weight | Age | Sex | Race |
|---|---|---|---|---|
| 67.8 | 166 | 210 | 0 | 1 |
| 63 | 93 | 144 | 1 | 0 |
| 50.1 | 54 | 119 | 0 | 0 |
| 55.7 | 69 | 130 | 1 | 0 |
| 63.2 | 115 | 157 | 0 | 0 |
| 48.8 | 52 | 102 | 0 | 0 |
| 63.8 | 108 | 198 | 1 | 0 |
| 61.3 | 89 | 155 | 0 | 0 |
| 61.1 | 118 | 199 | 1 | 0 |
| 54.7 | 80 | 134 | 0 | 0 |

Will $\hat{β}_{\text{age}}$ be positive or negative?

Let's look at this relationship split by sex (blue: Girl, black: Boy)

😱 The lines cross! That means there is an interaction; that is, the slopes differ by group.

What is the equation for this relationship?

$\text{Weight} = β_{0} + β_{1}\,\text{Age} + β_{2}\,\text{Girl} + β_{3}\,\text{Age} \times \text{Girl} + ϵ$

lm(Weight ~ Age + Sex + Age * Sex, data = Kids198)

## 
## Call:
## lm(formula = Weight ~ Age + Sex + Age * Sex, data = Kids198)
## 
## Coefficients:
## (Intercept)          Age          Sex      Age:Sex  
##    -33.6925       0.9087      31.8506      -0.2812

What does this model become for boys (when Sex = 0)?
- $\text{Weight} = β_{0} + β_{1}\,\text{Age} + ϵ$
What does this model become for girls (when Sex = 1)?
- $\text{Weight} = β_{0} + β_{1}\,\text{Age} + β_{2} \times 1 + β_{3}\,\text{Age} \times 1 + ϵ$
- $\text{Weight} = (β_{0} + β_{2}) + (β_{1} + β_{3})\,\text{Age} + ϵ$
How do you interpret $\hat{β}_{0}$ now?

by Dr. Lucy D'Agostino McGowan

How do you interpret $\hat{β}_{2}$ now?
- The difference in intercepts between boys and girls.
How do you interpret $\hat{β}_{3}$ now?
- How much the slope changes as we move from the regression line for boys to that for girls.
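
Plugging the fitted coefficients into the two equations gives one line per group. A small sketch using the estimates printed in the `lm()` output; the object names are illustrative, and it assumes `Sex = 1` codes girls, as the slides state.

```r
# Estimates copied from the lm() output
b0 <- -33.6925  # (Intercept)
b1 <-   0.9087  # Age
b2 <-  31.8506  # Sex
b3 <-  -0.2812  # Age:Sex

# Boys (Sex = 0): intercept b0, slope b1
boys <- c(intercept = b0, slope = b1)

# Girls (Sex = 1): intercept b0 + b2, slope b1 + b3
girls <- c(intercept = b0 + b2, slope = b1 + b3)

rbind(boys, girls)

# Age at which the two fitted lines cross: b2 + b3 * Age = 0
cross_age <- -b2 / b3
cross_age
```

The girls' line starts higher but rises more slowly, so the two fitted lines cross at an Age of roughly 113 (in whatever units Age is recorded in).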

$\text{Weight} = β_{0} + β_{1}\,\text{Age} + β_{2}\,\text{Girl} + β_{3}\,\text{Age} \times \text{Girl} + ϵ$

Hypothesis testing: What if you want to test whether the slope is different between groups?
Is the growth rate different for boys and girls?
What is $H_{0}$?
- $H_{0} : β_{3} = 0$
What is $H_{A}$?
- $H_{A} : β_{3} \neq 0$

lm(Weight ~ Age + Sex + Age * Sex, data = Kids198) %>%
  tidy(conf.int = TRUE)

## # A tibble: 4 x 7
##   term        estimate std.error statistic  p.value conf.low conf.high
##   <chr>          <dbl>     <dbl>     <dbl>    <dbl>    <dbl>     <dbl>
## 1 (Intercept)  -33.7     10.0        -3.37 9.17e- 4  -53.4     -14.0  
## 2 Age            0.909    0.0611     14.9  6.47e-34    0.788     1.03 
## 3 Sex           31.9     13.2         2.41 1.71e- 2    5.73     58.0  
## 4 Age:Sex       -0.281    0.0816     -3.44 7.00e- 4   -0.442    -0.120

What is the result of our hypothesis test?
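
The test of $H_{0} : β_{3} = 0$ is the `Age:Sex` row of the output. The sketch below recomputes the statistic and two-sided p-value from the rounded estimate and standard error, assuming 198 - 4 = 194 residual degrees of freedom (198 kids, 4 coefficients). Small differences from the printed values come from rounding.

```r
estimate  <- -0.281   # Age:Sex estimate (rounded, from the output)
std_error <-  0.0816  # its standard error
df_resid  <- 198 - 4  # assumed: 198 rows minus 4 coefficients

# t statistic and two-sided p-value for H0: beta_3 = 0
t_stat  <- estimate / std_error
p_value <- 2 * pt(-abs(t_stat), df = df_resid)

c(t_stat = t_stat, p_value = p_value)

# Reject H0 at the 0.05 level when the p-value is below 0.05
p_value < 0.05
```

The p-value is well below 0.05 and the confidence interval for `Age:Sex` excludes 0, so the data support different growth rates for boys and girls.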

$\hat{β}$ interpretation for interactions between $x$ and a binary indicator $I$

The coefficient for the interaction between $x$ and $I$ is $\hat{β}$ (95% CI: $LB_{\hat{β}}, UB_{\hat{β}}$). This means that the effect of $x$ on $y$ differs by $\hat{β}$ when $I = 1$ compared to $I = 0$, holding all other variables constant. (You must include this last phrase if there are additional variables in your model.)

The coefficient for the interaction between Age and Sex is -0.28 (95% CI: -0.44, -0.12). This means that the expected change in Weight for a one-unit increase in Age is 0.28 lower among girls than among boys.

Non-linear relationships

Sometimes the relationship between the outcome $y$ and an $x$ variable is non-linear.
We can use polynomials to address this!
Returning to the Diamonds data, let's say we are interested in predicting Total Price from the Carats.
Is this an example of inference or prediction?

lm(TotalPrice ~ Carat, data = Diamonds)

lm(TotalPrice ~ Carat + I(Carat^2), data = Diamonds)
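
The `I()` wrapper matters here: inside an R formula, `^` is a formula operator, so `Carat^2` without `I()` does not add a squared term. A sketch on simulated, noise-free data; the names `x` and `y` and the coefficients are made up for illustration and are not the Diamonds data.

```r
# Simulated data that follow an exact quadratic (no noise)
x <- seq(0.3, 3, length.out = 50)
y <- -500 + 2400 * x + 4500 * x^2

# Without I(), x^2 is read as formula syntax and collapses to x,
# so only an intercept and a linear term are fitted
fit_wrong <- lm(y ~ x + x^2)
names(coef(fit_wrong))

# With I(), the squared term is computed arithmetically
fit_quad <- lm(y ~ x + I(x^2))
coef(fit_quad)
```

Because the simulated data are exactly quadratic, `fit_quad` recovers the three coefficients used to generate `y`.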

What is the equation for this relationship?

Interpreting $\hat{β}$s in the presence of polynomials

$\text{TotalPrice} = β_{0} + β_{1}\,\text{Carat} + β_{2}\,\text{Carat}^{2} + ϵ$

What is the interpretation of $\hat{β}_{1}$?
Typically, in multiple linear regression, the interpretation of $\hat{β}_{i}$ is: a one-unit change in $x$ yields an expected change in $y$ of $\hat{β}_{i}$, holding all other variables constant.
- What does it mean to see a change in Carat holding $\text{Carat}^{2}$ constant?
When you have a polynomial term, you need to specify the values you are changing between, since the change is no longer constant across all values of $x$.

lm(TotalPrice ~ Carat + I(Carat^2), data = Diamonds) %>%
  tidy()

## # A tibble: 3 x 5
##   term        estimate std.error statistic  p.value
##   <chr>          <dbl>     <dbl>     <dbl>    <dbl>
## 1 (Intercept)    -523.      466.     -1.12 2.63e- 1
## 2 Carat          2386.      753.      3.17 1.66e- 3
## 3 I(Carat^2)     4498.      263.     17.1  5.09e-48

What is the expected change in TotalPrice for a one-unit change in Carat, changing from 0.8 to 1.8?

(-522.7 + 2386 * 1.8 + 4498.2 * 1.8^2) - 
  (-522.7 + 2386 * 0.8 + 4498.2 * 0.8^2)

## [1] 14081.32

2386 * (1.8 - 0.8) + 
  4498.2 * (1.8^2 - 0.8^2)

## [1] 14081.32

What is the expected change in TotalPrice for a one-unit change in Carat, changing from 1.8 to 2.8?

2386 * (2.8 - 1.8) + 4498.2 * (2.8^2 - 1.8^2)

## [1] 23077.72

Can we talk about ${\hat{β}}_{1}$ and ${\hat{β}}_{2}$ in the context of a one-unit change in Carat?

$\hat{β}$ coefficients for terms that are transformations of the same $x$ variable must be interpreted together.
You must first choose two values of $x$ to change between, and then report the change.
A sensible choice for the two $x$ values is the 25th percentile and the 75th percentile.

General $\hat{β}$ interpretation with quadratic terms

The linear term in the model for $x$ has a coefficient of $\hat{β}_{1}$ (95% CI: $LB_{\hat{β}_{1}}, UB_{\hat{β}_{1}}$). The quadratic term in the model for $x$ has a coefficient of $\hat{β}_{2}$ (95% CI: $LB_{\hat{β}_{2}}, UB_{\hat{β}_{2}}$). A change in $x$ from $a$ to $b$ yields an expected change in $y$ of $\hat{β}_{1} (b - a) + \hat{β}_{2} (b^{2} - a^{2})$, holding all other variables constant. (You must include this last phrase if there are additional variables in your model.)

Specific $\hat{β}$ interpretation for the $y = β_{0} + β_{1}\,\text{Carat} + β_{2}\,\text{Carat}^{2} + ϵ$ model

The linear term in the model for Carat has a coefficient of 2386 (95% CI: 906, 3866). The quadratic term in the model for Carat has a coefficient of 4498 (95% CI: 3981, 5016). A change in Carat from 0.7 to 1.24 yields an expected change in TotalPrice of 6000.5.
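
The arithmetic behind this expected change can be wrapped in a small helper. `quad_change()` is a hypothetical function written for this sketch, not part of any package; it applies $\hat{β}_{1} (b - a) + \hat{β}_{2} (b^{2} - a^{2})$ with the rounded coefficients quoted above.

```r
# Hypothetical helper: expected change in y when x moves from a to b
# in a model with a linear (b1) and a quadratic (b2) term
quad_change <- function(b1, b2, a, b) {
  b1 * (b - a) + b2 * (b^2 - a^2)
}

# Rounded coefficients as quoted in the interpretation
change_quoted <- quad_change(b1 = 2386, b2 = 4498, a = 0.7, b = 1.24)
change_quoted

# The same one-unit step gives different changes at different starting points
step_low  <- quad_change(2386, 4498, a = 0.8, b = 1.8)
step_high <- quad_change(2386, 4498, a = 1.8, b = 2.8)
c(step_low = step_low, step_high = step_high)
```

The second comparison is the reason a single "one-unit change" statement is not enough for a quadratic model: the expected change depends on where the step starts.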

Why didn't I say holding all other variables constant?

Take aways

- The interpretation of $\hat{β}$ in multiple linear regression: a one-unit change in $x$ yields an expected change in $y$ of $\hat{β}$, holding all other included variables constant.
- If the slope differs between groups (the fitted lines are not parallel in a scatterplot), an interaction is present.
- You can include polynomial terms to address non-linear relationships.
- The coefficients for a polynomial must be interpreted together.

Diamonds

- Go to RStudio Cloud and open `Diamonds`
- Fit the model $\text{TotalPrice} = β_{0} + β_{1}\,\text{Carat} + β_{2}\,\text{Carat}^{2} + β_{3}\,\text{Color} + ϵ$
- Find the 0.25 quantile and 0.75 quantile of Carat
- What is the interpretation of $\hat{β}_{1}$, $\hat{β}_{2}$, and $\hat{β}_{3}$?
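
For the quantile step, base R's `quantile()` accepts a vector of probabilities. A sketch on a made-up vector; `carat_toy` is illustrative and is not the Diamonds data.

```r
# Made-up carat values for illustration only
carat_toy <- c(0.3, 0.5, 0.7, 0.9, 1.0, 1.1, 1.2, 1.5, 2.0)

# 0.25 and 0.75 quantiles: the two x values to change between
q <- quantile(carat_toy, probs = c(0.25, 0.75))
q
```

With the course data the call would presumably take the Carat column instead, for example `quantile(Diamonds$Carat, probs = c(0.25, 0.75))`, assuming the data frame is named `Diamonds` as in the model code.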
