Diamonds
Use the glimpse() function to see all of your variables and their types.

data("PorschePrice")
glimpse(PorschePrice)
## Observations: 30
## Variables: 3
## $ Price   <dbl> 69.4, 56.9, 49.9, 47.4, 42.9, 36.9, 83.0, 72.9, 69.9, 67…
## $ Age     <int> 3, 3, 2, 4, 4, 6, 0, 0, 2, 0, 2, 2, 4, 3, 10, 11, 4, 4, …
## $ Mileage <dbl> 21.50, 43.00, 19.90, 36.00, 44.00, 49.80, 1.30, 0.67, 13…
data("Diamonds")
glimpse(Diamonds)
## Observations: 351
## Variables: 6
## $ Carat      <dbl> 1.08, 0.31, 0.31, 0.32, 0.33, 0.33, 0.35, 0.35, 0.37,…
## $ Color      <fct> E, F, H, F, D, G, F, F, F, D, E, F, D, D, F, F, D, D,…
## $ Clarity    <fct> VS1, VVS1, VS1, VVS1, IF, VVS1, VS1, VS1, VVS1, IF, V…
## $ Depth      <dbl> 68.6, 61.9, 62.1, 60.8, 60.8, 61.5, 62.5, 62.3, 61.4,…
## $ PricePerCt <dbl> 6693.3, 3159.0, 1755.0, 3159.0, 4758.8, 2895.8, 2457.…
## $ TotalPrice <dbl> 7228.8, 979.3, 544.1, 1010.9, 1570.4, 955.6, 860.0, 8…
fct: "factor", a type of categorical variable.

glimpse(starwars)
## Observations: 87
## Variables: 5
## $ name       <chr> "Luke Skywalker", "C-3PO", "R2-D2", "Darth Vader", "L…
## $ height     <int> 172, 167, 96, 202, 150, 178, 165, 97, 183, 182, 188, …
## $ mass       <dbl> 77.0, 75.0, 32.0, 136.0, 49.0, 120.0, 75.0, 32.0, 84.…
## $ hair_color <chr> "blond", NA, NA, "none", "brown", "brown, grey", "bro…
## $ skin_color <chr> "fair", "gold", "white, blue", "white", "light", "lig…
chr: "character", a type of categorical variable.

An indicator variable uses two values, usually 0 and 1, to indicate whether a data case does (1) or does not (0) belong to a specific category.
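As a minimal sketch of this definition, the block below builds one indicator by hand. The toy data frame and its values are hypothetical stand-ins, not the Diamonds data, and only base R is used.

```r
# Toy data (hypothetical values, not the Diamonds dataset)
toy <- data.frame(
  Color = factor(c("D", "E", "D", "J", "E")),
  Price = c(10, 8, 12, 3, 7)
)

# 1 if the case is color D, 0 otherwise
toy$ColorD <- ifelse(toy$Color == "D", 1, 0)

toy$ColorD
# [1] 1 0 1 0 0

# The mean of an indicator is the proportion of cases in that category
mean(toy$ColorD)
# [1] 0.4
```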
What does this line of code do?

Diamonds <- Diamonds %>%
  mutate(
    ColorD = ifelse(Color == "D", 1, 0),
    ColorE = ifelse(Color == "E", 1, 0),
    ColorF = ifelse(Color == "F", 1, 0),
    ColorG = ifelse(Color == "G", 1, 0),
    ColorH = ifelse(Color == "H", 1, 0),
    ColorI = ifelse(Color == "I", 1, 0),
    ColorJ = ifelse(Color == "J", 1, 0)
  )
What if I wanted to model the relationship between TotalPrice and Color?
Why is ColorJ NA?

lm(TotalPrice ~ ColorD + ColorE + ColorF + ColorG + ColorH + ColorI + ColorJ,
   data = Diamonds)
## 
## Call:
## lm(formula = TotalPrice ~ ColorD + ColorE + ColorF + ColorG + 
##     ColorH + ColorI + ColorJ, data = Diamonds)
## 
## Coefficients:
## (Intercept)       ColorD       ColorE       ColorF       ColorG  
##        1936         3632         2423         7224         7623  
##      ColorH       ColorI       ColorJ  
##        6732         5704           NA  
For a categorical variable with k categories, always include k - 1 indicator variables.
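The k - 1 rule can be seen on a small example. The sketch below uses a hypothetical three-category toy data frame (not the Diamonds data) and base R only: with an intercept in the model, the k indicators sum to 1 for every case, so the last one carries no new information and lm() reports NA for it.

```r
# Toy data (hypothetical values): k = 3 categories, two cases each
toy <- data.frame(
  Color = c("D", "D", "E", "E", "J", "J"),
  Price = c(10, 12, 7, 9, 2, 4)
)
toy$ColorD <- ifelse(toy$Color == "D", 1, 0)
toy$ColorE <- ifelse(toy$Color == "E", 1, 0)
toy$ColorJ <- ifelse(toy$Color == "J", 1, 0)

# Each case belongs to exactly one category, so the indicators sum to 1,
# which duplicates the intercept column
all(toy$ColorD + toy$ColorE + toy$ColorJ == 1)
# [1] TRUE

# All k = 3 indicators: the ColorJ coefficient cannot be estimated (NA)
full <- lm(Price ~ ColorD + ColorE + ColorJ, data = toy)
coef(full)

# k - 1 = 2 indicators: every coefficient is estimated, J is the reference
reduced <- lm(Price ~ ColorD + ColorE, data = toy)
coef(reduced)
```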
What is the reference category?

lm(TotalPrice ~ ColorD + ColorE + ColorF + ColorG + ColorH + ColorI,
   data = Diamonds)
## 
## Call:
## lm(formula = TotalPrice ~ ColorD + ColorE + ColorF + ColorG + 
##     ColorH + ColorI, data = Diamonds)
## 
## Coefficients:
## (Intercept)       ColorD       ColorE       ColorF       ColorG  
##        1936         3632         2423         7224         7623  
##      ColorH       ColorI  
##        6732         5704  
Color D, compared to color J, increases the expected total price by 3632.
Color E, compared to color J, increases the expected total price by 2423.
What about color F?
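One way to check the "compared to the reference category" reading of an indicator coefficient is to compare the fitted coefficients with group means. The sketch below does this on a hypothetical toy data frame (not the Diamonds data), with J left out as the reference.

```r
# Toy data (hypothetical values), with J as the omitted reference category
toy <- data.frame(
  Color = c("D", "D", "E", "E", "J", "J"),
  Price = c(10, 12, 7, 9, 2, 4)
)
toy$ColorD <- ifelse(toy$Color == "D", 1, 0)
toy$ColorE <- ifelse(toy$Color == "E", 1, 0)

fit <- lm(Price ~ ColorD + ColorE, data = toy)

# Group means computed directly from the data
mean_D <- mean(toy$Price[toy$Color == "D"])
mean_E <- mean(toy$Price[toy$Color == "E"])
mean_J <- mean(toy$Price[toy$Color == "J"])

# Intercept = mean of the reference category
all.equal(coef(fit)[["(Intercept)"]], mean_J)
# Each indicator coefficient = that category's mean minus the reference mean
all.equal(coef(fit)[["ColorD"]], mean_D - mean_J)
all.equal(coef(fit)[["ColorE"]], mean_E - mean_J)
```

All three comparisons return TRUE here, which is why each coefficient reads as an expected difference from the reference category.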
What is the reference category?

lm(TotalPrice ~ Color, data = Diamonds)
## 
## Call:
## lm(formula = TotalPrice ~ Color, data = Diamonds)
## 
## Coefficients:
## (Intercept)       ColorE       ColorF       ColorG       ColorH  
##        5569        -1209         3592         3990         3100  
##      ColorI       ColorJ  
##        2071        -3632  
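When a factor goes straight into the formula, R builds the indicator columns itself. As a rough illustration on hypothetical toy data (not the Diamonds data), model.matrix() shows the design matrix that lm() would use with R's default contrasts; the first factor level gets no column, which makes it the reference category.

```r
# Toy data (hypothetical values, not the Diamonds dataset)
toy <- data.frame(
  Color = factor(c("D", "E", "J", "E", "D")),
  Price = c(10, 8, 3, 7, 12)
)

# The design matrix lm() builds for Price ~ Color
X <- model.matrix(Price ~ Color, data = toy)
colnames(X)
# [1] "(Intercept)" "ColorE"      "ColorJ"

# The first level has no indicator column: it is the reference category
levels(toy$Color)[1]
# [1] "D"
```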
How do we interpret color E now?

Source: forcats.tidyverse.org
levels(Diamonds$Color)
## [1] "D" "E" "F" "G" "H" "I" "J"
new_levels <- c("J", "D", "E", "F", "G", "H", "I")
Diamonds <- Diamonds %>%
  mutate(Color = fct_relevel(Color, new_levels))
levels(Diamonds$Color)
## [1] "J" "D" "E" "F" "G" "H" "I"
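For comparison, the same reordering can be sketched without forcats. The block below uses base R's relevel() on a hypothetical toy data frame (not the Diamonds data); it is an alternative to the fct_relevel() call, not the slides' own method.

```r
# Toy data (hypothetical values, not the Diamonds dataset)
toy <- data.frame(
  Color = factor(c("D", "D", "E", "E", "J", "J")),
  Price = c(10, 12, 7, 9, 2, 4)
)
levels(toy$Color)
# [1] "D" "E" "J"

# Base R alternative to fct_relevel(): move "J" to the front
toy$Color <- relevel(toy$Color, ref = "J")
levels(toy$Color)
# [1] "J" "D" "E"

# J is now the reference, so the coefficients compare D and E to J
fit <- lm(Price ~ Color, data = toy)
names(coef(fit))
# [1] "(Intercept)" "ColorD"      "ColorE"
```

With forcats, fct_relevel(Color, "J") has the same effect as listing every level, since any level named is moved to the front and the rest keep their order.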
What is the reference category?

lm(TotalPrice ~ Color, data = Diamonds)
## 
## Call:
## lm(formula = TotalPrice ~ Color, data = Diamonds)
## 
## Coefficients:
## (Intercept)       ColorD       ColorE       ColorF       ColorG  
##        1936         3632         2423         7224         7623  
##      ColorH       ColorI  
##        6732         5704  
data("ICU")
lm(Pulse ~ Emergency, data = ICU)
## 
## Call:
## lm(formula = Pulse ~ Emergency, data = ICU)
## 
## Coefficients:
## (Intercept)    Emergency  
##       91.11        10.63  