class: center, middle, inverse, title-slide

# Working Smarter, Not Harder
## Reproducible Analysis + Visualization with R
### Matt Worthington | M.Ed./M.PAff
### Sr. Project Manager for Data Initiatives, LBJ School of Public Affairs
### August 19, 2021

---
class: middle
.pull-left[ # About Me ] .pull-right[ * .orange[**Personal Background**]: Born and raised in San Antonio, Texas. Live in Austin with my wife and three kids. * .orange[**Education Backgrounds**]: English Studies, Special Education, and Public Policy * .orange[**Professional Backgrounds**]: Public School Teacher, School District Administrator, and Data Scientist. ] --- background-image: url('assets/images/data_life_cycle.png') background-size: cover class: center, bottom, inverse --- background-image: url('assets/images/hermione_spell_class.png') background-size: cover class: center, bottom, inverse ## .blue[How to understand R if you are new...] --- background-image: url('assets/images/hp_vs_r.png') background-size: cover class: center, bottom, inverse --- background-image: url('assets/images/r_vs_rstudio.png') background-size: cover class: center, bottom, inverse ## .blue[Understanding R vs. RStudio...] --- class: middle .pull-left[ ## How R Can Make You Feel Sometimes... ] .pull-right[ <blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">src/ : folder for scripts<br>data/ : original data pulled from database<br>output/ : intermediate RDS data objects needed for Rmd<br>analysis/: Rmd files and HTML output<br>doc/ : any long-form notes to self or documentation<br>ext/: external images or other random files I want to keep in proj</p>— Emily Riederer (@EmilyRiederer) <a href="https://twitter.com/allison_horst/status/1304134105643118592?s=20">September 10, 2018</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script> ] --- background-image: url('assets/images/rstudio_hex.png') background-size: cover class: center, bottom, inverse --- background-image: url('assets/images/rstudio_screenshot.png') background-size: cover class: center, bottom, inverse --- background-image: url('assets/images/tidyverse_hex.png') background-size: cover class: center, bottom, inverse --- background-image: 
url('assets/images/tidyverse_network.png')
background-size: cover
class: center, bottom, inverse animated fadeIn

---
background-image: url('assets/images/tidyverse_data_lifecycle_map.png')
background-size: cover
class: center, bottom, inverse animated fadeIn

---
background-image: url('assets/images/lifecycle_focus.png')
background-size: cover
class: center, bottom, inverse animated fadeIn

---
background-image: url('assets/images/what_todays_focus_will_be.png')
background-size: cover
class: center, bottom, inverse animated animate fadeIn

---
background-image: url('assets/images/marie_kondo.png')
background-size: cover
class: center, bottom, inverse animated fadeIn

---

# The Basics of a ggplot2 chart

.pull-left[

```r
library(ggplot2)
library(ggthemes)
diamonds %>%  # Call On Your Dataset
  ggplot() +  # Draw A Canvas
  aes(cut, fill = cut) +  # Define How The Data Gets Mapped
  geom_bar(show.legend = FALSE) +  # Define What Kind of Chart to Draw
  labs(
    x = "Cut",  # Specify X-Axis Label
    y = "Count",  # Specify Y-Axis Label
    title = "A Fancy diamonds Plot",  # Specify Title Label
    subtitle = "A compelling subtitle",  # Specify Subtitle Label
    caption = "Source: ggplot2 package | Data: 'diamonds'"  # Specify Source/Caption Label
  ) +
  theme_lbj() +  # Add The CTP Theme
  theme(plot.title = element_text(color = "#bf5700")) +  # Modify Title's Color
  ggthemes::scale_fill_tableau(palette = "Color Blind")  # Add A Color Blind Friendly Palette
```

]

.pull-right[
<img src="lbj_r_skillshare_files/figure-html/fancy_diamonds-1.png" width="504" />
]

---
count: false

# A step by step view

.panel1-fancy_diamonds-auto[

```r
*library(ggplot2)
```

]

.panel2-fancy_diamonds-auto[

]

---
count: false

# A step by step view

.panel1-fancy_diamonds-auto[

```r
library(ggplot2)
*library(ggthemes)
```

]

.panel2-fancy_diamonds-auto[

]

---
count: false

# A step by step view

.panel1-fancy_diamonds-auto[

```r
library(ggplot2)
library(ggthemes)
*diamonds  # Call On Your Dataset
```

]

.panel2-fancy_diamonds-auto[

```
# A tibble: 53,940 × 10
   carat cut       color clarity depth table price     x     y     z
   <dbl> <ord>     <ord> <ord>   <dbl> <dbl> <int> <dbl> <dbl> <dbl>
 1  0.23 Ideal     E     SI2      61.5    55   326  3.95  3.98  2.43
 2  0.21 Premium   E     SI1      59.8    61   326  3.89  3.84  2.31
 3  0.23 Good      E     VS1      56.9    65   327  4.05  4.07  2.31
 4  0.29 Premium   I     VS2      62.4    58   334  4.2   4.23  2.63
 5  0.31 Good      J     SI2      63.3    58   335  4.34  4.35  2.75
 6  0.24 Very Good J     VVS2     62.8    57   336  3.94  3.96  2.48
 7  0.24 Very Good I     VVS1     62.3    57   336  3.95  3.98  2.47
 8  0.26 Very Good H     SI1      61.9    55   337  4.07  4.11  2.53
 9  0.22 Fair      E     VS2      65.1    61   337  3.87  3.78  2.49
10  0.23 Very Good H     VS1      59.4    61   338  4     4.05  2.39
# … with 53,930 more rows
```

]

---
count: false

# A step by step view

.panel1-fancy_diamonds-auto[

```r
library(ggplot2)
library(ggthemes)
diamonds %>%  # Call On Your Dataset
* ggplot()  # Draw A Canvas
```

]

.panel2-fancy_diamonds-auto[
<img src="lbj_r_skillshare_files/figure-html/fancy_diamonds_auto_04_output-1.png" width="504" />
]

---
count: false

# A step by step view

.panel1-fancy_diamonds-auto[

```r
library(ggplot2)
library(ggthemes)
diamonds %>%  # Call On Your Dataset
  ggplot() +  # Draw A Canvas
* aes(cut, fill = cut)  # Define How The Data Gets Mapped
```

]

.panel2-fancy_diamonds-auto[
<img src="lbj_r_skillshare_files/figure-html/fancy_diamonds_auto_05_output-1.png" width="504" />
]

---
count: false

# A step by step view

.panel1-fancy_diamonds-auto[

```r
library(ggplot2)
library(ggthemes)
diamonds %>%  # Call On Your Dataset
  ggplot() +  # Draw A Canvas
  aes(cut, fill = cut) +  # Define How The Data Gets Mapped
* geom_bar(show.legend = FALSE)  # Define What Kind of Chart to Draw
```

]

.panel2-fancy_diamonds-auto[
<img src="lbj_r_skillshare_files/figure-html/fancy_diamonds_auto_06_output-1.png" width="504" />
]

---
count: false

# A step by step view

.panel1-fancy_diamonds-auto[

```r
library(ggplot2)
library(ggthemes)
diamonds %>%  # Call On Your Dataset
  ggplot() +  # Draw A Canvas
  aes(cut, fill = cut) +  # Define How The Data Gets Mapped
  geom_bar(show.legend = FALSE) +  # Define What Kind of Chart to Draw
* labs(
*   x = "Cut",  # Specify X-Axis Label
*   y = "Count",  # Specify Y-Axis Label
*   title = "A Fancy diamonds Plot",  # Specify Title Label
*   subtitle = "A compelling subtitle",  # Specify Subtitle Label
*   caption = "Source: ggplot2 package | Data: 'diamonds'"  # Specify Source/Caption Label
* )
```

]

.panel2-fancy_diamonds-auto[
<img src="lbj_r_skillshare_files/figure-html/fancy_diamonds_auto_07_output-1.png" width="504" />
]

---
count: false

# A step by step view

.panel1-fancy_diamonds-auto[

```r
library(ggplot2)
library(ggthemes)
diamonds %>%  # Call On Your Dataset
  ggplot() +  # Draw A Canvas
  aes(cut, fill = cut) +  # Define How The Data Gets Mapped
  geom_bar(show.legend = FALSE) +  # Define What Kind of Chart to Draw
  labs(
    x = "Cut",  # Specify X-Axis Label
    y = "Count",  # Specify Y-Axis Label
    title = "A Fancy diamonds Plot",  # Specify Title Label
    subtitle = "A compelling subtitle",  # Specify Subtitle Label
    caption = "Source: ggplot2 package | Data: 'diamonds'"  # Specify Source/Caption Label
  ) +
* theme_lbj()  # Add The CTP Theme
```

]

.panel2-fancy_diamonds-auto[
<img src="lbj_r_skillshare_files/figure-html/fancy_diamonds_auto_08_output-1.png" width="504" />
]

---
count: false

# A step by step view

.panel1-fancy_diamonds-auto[

```r
library(ggplot2)
library(ggthemes)
diamonds %>%  # Call On Your Dataset
  ggplot() +  # Draw A Canvas
  aes(cut, fill = cut) +  # Define How The Data Gets Mapped
  geom_bar(show.legend = FALSE) +  # Define What Kind of Chart to Draw
  labs(
    x = "Cut",  # Specify X-Axis Label
    y = "Count",  # Specify Y-Axis Label
    title = "A Fancy diamonds Plot",  # Specify Title Label
    subtitle = "A compelling subtitle",  # Specify Subtitle Label
    caption = "Source: ggplot2 package | Data: 'diamonds'"  # Specify Source/Caption Label
  ) +
  theme_lbj() +  # Add The CTP Theme
* theme(plot.title = element_text(color = "#bf5700"))  # Modify Title's Color
```

]

.panel2-fancy_diamonds-auto[
<img src="lbj_r_skillshare_files/figure-html/fancy_diamonds_auto_09_output-1.png" width="504" />
]

---
count: false

# A step by step view

.panel1-fancy_diamonds-auto[

```r
library(ggplot2)
library(ggthemes)
diamonds %>%  # Call On Your Dataset
  ggplot() +  # Draw A Canvas
  aes(cut, fill = cut) +  # Define How The Data Gets Mapped
  geom_bar(show.legend = FALSE) +  # Define What Kind of Chart to Draw
  labs(
    x = "Cut",  # Specify X-Axis Label
    y = "Count",  # Specify Y-Axis Label
    title = "A Fancy diamonds Plot",  # Specify Title Label
    subtitle = "A compelling subtitle",  # Specify Subtitle Label
    caption = "Source: ggplot2 package | Data: 'diamonds'"  # Specify Source/Caption Label
  ) +
  theme_lbj() +  # Add The CTP Theme
  theme(plot.title = element_text(color = "#bf5700")) +  # Modify Title's Color
* scale_fill_tableau(palette = "Color Blind")  # Add A Color Blind Friendly Palette
```

]

.panel2-fancy_diamonds-auto[
<img src="lbj_r_skillshare_files/figure-html/fancy_diamonds_auto_10_output-1.png" width="504" />
]

---
count: false

# A step by step view

.panel1-fancy_diamonds-auto[

```r
library(ggplot2)
library(ggthemes)
diamonds %>%  # Call On Your Dataset
  ggplot() +  # Draw A Canvas
  aes(cut, fill = cut) +  # Define How The Data Gets Mapped
  geom_bar(show.legend = FALSE) +  # Define What Kind of Chart to Draw
  labs(
    x = "Cut",  # Specify X-Axis Label
    y = "Count",  # Specify Y-Axis Label
    title = "A Fancy diamonds Plot",  # Specify Title Label
    subtitle = "A compelling subtitle",  # Specify Subtitle Label
    caption = "Source: ggplot2 package | Data: 'diamonds'"  # Specify Source/Caption Label
  ) +
  theme_lbj() +  # Add The CTP Theme
  theme(plot.title = element_text(color = "#bf5700")) +  # Modify Title's Color
  scale_fill_tableau(palette = "Color Blind")  # Add A Color Blind Friendly Palette
```

]

.panel2-fancy_diamonds-auto[
<img src="lbj_r_skillshare_files/figure-html/fancy_diamonds_auto_11_output-1.png" width="504" />
]

<style> .panel1-fancy_diamonds-auto { color: black; width: 38.6060606060606%; hight: 32%;
float: left; padding-left: 1%; font-size: 80% } .panel2-fancy_diamonds-auto { color: black; width: 59.3939393939394%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel3-fancy_diamonds-auto { color: black; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% } </style>

---

# You Can Also Make Tables

.pull-left[

```r
library(dplyr)  # Provides group_by(), summarise(), and the %>% pipe
library(gt)

diamonds %>%
  group_by(cut) %>%
  summarise(mean_price = mean(price),
            median_price = median(price)) %>%
  gt() %>%  # Pass The Data To A gt table
  tab_header(title = md("**Average Diamond Prices, by Cut**"),
             subtitle = "Averages calculated are the mean + median") %>%
  tab_source_note(source_note = "Data: diamonds | Source: ggplot2 package") %>%
  fmt_currency(currency = "USD", columns = c("mean_price", "median_price")) %>%
  cols_label(cut = md("**Cut**"),
             mean_price = md("**Mean Price**"),
             median_price = md("**Median Price**"))
```

]

.pull-right[
**Average Diamond Prices, by Cut**

Averages calculated are the mean + median

| Cut       | Mean Price | Median Price |
|:----------|-----------:|-------------:|
| Fair      | $4,358.76  | $3,282.00    |
| Good      | $3,928.86  | $3,050.50    |
| Very Good | $3,981.76  | $2,648.00    |
| Premium   | $4,584.26  | $3,185.00    |
| Ideal     | $3,457.54  | $1,810.00    |

Data: diamonds | Source: ggplot2 package
]

---

# Spatial Work in R

.pull-left[

```r
library(dplyr)  # Provides mutate() and the %>% pipe
library(tidycensus)

racevars <- c(White = "P005003", Black = "P005004",
              Asian = "P005006", Hispanic = "P004003")

# These P00* codes are 2010 decennial census variables, so pin the year
harris <- get_decennial(geography = "tract", variables = racevars,
                        year = 2010, state = "TX", county = "Harris County",
                        geometry = TRUE, summary_var = "P001001")

harris %>%
  mutate(pct = 100 * (value / summary_value)) %>%
  ggplot(aes(fill = pct)) +
  facet_wrap(~variable) +
  geom_sf(color = NA) +
  coord_sf(crs = 26915, datum = NA) +
  scale_fill_viridis_c() +
  theme_lbj() +
  labs(title = "Demographics in Harris County",
       subtitle = "US Census Bureau | 2010 Decennial Census")
```

]

.pull-right[
<img src="lbj_r_skillshare_files/figure-html/unnamed-chunk-6-1.png" width="504" />
]

---

# Interactive Charts in R
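
As a minimal sketch of what an interactive chart can look like in R, assuming the `highcharter`, `dplyr`, and `ggplot2` packages are installed. The data prep, chart type, and titles below are illustrative choices, not the chart originally shown in this talk:

```r
# Illustrative sketch only: not the chart originally shown in this talk
library(dplyr)
library(ggplot2)      # Provides the diamonds dataset
library(highcharter)

# Count diamonds per cut; convert the ordered factor to plain text labels
cut_counts <- diamonds %>%
  count(cut, name = "n") %>%
  mutate(cut = as.character(cut))

# Build an interactive column chart with hover tooltips
cut_chart <- hchart(cut_counts, "column", hcaes(x = cut, y = n)) %>%
  hc_title(text = "Diamonds by Cut") %>%
  hc_xAxis(title = list(text = "Cut")) %>%
  hc_yAxis(title = list(text = "Count"))

cut_chart  # Printing the object renders the widget in RStudio or a knitted document
```

Because the result is an htmlwidget, it should drop into an R Markdown document or a `xaringan` slide the same way a static plot does.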
---

# Load Packages

```r
options(htmltools.dir.version = FALSE)
library(tidyverse)  # Loads the core set of modern R packages, like ggplot2
library(tidycensus) # Tidyverse-friendly connection to the Census API
library(janitor)    # Really useful functions for cleaning data
library(lubridate)  # Makes working with dates pretty easy
library(ggtext)     # Incredible text features for ggplot2 objects
library(cowplot)    # Helps with arranging and layering ggplot2 objects

readRenviron("~/.Renviron")
census_api_key <- Sys.getenv("CENSUS_API_KEY")
```

* [tidyverse](https://www.tidyverse.org)
* [tidycensus](https://walker-data.com/tidycensus/)
* [janitor](http://sfirke.github.io/janitor/)
* [lubridate](https://lubridate.tidyverse.org)
* [ggtext](https://wilkelab.org/ggtext/)
* [cowplot](https://wilkelab.org/cowplot/index.html)

---
background-image: url('assets/images/wayne_detroit_cases_per.png')
background-size: cover
class: center, bottom, inverse animated fadeIn

---
background-image: url('assets/images/wayne_detroit_case_fatality_rate.png')
background-size: cover
class: center, bottom, inverse animated animate fadeIn

---

# Xaringan, xaringanExtra, + flipbookr

These slides are mostly made with three R packages: [`xaringan`](https://github.com/yihui/xaringan), [`xaringanExtra`](https://pkg.garrickadenbuie.com/xaringanExtra/#/), and [`flipbookr`](https://github.com/EvaMaeRey/flipbookr) (which generates the step-by-step code and organizes it alongside the output). I'm not going to go into everything each of these packages can do, but I hope the slides speak to how great they are!
If you have R + RStudio installed on your laptop, you can install these packages directly from GitHub by pasting this into your RStudio console:

```r
# install.packages("devtools")
devtools::install_github("yihui/xaringan")
devtools::install_github("gadenbuie/xaringanExtra")
devtools::install_github("EvaMaeRey/flipbookr")
```

Also, the interactive chart was made with one of my favorite R packages, [`highcharter`](https://jkunst.com/highcharter/), an amazing R wrapper for the Highcharts JavaScript library.