tidyverse

Datasaurus Dozen

The datasaurus dozen The datasaurus sozen is a fantastic teaching resource for examining the importance of data visualization. Let’s have a look. datasaurus <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-10-13/datasaurus.csv') ## ## ── Column specification ──────────────────────────────────────────────────────── ## cols( ## dataset = col_character(), ## x = col_double(), ## y = col_double() ## ) Two libraries to make our work easy. library(tidyverse) library(skimr) First, the summary statistics. datasaurus %>% group_by(dataset) %>% skim() Table 1: Data summary Name Piped data Number of rows 1846 Number of columns 3 _______________________ Column type frequency: numeric 2 ________________________ Group variables dataset Variable type: numeric

Spending on Kids

Spending on Kids First, let me import the data. kids <- read.csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-09-15/kids.csv') # kids <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-09-15/kids.csv') Now let me summarise it and show a table of the variables. summary(kids) ## state variable year raw ## Length:23460 Length:23460 Min. :1997 Min. : -60139 ## Class :character Class :character 1st Qu.:2002 1st Qu.: 71985 ## Mode :character Mode :character Median :2006 Median : 252002 ## Mean :2006 Mean : 1181359 ## 3rd Qu.

Cocktails

The Data cocktails <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-05-26/cocktails.csv') ## ## ── Column specification ──────────────────────────────────────────────────────── ## cols( ## row_id = col_double(), ## drink = col_character(), ## date_modified = col_datetime(format = ""), ## id_drink = col_double(), ## alcoholic = col_character(), ## category = col_character(), ## drink_thumb = col_character(), ## glass = col_character(), ## iba = col_character(), ## video = col_logical(), ## ingredient_number = col_double(), ## ingredient = col_character(), ## measure = col_character() ## ) boston_cocktails <- readr::read_csv('https://raw.

A Quick tidyTuesday on Beer, Breweries, and Ingredients

Beer Distribution The #tidyTuesday for March 31, 2020 is on beer. The essential elements and a method for pulling the data are shown: Imgur A Comment on Scraping .pdf The Tweet The details on how the data were obtained are a nice overview of scraping .pdf files. The code for doing it is at the bottom of the page. @thomasmock has done a great job commenting his way through it.

COVID-19 Scraping

NB: This was last updated on March 25, 2020. Building Oregon COVID data I have a few days of data now. To rebuild it, I will have to use the waybackmachine. The files that I need to locate and follow updates to this page from Oregon’s OHA. A Scraper Let me explain the logic for the scraper. NB: I had to rewrite it; the original versions of the website had three tables without data on hospitalizations.

Visualising COVID-19 in Oregon

Oregon COVID data I now have a few days of data. These data are current as of March 24, 2020. I will present the first version of these visualizations here and then move the auto-update to a different location. A messy first version of the scraping exercise is at the bottom of this post. paste0("https://github.com/robertwwalker/rww-science/raw/master/content/R/COVID/data/OregonCOVID",Sys.Date(),".RData") ## [1] "https://github.com/robertwwalker/rww-science/raw/master/content/R/COVID/data/OregonCOVID2020-03-24.RData" load(url(paste0("https://github.com/robertwwalker/rww-science/raw/master/content/R/COVID/data/OregonCOVID",Sys.Date(),".RData"))) A base map Load the tigris library then grab the map as an sf object; there is a geom_sf that makes them easy to work with.

Tracking COVID-19 2020-03-24

R to Import COVID Data library(tidyverse) library(gganimate) COVID.states <- read.csv(url("http://covidtracking.com/api/states/daily.csv")) COVID.states <- COVID.states %>% mutate(Date = as.Date(as.character(date), format = "%Y%m%d")) The Raw Testing Incidence I want to use patchwork to show the testing rate by state in the United States. Then I want to show where things currently stand. In both cases, a base-10 log is used on the number of tests.

The Carbon Footprint of Food Produced for Consumption

tidyTuesday on the Carbon Footprint of Feeding the Planet The tidyTuesday for this week relies on data scraped from the Food and Agricultural Organization of the United Nations. The blog post for obtaining the data can be found on r-tastic. The scraping exercise is nice and easy to follow and explored a case of cleaning up a very messy data structure. I took this exercise as practice for using pivot_wider and pivot_longer.

Mapping San Francisco Trees

Trees in San Francisco This week’s data cover trees in San Francisco. sf_trees <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-01-28/sf_trees.csv') library(tidyverse); library(ggmap); library(skimr) skim(sf_trees) Table 1: Data summary Name sf_trees Number of rows 192987 Number of columns 12 _______________________ Column type frequency: character 6 Date 1 numeric 5 ________________________ Group variables None Variable type: character skim_variable n_missing complete_rate min max empty n_unique whitespace legal_status 54 1.

Simple Point Maps in R

Mapping Points in R My goal is a streamlined and self-contained freeware map maker with points denoting addresses. It is a three step process that involves: Get a map. Geocode the addresses into latitude and longitude. Combine the the two with a first map layer and a second layer on top that contains the points. From there, it is pretty easy to get fancy using ggplotly to put relevant text hovers into place.