R to Import COVID Data library(tidyverse) library(gganimate) COVID.states <- read.csv(url("http://covidtracking.com/api/states/daily.csv")) COVID.states <- COVID.states %>% mutate(Date = as.Date(as.character(date), format = "%Y%m%d")) The Raw Testing Incidence I want to use patchwork to show the testing rate by state in the United States. Then I want to show where things currently stand. In both cases, a base-10 log is used on the number of tests.
tidyTuesday: December 10, 2019 Replicating plots from simplystatistics. One nice twist is the development of a tidytuesdayR package to grab the necessary data in an easy way. You can install the package via github. I will also use fiftystater and ggflags.
devtools::install_github("thebioengineer/tidytuesdayR") devtools::install_github("ellisp/ggflags") devtools::install_github("wmurphyrd/fiftystater") tuesdata <- tidytuesdayR::tt_load(2019, week = 50) ## --- Downloading #TidyTuesday Information for 2019-12-10 ---- ## --- Identified 4 files available for download ---- ## --- Downloading files --- ## Warning in identify_delim(temp_file): Not able to detect delimiter for the file.
Philadelphia Map Use ggmap for the base layer.
library(ggmap); library(osmdata); library(tidyverse) PHI <- get_map(getbb("Philadelphia, PA"), maptype = "stamen", zoom=12) Get the Tickets Data TidyTuesday covers 1.26 million parking tickets in Philadelphia.
tickets <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-12-03/tickets.csv") ## Parsed with column specification: ## cols( ## violation_desc = col_character(), ## issue_datetime = col_datetime(format = ""), ## fine = col_double(), ## issuing_agency = col_character(), ## lat = col_double(), ## lon = col_double(), ## zip_code = col_double() ## ) Two Lines of Code Left library(lubridate); library(ggthemes) tickets <- tickets %>% mutate(Day = wday(issue_datetime, label=TRUE)) # use lubridate to extract the day of the month.
Searching and Mapping the Census Searching for the Asian Population via the Census To use tidycensus, there are limitations imposed by the available tables. There is ACS – a survey of about 3 million people – and the two main decennial census files [SF1] and [SF2]. I will search SF1 for the Asian population.
library(tidycensus); library(kableExtra) library(tidyverse); library(stringr) v10 <- load_variables(2010, "sf1", cache = TRUE) v10 %>% filter(str_detect(concept, "ASIAN")) %>% filter(str_detect(label, "Female")) %>% kable() %>% scroll_box(width = "100%") name label concept P012D026 Total!
Fariss Data Is neat and complete.
load("FarissHRData.RData") skimr::skim(HR.Data) Table 1: Data summary Name HR.Data Number of rows 11717 Number of columns 27 _______________________ Column type frequency: factor 1 numeric 26 ________________________ Group variables None Variable type: factor
skim_variable n_missing complete_rate ordered n_unique top_counts COW_YEAR 0 1 FALSE 11717 100: 1, 100: 1, 100: 1, 100: 1 Variable type: numeric
Scraping NFL data Note: An original version of this post had issues induced by overtime games. There is a better way to handle all of this that I learned from a brief analysis of a tie game between Cleveland and Pittsburgh in Week One.
The nflscrapR package is designed to make data on NFL games more easily available. To install the package, we need to grab it from github.
Archigos Is an amazing collaboration that produced a comprehensive dataset of world leaders going pretty far back; see Archigos on the web. For thinking about leadership, it is quite natural. In this post, I want to do some reshaping into country year and leader year datasets and explore the basic confines of Archigos. I also want to use gganimate for a few things. So what do we know?
library(lubridate) library(tidyverse) library(ggthemes) library(stringr) library(gganimate) library(emoGG) library(emojifont) library(haven) Archigos <- read_dta(url("http://www.