Oregon COVID data
I wanted to create a self-updating visualization of the COVID-19 data for the state of Oregon provided by the Oregon Health Authority (OHA). I have yet to do that, but I decided to build this one to visualize the New York Times data.
There is a separate page of daily maps. Oregon reports only daily snapshots, so following the progression requires ingesting new data each day. I began tracking it on March 20; the scraping process is detailed in a separate file.
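Accumulating daily snapshots into a history can be sketched roughly as follows. This is a minimal illustration, not the scraping code from the separate file: the helper `append_snapshot()`, the column names (`county`, `cases`, `date`), and the numbers are all hypothetical.

```r
library(dplyr)

# Hypothetical helper: stamp a snapshot with its date and append it to the
# running history. Rows already stored for that date are dropped first, so
# rerunning the same day does not create duplicates.
append_snapshot <- function(history, snapshot, snapshot_date) {
  stamped <- mutate(snapshot, date = as.Date(snapshot_date))
  history %>%
    filter(date != as.Date(snapshot_date)) %>%
    bind_rows(stamped) %>%
    arrange(date, county)
}

# Empty history with the assumed columns
history <- tibble(
  date   = as.Date(character()),
  county = character(),
  cases  = numeric()
)

# Two made-up daily snapshots
day1 <- tibble(county = c("Marion", "Multnomah"), cases = c(2, 5))
day2 <- tibble(county = c("Marion", "Multnomah"), cases = c(3, 9))

history <- history %>%
  append_snapshot(day1, "2020-03-20") %>%
  append_snapshot(day2, "2020-03-21")
```

Replacing rather than blindly appending keeps the history at one row per county per day, even if the daily job runs twice.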
New York Times data for the US
The New York Times has a wonderful compilation of United States data on the novel coronavirus. The data update automatically; the following graphics were generated with data retrieved at 2020-11-30 16:51:46.
The Basic State of Things
options(scipen=9)
library(tidyverse); library(hrbrthemes); library(patchwork); library(plotly); library(ggdark); library(ggrepel); library(lubridate)
CTP <- read.csv("https://covidtracking.com/api/v1/states/daily.csv")
state.data <- read_csv(url("https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-states.csv"))
Rect.NYT <- complete(state.data, state,date)
# Create new cases and new deaths
Rect.
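The excerpt above is cut off at the "new cases and new deaths" step. One way that step might look is sketched below on a toy data frame. The column names mirror the NYT `us-states.csv` layout (`date`, `state`, `cases`, `deaths`), but the numbers are made up, and filling missing state-date combinations with zero is an assumption that is only sensible for dates before a state's first report.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the NYT state data; values are invented for illustration
toy <- tibble(
  date   = as.Date(c("2020-03-01", "2020-03-02", "2020-03-03",
                     "2020-03-02", "2020-03-03")),
  state  = c("Oregon", "Oregon", "Oregon", "Idaho", "Idaho"),
  cases  = c(1, 3, 6, 2, 5),
  deaths = c(0, 0, 1, 0, 1)
)

new_counts <- toy %>%
  # Rectangularize: one row for every state-date pair (assumed zero-fill)
  complete(state, date, fill = list(cases = 0, deaths = 0)) %>%
  group_by(state) %>%
  arrange(date, .by_group = TRUE) %>%
  # Daily counts are differences of the cumulative totals within each state
  mutate(
    new_cases  = cases  - lag(cases,  default = 0),
    new_deaths = deaths - lag(deaths, default = 0)
  ) %>%
  ungroup()
```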
Beer Distribution
The #tidyTuesday for March 31, 2020 is on beer. The essential elements and a method for pulling the data are shown:
A Comment on Scraping .pdf
The Tweet
The details on how the data were obtained are a nice overview of scraping .pdf files. The code for doing it is at the bottom of the page. @thomasmock has done a great job commenting his way through it.
R to Import COVID Data
library(tidyverse)
library(gganimate)
COVID.states <- read.csv(url("http://covidtracking.com/api/states/daily.csv"))
COVID.states <- COVID.states %>% mutate(Date = as.Date(as.character(date), format = "%Y%m%d"))
The Raw Testing Incidence
I want to use patchwork to show the testing rate by state in the United States. Then I want to show where things currently stand. In both cases, a base-10 log is used on the number of tests.
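A minimal sketch of that layout, assuming a toy data frame in place of the COVID Tracking download: the column `tests` and its values are hypothetical, and the real data would supply the testing counts. The left panel shows the trend by state, the right panel the most recent day, both on a base-10 log scale.

```r
library(ggplot2)
library(patchwork)

# Toy testing counts; values are invented for illustration
toy_tests <- data.frame(
  state = rep(c("OR", "WA"), each = 3),
  Date  = rep(as.Date(c("2020-03-20", "2020-03-21", "2020-03-22")), times = 2),
  tests = c(100, 400, 1500, 1000, 5000, 20000)
)

# Where things currently stand: rows for the latest date only
latest <- toy_tests[toy_tests$Date == max(toy_tests$Date), ]

p_trend <- ggplot(toy_tests, aes(x = Date, y = tests, colour = state)) +
  geom_line() +
  scale_y_log10() +
  labs(y = "Tests (log10 scale)")

p_now <- ggplot(latest, aes(x = state, y = tests)) +
  geom_point(size = 3) +
  scale_y_log10() +
  labs(y = "Tests (log10 scale)")

# patchwork overloads `+` to place the two plots side by side
combined <- p_trend + p_now
```

Printing `combined` draws both panels in one figure.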
tidyTuesday: December 10, 2019
Replicating plots from simplystatistics. One nice twist is the development of a tidytuesdayR package that makes it easy to grab the necessary data. You can install the package from GitHub. I will also use fiftystater and ggflags.
devtools::install_github("thebioengineer/tidytuesdayR")
devtools::install_github("ellisp/ggflags")
devtools::install_github("wmurphyrd/fiftystater")
tuesdata <- tidytuesdayR::tt_load(2019, week = 50)
## --- Downloading #TidyTuesday Information for 2019-12-10 ----
## --- Identified 4 files available for download ----
## --- Downloading files ---
## Warning in identify_delim(temp_file): Not able to detect delimiter for the file.
Fariss Data
Is neat and complete.
load("FarissHRData.RData")
skimr::skim(HR.Data)
Table 1: Data summary
Name: HR.Data
Number of rows: 11717
Number of columns: 27
Column type frequency: factor 1, numeric 26
Group variables: None

Variable type: factor
skim_variable | n_missing | complete_rate | ordered | n_unique | top_counts
COW_YEAR | 0 | 1 | FALSE | 11717 | 100: 1, 100: 1, 100: 1, 100: 1

Variable type: numeric
Archigos
Is an amazing collaboration that produced a comprehensive dataset of world leaders going pretty far back; see Archigos on the web. For thinking about leadership, it is a natural choice. In this post, I want to reshape it into country-year and leader-year datasets and explore the basic confines of Archigos. I also want to use gganimate for a few things. So what do we know?
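The reshaping idea can be sketched on a toy table of leader spells. The column names here (`ccode`, `leader`, `startdate`, `enddate`) and the two rows are assumptions for illustration, not the actual Archigos file: each spell is expanded to one row per calendar year in office (leader-year), then collapsed to one row per country and year (country-year).

```r
library(dplyr)
library(tidyr)

# Toy leader spells; names and dates are invented for illustration
toy_spells <- tibble(
  ccode     = c(2, 2),
  leader    = c("Leader A", "Leader B"),
  startdate = as.Date(c("2001-01-20", "2009-01-20")),
  enddate   = as.Date(c("2009-01-20", "2017-01-20"))
)

# Leader-year: one row for every calendar year a leader held office
leader_years <- toy_spells %>%
  mutate(year = Map(seq,
                    lubridate::year(startdate),
                    lubridate::year(enddate))) %>%
  unnest(year)

# Country-year: one row per country and year, counting leaders in office
country_years <- leader_years %>%
  group_by(ccode, year) %>%
  summarise(n_leaders = n_distinct(leader), .groups = "drop")
```

Transition years appear twice in the leader-year data, which is why the country-year version counts distinct leaders rather than assuming one per year.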