plot

New York Times Data on COVID

New York Times data for the US The New York Times has a wonderful compilation of United States data on the novel coronavirus. The data update automatically, so the following graphics were generated with data retrieved at 2020-06-04 13:22:37. The Basic State of Things options(scipen=9) library(tidyverse); library(hrbrthemes); library(patchwork); library(plotly); library(ggdark); library(ggrepel) CTP <- read.csv("https://covidtracking.com/api/v1/states/daily.csv") state.data <- read_csv(url("https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-states.csv")) Rect.NYT <- complete(state.data, state,date) Rect.NYT <- Rect.NYT %>% group_by(state) %>% mutate(New.

GDPR Violations

R Markdown I love this intro photo from the tidyTuesday page. This week’s tidyTuesday data cover violations of the GDPR. gdpr_violations <- readr::read_tsv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-04-21/gdpr_violations.tsv') ## Parsed with column specification: ## cols( ## id = col_double(), ## picture = col_character(), ## name = col_character(), ## price = col_double(), ## authority = col_character(), ## date = col_character(), ## controller = col_character(), ## article_violated = col_character(), ## type = col_character(), ## source = col_character(), ## summary = col_character() ## ) gdpr_text <- readr::read_tsv('https://raw.

A GeoFacet of Credit Quality

In previous work with Skip Krueger, we conceptualized bond ratings as a multiple rater problem and extracted a measure of state-level creditworthiness. I had always had it on my list to do something like this and recently ran across a package called geofacet that makes it simply too easy to do. So here goes. The code is below the post. library(haven) library(dplyr) Pew.Data <- read_dta(url("https://github.com/robertwwalker/academic-mymod/raw/master/data/Pew/modeledforprediction.dta")) library(tidyverse) load(url("https://github.com/robertwwalker/academic-mymod/raw/master/data/Pew/Scaled-BR-Pew.RData")) state.ratings <- data.

Quick and Dirty Fredr

Some Data from FREDr Downloading the FRED data on national debt as a percentage of GDP. I first want to examine the US data and will then turn to some comparisons. fredr makes it remarkably easy to do! I will use two core tools from fredr. First, fredr_series_search allows one to enter search text and retrieve the series that match it. The results can be sorted in particular ways; two such options are shown below.

R for Driving Directions?

Driving Directions from R There is no reason that maps with driving directions cannot be produced in R. Given the Directions API from Google, it should be doable. As it happens, I was surprised how easy it was. Let me try to map a simple A to B route. First, the locations; I will specify two. It is also possible to geocode addresses for this, but I happened to have the GPS coordinates in hand.

Generative aRt

mathart A cool package for math-generated art that I just discovered. Here is the install code for it: install.packages(c("devtools", "mapproj", "tidyverse", "ggforce", "Rcpp")) devtools::install_github("marcusvolz/mathart") devtools::install_github("marcusvolz/ggart") devtools::install_github("gsimchoni/kandinsky") Load some libraries library(mathart) library(ggart) library(ggforce) library(Rcpp) library(tidyverse) Generate some Art? This is quite fun to do. set.seed(12341) terminals <- data.frame(x = runif(10, 0, 10000), y = runif(10, 0, 10000)) df <- 1:10000 %>% map_df(~weiszfeld(terminals, c(points$x[.], points$y[.])), .id = "id") p <- ggplot() + geom_point(aes(x, y), points, size = 1, alpha = 0.

The Economist's Visualization Errors

The Economist’s Errors and Credit Where Credit is Due The Economist is serious about their use of data visualization and they have occasionally owned up to errors in their visualizations. Visualizations can be deceptive, uninformative, confusing, excessively busy, and present a host of other barriers to clean communication. Their blog post on their errors is great. I have drawn the following example from a #tidyTuesday earlier this year that explores this.

nflscrapR is amazing

Scraping NFL data Note: An original version of this post had issues induced by overtime games. There is a better way to handle all of this that I learned from a brief analysis of a tie game between Cleveland and Pittsburgh in Week One. The nflscrapR package is designed to make data on NFL games more easily available. To install the package, we need to grab it from GitHub.

Stocks and gganimate

tidyquant automates a lot of equity research and calculation using tidy concepts. Here, I will first use it to get the components of the S&P 500 and pick out those with weights over 1.25 percent. In the next step, I download the data and finally calculate daily returns and a cumulative wealth index. library(tidyquant) library(tidyverse) tq_index("SP500") %>% filter(weight > 0.0125) %>% select(symbol,company) -> Tickers Tickers <- Tickers %>% filter(symbol!
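The daily return and cumulative wealth index mentioned in this excerpt are commonly defined as follows. This is a sketch under assumptions: the excerpt does not show which price series or return type the post uses, so simple (arithmetic) returns are assumed, and the symbols $P_t$, $r_t$, and $W_t$ are introduced here only for illustration.

```latex
% Assumed notation (not from the post): P_t is the closing price on day t.
% Simple daily return:
r_t = \frac{P_t}{P_{t-1}} - 1
% Cumulative wealth index: the value on day T of one unit invested at t = 0,
% obtained by compounding the daily returns:
W_T = \prod_{t=1}^{T} (1 + r_t) = \frac{P_T}{P_0}
```

With log returns instead, the product becomes a sum inside an exponential, so the two conventions give the same wealth index as long as each is applied consistently.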

Black Boxes: A Gender Gap Example

Variance in the Outcome: The Black Box Regression models engage in an exercise in variance accounting: how much of the outcome is explained by the inputs, individually (slope divided by standard error is t) and collectively (average explained over average unexplained, with averaging over degrees of freedom, is F)? This, of course, assumes normal errors. This document provides a function for making use of the black box. Just as in common parlance, a black box is the unexplained.
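The two ratios described in this excerpt can be written out explicitly. The notation is an assumption introduced here, not taken from the post: a linear model with an intercept and $k$ slopes fit to $n$ observations, fitted values $\hat{y}_i$, and outcome mean $\bar{y}$.

```latex
% Individual accounting: slope divided by its standard error.
t_j = \frac{\hat{\beta}_j}{\widehat{\mathrm{SE}}(\hat{\beta}_j)}

% Collective accounting: average explained over average unexplained,
% each sum of squares averaged over its degrees of freedom.
F = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 \,/\, k}
         {\sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \,/\, (n - k - 1)}
```

Under normal errors, $t_j$ follows a $t$ distribution with $n - k - 1$ degrees of freedom when the true slope is zero, and $F$ follows an $F_{k,\,n-k-1}$ distribution when all $k$ slopes are zero. The denominator of $F$ is the averaged unexplained part, the "black box" in the excerpt's terms.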