R

Trying out Leaflet

International murders are among the data for analysis in the tidyTuesday for December 10, 2019. These are made for a map.

    library(tidyverse)
    library(leaflet)
    library(stringr)
    library(sf)
    library(here)
    library(widgetframe)
    library(htmlwidgets)
    library(htmltools)
    options(digits = 3)
    set.seed(1234)
    theme_set(theme_minimal())
    library(tidytuesdayR)
    tuesdata <- tt_load(2019, week = 50)
    murders <- tuesdata$gun_murders

There isn’t much data, so this should be a bit easier. Now for some data. As it happens, the best way I currently know how to do this involves acquiring a spatial frame.

Philadelphia Parking Tickets: a tidyTuesday

Philadelphia Map

Use ggmap for the base layer.

    library(ggmap); library(osmdata); library(tidyverse)
    PHI <- get_map(getbb("Philadelphia, PA"), maptype = "stamen", zoom=12)

Get the Tickets Data

TidyTuesday covers 1.26 million parking tickets in Philadelphia.

    tickets <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-12-03/tickets.csv")
    ## Parsed with column specification:
    ## cols(
    ##   violation_desc = col_character(),
    ##   issue_datetime = col_datetime(format = ""),
    ##   fine = col_double(),
    ##   issuing_agency = col_character(),
    ##   lat = col_double(),
    ##   lon = col_double(),
    ##   zip_code = col_double()
    ## )

Two Lines of Code Left

    library(lubridate); library(ggthemes)
    # use lubridate to extract the day of the week
    tickets <- tickets %>% mutate(Day = wday(issue_datetime, label=TRUE))
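A short aside on the `wday()` call in this excerpt: it returns the day of the week, whereas `mday()` returns the day of the month. Here is a minimal sketch on a single made-up timestamp (not a row from the tickets data), with `week_start = 7` spelled out so that Sunday counts as day 1.

```r
library(lubridate)

# Hypothetical timestamp for illustration; not taken from tickets.csv
stamp <- ymd_hms("2017-06-15 14:30:00")

# Day of the week as a number (Sunday = 1 because week_start = 7)
dow <- wday(stamp, week_start = 7)

# Day of the week as an ordered factor with abbreviated labels
dow_label <- wday(stamp, label = TRUE)

# Day of the month, for contrast
dom <- mday(stamp)

print(dow)
print(dow_label)
print(dom)
```

The labelled form is the one that is handy for faceting or grouping plots by weekday, since the factor keeps the days in calendar order.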

US Census Mapping

Searching and Mapping the Census

Searching for the Asian Population via the Census

Using tidycensus comes with limitations imposed by the available tables. There is the ACS, a survey of about 3 million people, and the two main decennial census files, SF1 and SF2. I will search SF1 for the Asian population.

    library(tidycensus); library(kableExtra)
    library(tidyverse); library(stringr)
    v10 <- load_variables(2010, "sf1", cache = TRUE)
    v10 %>%
      filter(str_detect(concept, "ASIAN")) %>%
      filter(str_detect(label, "Female")) %>%
      kable() %>%
      scroll_box(width = "100%")

    name label concept
    P012D026 Total!
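The two chained `str_detect()` filters can be tried on a small stand-in table without loading the full variable list. The rows below are invented for illustration (the names, labels, and concepts are not real SF1 variables). Only the `name`, `label`, `concept` column layout follows the output shown in the excerpt.

```r
library(dplyr)
library(stringr)

# Toy stand-in for the variable table; every row here is made up
toy_vars <- tibble(
  name    = c("X001", "X002", "X003", "X004"),
  label   = c("Total!!Male", "Total!!Female", "Total!!Female", "Total"),
  concept = c("SEX BY AGE (ASIAN ALONE)", "SEX BY AGE (ASIAN ALONE)",
              "SEX BY AGE (WHITE ALONE)", "RACE")
)

# Keep rows whose concept mentions ASIAN, then those whose label mentions Female
asian_female <- toy_vars %>%
  filter(str_detect(concept, "ASIAN")) %>%
  filter(str_detect(label, "Female"))

print(asian_female)
```

Note that `str_detect()` is case sensitive by default, which is why the pattern for `concept` is upper case while the pattern for `label` is capitalized.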

Fariss Human Rights Data with Animation

The Fariss data is neat and complete.

    load("FarissHRData.RData")
    skimr::skim(HR.Data)

Table 1: Data summary

    Name                     HR.Data
    Number of rows           11717
    Number of columns        27
    Column type frequency:
      factor                 1
      numeric                26
    Group variables          None

Variable type: factor

    skim_variable  n_missing  complete_rate  ordered  n_unique  top_counts
    COW_YEAR       0          1              FALSE    11717     100: 1, 100: 1, 100: 1, 100: 1

Variable type: numeric

Generative aRt

mathart

A cool package for math-generated art that I just discovered. Here is the install code for it:

    install.packages(c("devtools", "mapproj", "tidyverse", "ggforce", "Rcpp"))
    devtools::install_github("marcusvolz/mathart")
    devtools::install_github("marcusvolz/ggart")
    devtools::install_github("gsimchoni/kandinsky")

Load some libraries

    library(mathart)
    library(ggart)
    library(ggforce)
    library(Rcpp)
    library(tidyverse)

Generate some Art?

This is quite fun to do.

    set.seed(12341)
    terminals <- data.frame(x = runif(10, 0, 10000), y = runif(10, 0, 10000))
    df <- 1:10000 %>%
      map_df(~weiszfeld(terminals, c(points$x[.], points$y[.])), .id = "id")
    p <- ggplot() + geom_point(aes(x, y), points, size = 1, alpha = 0.

Simple Oregon County Mapping

Some Data for the Map

I want to get some data to place on the map. I found a website with population and population change data for Oregon in .csv format. I cannot download it directly from R; instead, I have to download it by clicking a button and then import it.

    library(tidyverse)
    ## ── Attaching packages ────────────────────────── tidyverse 1.3.0 ──
    ## ✓ ggplot2 3.2.1 ✓ purrr 0.3.3
    ## ✓ tibble 2.1.3 ✓ dplyr 0.

The Economist's Visualization Errors

The Economist’s Errors and Credit Where Credit is Due

The Economist is serious about their use of data visualization and they have occasionally owned up to errors in their visualizations. Visualizations can be deceptive, uninformative, confusing, or excessively busy, and can present a host of other barriers to clean communication. The Economist’s blog post on their errors is great. I have drawn the following example from a #tidyTuesday earlier this year that explores this.

tidyTuesday does Pizza

Pizza Ratings

The #tidyTuesday for this week involves pizza shop ratings data. Let’s see what we have.

    pizza_jared <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-10-01/pizza_jared.csv")
    ## Parsed with column specification:
    ## cols(
    ##   polla_qid = col_double(),
    ##   answer = col_character(),
    ##   votes = col_double(),
    ##   pollq_id = col_double(),
    ##   question = col_character(),
    ##   place = col_character(),
    ##   time = col_double(),
    ##   total_votes = col_double(),
    ##   percent = col_double()
    ## )
    pizza_barstool <- readr::read_csv("https://raw.

Some Basic Text on the Mueller Report

So this Robert Mueller guy wrote a report; I may as well analyse it a bit. First, let me see if I can get a hold of the data. I grabbed the report directly from the Department of Justice website. You can follow this link.

    library(tidyverse)
    library(pdftools)
    # Download report from link above
    mueller_report_txt <- pdf_text("../data/report.pdf")
    # Create a tibble of the text with line numbers and pages
    mueller_report <- tibble(
      page = seq_along(mueller_report_txt),
      text = mueller_report_txt) %>%
      separate_rows(text, sep = "\n") %>%
      group_by(page) %>%
      mutate(line = row_number()) %>%
      ungroup() %>%
      select(page, line, text)
    write_csv(mueller_report, "data/mueller_report.
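The page-and-line bookkeeping in that pipeline can be checked on a toy character vector standing in for the output of `pdf_text()`, which returns one string per page. The two pages below are made up, and nothing here reads the actual report. The sketch only shows how `separate_rows()` plus a grouped `row_number()` restarts the line count on each page.

```r
library(dplyr)
library(tidyr)

# Made-up stand-in for pdf_text() output: one string per page
toy_pages <- c("first line\nsecond line",
               "third line\nfourth line\nfifth line")

toy_report <- tibble(
  page = seq_along(toy_pages),
  text = toy_pages
) %>%
  separate_rows(text, sep = "\n") %>%   # one row per line of text
  group_by(page) %>%
  mutate(line = row_number()) %>%       # restart numbering on each page
  ungroup() %>%
  select(page, line, text)

print(toy_report)
```

Grouping by `page` before calling `row_number()` is what makes the line numbers page-relative. Without the grouping, the lines would be numbered consecutively across the whole document.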

nflscrapR is amazing

Scraping NFL data

Note: An original version of this post had issues induced by overtime games. There is a better way to handle all of this that I learned from a brief analysis of a tie game between Cleveland and Pittsburgh in Week One.

The nflscrapR package is designed to make data on NFL games more easily available. To install the package, we need to grab it from GitHub.