I am an Associate Professor of Quantitative Methods in the Atkinson Graduate School of Management at Willamette University. My research interests include panel/cross-sectional time series data, causal inference in observed populations (joint with Tim Johnson), political economy, and general applied statistics and statistical computing. I am also an Honorary Instructor at the University of Essex where I lecture annually in the Essex Summer School in Social Science Data Analysis. I have held previous appointments at Dartmouth College, Harvard University, Texas A&M University, Washington University in Saint Louis, and Rice University. With Curt Signorino and Muhammet Bas, I was awarded the Warren Miller Prize for Statistical Backwards Induction for the best article in Political Analysis.
Of greatest import, I am married to the love of my life, am the proud father of two wonderful sons, like the Pacific Northwest, and, to keep things in balance, am a lifelong Arsenal fan.
PhD in Political Science, 2005
University of Rochester
MA in Political Science, 2002
University of Rochester
BA in Post-Soviet and East European Studies, 1995
University of Texas at Austin
Russian Area Studies, 1994
Moscow State Linguistic University
The Johns Hopkins dashboard
This is the dashboard that Johns Hopkins has built with ArcGIS. They have essentially split the data into national and subnational layers and then used the ArcGIS dashboard to cycle through them.
The data
library(tidyverse); library(sf); library(rnaturalearth); library(rnaturalearthdata); library(ggmap)
March26JH <- read_csv(url("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/03-26-2020.csv"))
A Leaflet
Johns Hopkins also has an extensive collection of data. Let’s see what one set looks like.
Oregon COVID data
I now have a few days of data. These data are current as of March 24, 2020. I will present the first version of these visualizations here and then move the auto-update to a different location. A messy first version of the scraping exercise is at the bottom of this post.
paste0("https://github.com/robertwwalker/rww-science/raw/master/content/R/COVID/data/OregonCOVID",Sys.Date(),".RData")
##  "https://github.com/robertwwalker/rww-science/raw/master/content/R/COVID/data/OregonCOVID2020-03-24.RData"
load(url(paste0("https://github.com/robertwwalker/rww-science/raw/master/content/R/COVID/data/OregonCOVID",Sys.Date(),".RData")))
A base map
Load the tigris library, then grab the map as an sf object; geom_sf makes sf objects easy to work with.
Oregon COVID data
The Oregon data are available from OHA here. I cut and pasted the first two days because it was easy with datapasta. As time went on, it became easier to write a self-updating script, which I detail elsewhere.
urbnmapr
The Urban Institute has an excellent state and county mapping package. I want to make use of the county-level data and plot the starter map.
The Office
library(tidyverse)
office_ratings <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-03-17/office_ratings.csv')
A First Plot
The number of episodes of The Office by season.
library(janitor)
# Count episodes per season
TableS <- office_ratings %>% tabyl(season)
p1 <- TableS %>% ggplot(., aes(x=as.factor(season), y=n, fill=as.factor(season))) + geom_col() + labs(x="Season", y="Episodes", title="The Office: Episodes") + guides(fill="none")
p1
Ratings
How are the various seasons and episodes rated?
# Violin of IMDB ratings by season with the individual episodes overlaid
p2 <- office_ratings %>% ggplot(., aes(x=as.factor(season), y=imdb_rating, fill=as.factor(season), color=as.factor(season))) + geom_violin(alpha=0.3) + guides(fill="none", color="none") + labs(x="Season", y="IMDB Rating") + geom_point()
p2
Patchwork
Using patchwork, we can combine multiple plots.
R to Import COVID Data
library(tidyverse)
library(gganimate)
COVID.states <- read.csv(url("http://covidtracking.com/api/states/daily.csv"))
COVID.states <- COVID.states %>% mutate(Date = as.Date(as.character(date), format = "%Y%m%d"))
The Raw Testing Incidence
I want to use patchwork to show the testing rate by state in the United States. Then I want to show where things currently stand. In both cases, a base-10 log is used on the number of tests.
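The date parsing and base-10 log described above can be sketched on a toy data frame. This is a minimal illustration, not the post's actual code: the `toy` data and the column name `total` are hypothetical stand-ins for the downloaded file, whose columns are not shown here.

```r
library(dplyr)

# Hypothetical toy data standing in for the COVID Tracking download;
# the column name `total` is an assumption for illustration only.
toy <- data.frame(
  state = c("OR", "OR", "WA", "WA"),
  date  = c(20200324L, 20200325L, 20200324L, 20200325L),
  total = c(100, 1000, 10, 10000)
)

toy <- toy %>%
  mutate(
    # Same date conversion as in the import step: integer yyyymmdd to Date
    Date = as.Date(as.character(date), format = "%Y%m%d"),
    # Base-10 log of the number of tests, so each unit is a tenfold increase
    log10_tests = log10(total)
  )

toy
```

On the log scale, states with tens of tests and states with tens of thousands fit on one axis, which is presumably why the post uses it for both plots.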
The management of the fuzzy front end (FFE) phase of innovation is crucial to the ultimate success of new product and process initiatives. A critical challenge that teams face at this stage is dealing with equivocality – the extent to which project participants grapple with multiple, and plausibly conflicting, meanings and interpretations of the information available to them (Daft and Lengel, 1986; Weick, 1979). While a certain level of equivocality is initially beneficial for enhancing team creativity and preventing early closure, at some point it must be resolved in order for an idea to become a viable New Product Development (NPD) project. This study employs a social networks perspective to understand how different types of informal work-based relations and their structural properties affect equivocality on project teams in the FFE. In particular, it examines the structural effects of two types of social relations and their associated networks – technical-advice and friendship ties. The findings suggest that while high density in a project's technical-advice network is likely to reduce equivocality, high density in a project's friendship network is likely to increase it. More interestingly, having multiple members on projects who are highly central in the lab technical-advice network tends to increase equivocality unless it is balanced with members who occupy positions of high centrality in the lab friendship network. In addition to contributing to the scholarship on NPD, FFE, and social networks, the results offer managerial insights for deploying social networks in order to assemble NPD teams and structure the flows of communication on projects so as to resolve equivocality in the FFE.
To measure the effect of veterans’ preference on U.S. federal workforce quality, researchers have assessed whether military veterans advance in their federal careers at a different rate than nonveterans. This research, however, has produced mixed results. In research concerning recent employee cohorts, nonveterans outpace veterans’ advancement, implying that veterans’ preference lessens employee quality. In older cohorts, veterans and nonveterans advance comparably. The latter research, however, controls for employees’ entry positions, whereas research concerning recent cohorts does not do so, thus inhibiting direct comparison of results. To facilitate such comparisons, we controlled for veterans’ and nonveterans’ entry positions in a study of career advancement among all white-collar, U.S. executive branch workers entering employment from 1992 to 2013. In these recent cohorts, we find roughly equivalent rates of career advancement among veterans and nonveterans when controlling for entry positions. This finding holds when using grade or pay increases as measures of advancement.
We examine models that relax proportionality in cumulative ordered regression models. Something fundamental arising from ordered variables and stochastic ordering implies a partitioning. Efforts to relax proportionality also relax the ability to collapse an inherently multidimensional problem to a partitioning of the (unidimensional) real line. It is surprising and unfortunate to find that deviations from proportionality are sufficient to generate internal contradictions; undecidable propositions must exist by relaxing proportional odds without other relevant and significant changes in the underlying model. We prove a single theorem linking continuous support and partitions of a latent space to show that for these two characteristics to be simultaneously satisfied, the model must be the proportional-odds model. Conditioning on the adjacency that is closely related to the partitioning is fruitful, but at this point we join the class of continuation-ratio models. Alternatively, Anderson’s (1984) stereotype model is quite general and nests ordered and unordered choice models, but again we have left the domain of cumulative models. Adopting multidimensional cumulative models or imposing covariate-specific thresholds are the only certain methods for avoiding these troubles in the cumulative framework. It is generically impossible to generalize the cumulative class of ordered regression models in ways consistent with the spirit of generalized cumulative regression models. Monte Carlo studies also demonstrate the general principles.
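The tension described in this abstract can be illustrated with the standard textbook form of the cumulative model. The notation below (link $F$, thresholds $\tau_j$, coefficients $\beta$) is an assumption for illustration and is not taken from the paper itself.

```latex
% Proportional-odds (cumulative) model with J ordered categories:
\[
  \Pr(Y \le j \mid x) = F(\tau_j - x'\beta),
  \qquad j = 1, \dots, J-1,
  \qquad \tau_1 < \tau_2 < \dots < \tau_{J-1}.
\]
% Relaxing proportionality lets the slope vary by category:
\[
  \Pr(Y \le j \mid x) = F(\tau_j - x'\beta_j).
\]
% Cumulative probabilities must be nondecreasing in j, which requires
\[
  \tau_j - x'\beta_j \;\le\; \tau_{j+1} - x'\beta_{j+1}
  \quad\Longleftrightarrow\quad
  x'(\beta_{j+1} - \beta_j) \;\le\; \tau_{j+1} - \tau_j
  \qquad \text{for all } x .
\]
% If \beta_{j+1} \ne \beta_j and x ranges over an unbounded set,
% taking x = c(\beta_{j+1} - \beta_j) with c large violates the
% inequality, so some category receives negative probability.
```

Under these assumptions, the ordering constraint holds for every $x$ only when $\beta_{j+1} = \beta_j$, which is one way to see why continuous support and a partition of the latent line point back to the proportional-odds model.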