My Google Scholar page | A PDF of my CV

Papers & Posters

Machine Learning and Causal Inference: A Modular Approach to Assessing the Effects of the London Bombings of 2005 (with Jake Bowers, Ben B. Hansen, Costas Panagopoulos)

The design of a randomized study guarantees not only clear and “interpretable comparisons” (Kinder and Palfrey, 1993, page 7) but valid statistical tests even in the absence of large samples or known data generating processes for outcomes (Fisher, 1935, Chap 2). Yet, while design alone yields valid tests, those tests could lack power: a valid but wide confidence interval may be more useful than a misleadingly narrow confidence interval, but still shed little light on the theory motivating the study. After a brief demonstration of Fisher’s statistical framework, we show a method by which a researcher may use substantive background knowledge about outcomes in order to increase the power of her statistical tests. Combining substance and design in this particular way enables valid and powerful tests. We combine modern methods of machine learning with Fisher’s conceptual framework and the survey-sampling-based, design-based statistical inference originating with Neyman in order to maximize power without compromising the integrity of the resulting statistical inference. We apply our ideas in the context of a natural experiment created by the London subway bombings of 2005.

A more powerful test statistic for reasoning about interference between units (with Jake Bowers and Peter M. Aronow).

Bowers, Fredrickson and Panagopoulos (2013) showed that one could use Fisher’s randomization-based hypothesis testing framework to assess counterfactual causal models of treatment propagation and spillover across social networks. This research note improves the statistical inference presented in Bowers, Fredrickson and Panagopoulos (2013) (henceforth BFP) by substituting a test statistic based on a sum of squared residuals and incorporating information about the fixed network for the simple Kolmogorov-Smirnov test statistic (Hollander, 1999, § 5.4) they used. This note incrementally improves the application of BFP’s “reasoning about interference” approach. We do not offer general results about test statistics for multi-parameter causal models on social networks here, but instead hope to stimulate further, and deeper, work on test statistics and sharp hypothesis testing.

Reasoning about Interference Between Units (with Jake Bowers and Costas Panagopoulos). Political Analysis (2013) 21(1): 97 – 124. (Winner of the 2014 Miller Prize for best work appearing in Political Analysis in the preceding year.)

If an experimental treatment is experienced by both treated and control group units, tests of hypotheses about causal effects may be difficult to conceptualize let alone execute. In this paper, we show how counterfactual causal models may be written and tested when theories suggest spillover or other network-based interference among experimental units. We show that the “no interference” assumption need not constrain scholars who have interesting questions about interference. We offer researchers the ability to model theories about how treatment given to some units may come to influence outcomes for other units. We further show how to test hypotheses about these causal effects, and we provide tools to enable researchers to assess the operating characteristics of their tests given their own models, designs, test statistics, and data. The conceptual and methodological framework we develop here is particularly applicable to social networks, but may be usefully deployed whenever a researcher wonders about interference between units. Interference between units need not be an untestable assumption; instead, interference is an opportunity to ask meaningful questions about theoretically interesting phenomena.
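The idea of writing down and testing a counterfactual model with spillover can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the model or code from the paper: the additive model (a direct effect `tau` for treated units plus `gamma` per treated neighbor for control units), the function names, and the use of a two-sample Kolmogorov-Smirnov statistic with complete randomization are all choices made here for illustration.

```python
import random


def ks_statistic(a, b):
    """Largest absolute gap between the empirical CDFs of two samples."""
    points = sorted(set(a) | set(b))

    def ecdf(sample, x):
        return sum(v <= x for v in sample) / len(sample)

    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in points)


def treated_neighbors(adjacency, z):
    """Number of treated neighbors for each unit, given assignment z."""
    return [sum(z[j] for j in adjacency[i]) for i in range(len(z))]


def remove_effects(y, z, adjacency, tau, gamma):
    """Recover outcomes with no treatment anywhere, under the hypothesis.

    Hypothetical model (an assumption of this sketch):
        y_i = y0_i + tau * z_i + gamma * (treated neighbors of i) * (1 - z_i)
    """
    exposure = treated_neighbors(adjacency, z)
    return [yi - tau * zi - gamma * ei * (1 - zi)
            for yi, zi, ei in zip(y, z, exposure)]


def interference_p_value(y, z, adjacency, tau, gamma, draws=1000, seed=0):
    """Monte Carlo p-value for the sharp hypothesis (tau, gamma).

    If the hypothesis is true, the adjusted outcomes are fixed and unrelated
    to the assignment, so the observed statistic should look like a typical
    draw from its re-randomization distribution.
    """
    rng = random.Random(seed)
    y0 = remove_effects(y, z, adjacency, tau, gamma)

    def stat(assign):
        treated = [v for v, zi in zip(y0, assign) if zi == 1]
        control = [v for v, zi in zip(y0, assign) if zi == 0]
        return ks_statistic(treated, control)

    observed = stat(z)
    hits = 0
    for _ in range(draws):
        z_sim = list(z)
        rng.shuffle(z_sim)  # complete randomization: same number treated
        if stat(z_sim) >= observed:
            hits += 1
    return hits / draws
```

Calling `interference_p_value` over a grid of `(tau, gamma)` pairs gives a rough picture of which hypotheses the data do not reject. Repeating the exercise on simulated data with known effects is one way to assess the operating characteristics of a chosen test statistic.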

RItools: Flexible randomization inference in R (PDF)

Demonstrates the use of the RItools package for R to create and test hypotheses using randomization inference. Users can describe the randomization procedure, their model of effects, and test statistics. The function RItest provides confidence intervals and p-values. The paper provides background on randomization inference and applied examples using RItools.
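The three ingredients named above (a randomization procedure, a model of effects, and a test statistic) can be illustrated with a short, generic Python sketch. It does not reproduce the RItools interface: every function name below is hypothetical, and the constant additive effect, complete randomization, and difference-in-means statistic are assumptions chosen to keep the example small.

```python
import random
from statistics import mean


def diff_in_means(y, z):
    """Test statistic: mean outcome of treated minus mean outcome of control."""
    treated = [yi for yi, zi in zip(y, z) if zi == 1]
    control = [yi for yi, zi in zip(y, z) if zi == 0]
    return mean(treated) - mean(control)


def complete_randomization(z, rng):
    """Randomization procedure: reshuffle labels, keeping the number treated."""
    shuffled = list(z)
    rng.shuffle(shuffled)
    return shuffled


def constant_additive_effect(y, z, tau):
    """Model of effects: subtract a hypothesized effect tau from treated units."""
    return [yi - tau * zi for yi, zi in zip(y, z)]


def randomization_p_value(y, z, tau=0.0, draws=2000, seed=0):
    """Two-sided Monte Carlo p-value for the sharp hypothesis of effect tau."""
    rng = random.Random(seed)
    y0 = constant_additive_effect(y, z, tau)
    observed = abs(diff_in_means(y0, z))
    hits = 0
    for _ in range(draws):
        z_sim = complete_randomization(z, rng)
        if abs(diff_in_means(y0, z_sim)) >= observed:
            hits += 1
    return hits / draws


def confidence_set(y, z, grid, alpha=0.05, draws=2000, seed=0):
    """Test inversion: keep every tau on the grid that is not rejected."""
    return [tau for tau in grid
            if randomization_p_value(y, z, tau, draws, seed) > alpha]
```

For example, `confidence_set(y, z, grid=[t / 2 for t in range(0, 41)])` returns the hypothesized effects between 0 and 20 that are not rejected at the 5% level. The resolution of the resulting interval is limited by the grid and by the number of Monte Carlo draws.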

Returning to the Cradle of Democracy: Citizen responses under election and sortition. (PDF)

The hallmark of modern democracies is the competitive election. This institution is seen as the primary connection between leaders and the population. This has not always been the case. Sortition, the random selection of leaders from the population, served as the primary institution of democracy in ancient Athens. How would citizens in a modern democracy react to the use of sortition to select leaders? This study employs a survey experiment in which subjects read about a local development grant, overseen by either an elected or randomly selected committee. I find that sortition encourages more citizens to seek leadership positions, though other forms of participation remain unchanged. I also find that despite a stated preference for election, subjects see the two committees as equally capable and responsible, even when confronted with corrupt acts and closed door meetings.

Using Field Experiments to Understand Information as an Antidote to Corruption (with Matthew S. Winters and Paul Testa) in Danila Serra, Leonard Wantchekon (eds.) New Advances in Experimental Research on Corruption (Research in Experimental Economics, Volume 15) Emerald Group Publishing Limited, pp. 213 - 246

In observational data, access to information is associated with lower levels of corruption. This chapter reviews a small but growing body of work that uses field experiments to explore the mechanisms behind this relationship. We present a typology for understanding this research based on the type of corruption being addressed (political vs. bureaucratic), the mechanism for accountability (retrospective vs. prospective), and the nature of the information provided (factual vs. prescriptive). We describe some of the tradeoffs involved in design decisions for such experiments and suggest directions for future research.

Collaboration for Social Scientists (with Paul F. Testa and Nils B. Weidmann) The Political Methodologist (2011) 18(2).

In this article, we consider how to improve two different modes of collaboration: synchronous and asynchronous. When working synchronously, contributors focus on the same portions of the research at the same time. Of course, virtually any research project will require collaborators to spend time working either on different portions of the project or on the same sections but at different times. We label this form of collaboration asynchronous. Asynchronous collaboration requires more careful attention to dividing labor, and we spend more time providing software solutions in this domain. These suggestions are grounded in what has worked for us, and we think they are useful techniques for any team to adopt. We have also found that software is the easy part of any collaboration while the personal and intellectual parts of collaboration are both more difficult and more fulfilling than playing with software tools. Hopefully, adopting some of these techniques may help your team get past technical details faster and down to the real business of producing research.