This article describes the R package rdrobust, which provides data-driven graphical and inference procedures for RD designs. The package includes three main functions: rdrobust, rdbwselect and rdplot. The first function (rdrobust) implements conventional local-polynomial RD treatment effect point estimators and confidence intervals, as well as robust bias-corrected confidence intervals, for average treatment effects at the cutoff. This function covers sharp RD, sharp kink RD, fuzzy RD and fuzzy kink RD designs, among other possibilities. The second function (rdbwselect) implements several bandwidth selectors proposed in the RD literature. The third function (rdplot) provides data-driven optimal choices of evenly-spaced and quantile-spaced partition sizes, which are used to implement several data-driven RD plots.

The regression-discontinuity (RD) design is a widely employed quasi-experimental research design in social, behavioral and related sciences; for reviews see Imbens and Lemieux (2008) and Lee and Lemieux (2010). In this design, units are assigned to treatment based on whether their value of an observed covariate is above or below a known cutoff, and the probability of receiving treatment conditional on this covariate jumps discontinuously at the cutoff. This jump induces "variation" in treatment assignment that may be regarded, under appropriate assumptions, as being unrelated to potential confounders. Thus, inference in RD designs is typically conducted using only observations near the cutoff or threshold, where the discontinuous change in the probability of treatment assignment occurs. Because of this local nature, RD average treatment effect estimators are usually constructed using local-polynomial nonparametric regression, and statistical inference is based on large-sample approximations.
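To make the local-polynomial idea concrete, the following Python sketch computes a conventional sharp RD point estimate: a local-linear fit with triangular kernel weights on each side of the cutoff, with the estimate taken as the difference between the two fitted intercepts. It is only an illustration under stated assumptions and not the implementation used by rdrobust. The function names, the simulated data, and the fixed bandwidth h = 0.5 are all hypothetical; the sketch performs no bias correction and produces no confidence interval.

```python
def local_linear_intercept(xs, ys, cutoff, h, side):
    """Kernel-weighted linear fit on one side of the cutoff; returns the
    fitted value at the cutoff. Illustrative helper, not part of rdrobust."""
    s_w = s_wu = s_wy = s_wuu = s_wuy = 0.0
    for x, y in zip(xs, ys):
        u = x - cutoff  # running variable centered at the cutoff
        on_side = u >= 0 if side == "right" else u < 0
        if not on_side or abs(u) >= h:
            continue  # only observations within the bandwidth are used
        w = 1.0 - abs(u) / h  # triangular kernel weight
        s_w += w
        s_wu += w * u
        s_wy += w * y
        s_wuu += w * u * u
        s_wuy += w * u * y
    denom = s_w * s_wuu - s_wu ** 2
    if denom <= 0.0:
        raise ValueError("need at least two distinct points inside the bandwidth")
    slope = (s_w * s_wuy - s_wu * s_wy) / denom
    return (s_wy - slope * s_wu) / s_w  # weighted least-squares intercept


def sharp_rd_estimate(xs, ys, cutoff=0.0, h=0.5):
    """Difference of the right and left local-linear intercepts at the cutoff."""
    right = local_linear_intercept(xs, ys, cutoff, h, "right")
    left = local_linear_intercept(xs, ys, cutoff, h, "left")
    return right - left


# Noise-free simulated data (an assumption for illustration only):
# a linear regression function with a jump of 3 at the cutoff x = 0.
xs = [i / 100 for i in range(-100, 101)]
ys = [1.0 + 2.0 * x + (3.0 if x >= 0 else 0.0) for x in xs]

estimate = sharp_rd_estimate(xs, ys, cutoff=0.0, h=0.5)
print(estimate)  # close to the true jump of 3, up to floating-point error
```

In practice the bandwidth would not be fixed by hand as it is here; choosing it in a data-driven way is the role of the bandwidth selectors discussed in this article.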
This article gives an introduction to the R package rdrobust (Calonico et al., 2015b), which offers an array of data-driven local-polynomial and partitioning-based inference procedures for RD designs. We introduce three main functions implementing several data-driven nonparametric point and confidence interval estimators, bandwidth selectors, and plotting procedures useful for RD empirical applications:

rdrobust(). This function implements the bias-corrected robust (to "large" bandwidth choices) inference procedure proposed by Calonico, Cattaneo, and Titiunik (2014a, CCT hereafter), as well as many other RD inference procedures employing local-polynomial regression. The function rdrobust() offers bias-corrected robust confidence intervals for average treatment effects at the cutoff for sharp RD, sharp kink RD, fuzzy RD and fuzzy kink RD designs.

rdbwselect(). This function implements several data-driven bandwidth selectors for RD designs based on the recent work of Imbens and Kalyanaraman (2012, IK hereafter) and CCT. Although this command may be used as a stand-alone bandwidth selector in RD applications, its main purpose is to provide fully data-driven bandwidth choices to be used by our main function rdrobust().

rdplot(). This function implements several data-driven optimal choices of evenly-spaced and quantile-spaced bins, which are useful to produce RD plots that either approximate the regression function by local sample averages or represent the overall variability of the data in a disciplined way. These optimal choices are based on an integrated mean squared error expansion of appropriately constructed partitioning estimators, as discussed in Calonico, Cattaneo, and Titiunik (in press); see also Cattaneo and Farrell (2013) for related results. The binned sample means and the chosen partition sizes are then used to construct, in a fully automatic way, the RD plots commonly found in RD applications.
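The binned sample means behind an RD plot can be sketched in a few lines of Python. The example below splits the data at the cutoff, partitions each side into evenly-spaced bins, and returns the bin midpoints together with the local sample averages of the outcome. It is a minimal sketch under stated assumptions, not the rdplot() implementation: the function names and toy data are hypothetical, and the number of bins on each side is supplied by hand, whereas the package selects it in a data-driven way.

```python
def binned_means(xs, ys, lo, hi, n_bins):
    """Sample mean of y within each of n_bins evenly-spaced bins on [lo, hi].
    Returns (bin midpoint, mean) pairs, skipping empty bins.
    Illustrative helper, not part of rdrobust."""
    width = (hi - lo) / n_bins
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for x, y in zip(xs, ys):
        # A point exactly at the upper edge is placed in the last bin.
        j = min(int((x - lo) / width), n_bins - 1)
        sums[j] += y
        counts[j] += 1
    return [
        (lo + (j + 0.5) * width, sums[j] / counts[j])
        for j in range(n_bins)
        if counts[j] > 0
    ]


def rd_plot_points(xs, ys, cutoff, n_bins_left, n_bins_right):
    """Evenly-spaced binned means computed separately on each side of the
    cutoff. The bin counts are hand-picked here (an assumption)."""
    left = [(x, y) for x, y in zip(xs, ys) if x < cutoff]
    right = [(x, y) for x, y in zip(xs, ys) if x >= cutoff]
    lx, ly = [p[0] for p in left], [p[1] for p in left]
    rx, ry = [p[0] for p in right], [p[1] for p in right]
    return (
        binned_means(lx, ly, min(lx), cutoff, n_bins_left),
        binned_means(rx, ry, cutoff, max(rx), n_bins_right),
    )


# Toy data, made up for illustration only.
xs = [-0.9, -0.6, -0.4, -0.1, 0.1, 0.4, 0.6, 0.9]
ys = [1.0, 2.0, 3.0, 4.0, 10.0, 11.0, 12.0, 13.0]

left_points, right_points = rd_plot_points(xs, ys, cutoff=0.0,
                                           n_bins_left=2, n_bins_right=2)
left_means = [m for _, m in left_points]
right_means = [m for _, m in right_points]
print(left_means, right_means)
```

Plotting the returned midpoints against the bin means, one series per side, gives the familiar RD plot. A quantile-spaced variant would replace the equal-width bin edges with sample quantiles of the running variable on each side.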
We first provide a brief review of all the methods implemented in rdrobust, and then discuss an empirical illustration using some of the features of our functions rdrobust(), rdbwselect() and rdplot(). A full description of the capabilities of the package rdrobust is available in its manual and help files. A companion Stata (StataCorp., 2013) package is described in Calonico, Cattaneo, and Titiunik (2014b).
[1] Matias D. Cattaneo, et al. Optimal Data-Driven Regression Discontinuity Plots, 2015.
[2] Max H. Farrell, et al. Optimal convergence rates, Bahadur representation, and asymptotic normality of partitioning estimators, 2013.
[3] M. Farrell, et al. On the Effect of Bias Estimation on Coverage Accuracy in Nonparametric Estimation, 2013.
[4] Sebastian Calonico, et al. Robust Nonparametric Confidence Intervals for Regression-Discontinuity Designs, 2014.
[5] J. Hahn, et al. Identification and Estimation of Treatment Effects with a Regression-Discontinuity Design, 2001.
[6] Martyn Plummer, et al. The R Journal, 2012.
[7] Matias D. Cattaneo, et al. Randomization Inference in the Regression Discontinuity Design: An Application to Party Advantages in the U.S. Senate, 2015.
[8] G. Imbens, et al. Large Sample Properties of Matching Estimators for Average Treatment Effects, 2004.
[9] M. F., et al. Bibliography, Experimental Gerontology, 1985.
[10] David Card, et al. Inference on Causal Effects in a Generalized Regression Kink Design, 2015.
[11] David S. Lee. Randomized experiments from non-random selection in U.S. House elections, 2005.
[12] David S. Lee, et al. Regression Discontinuity Designs in Economics, 2009.
[13] Jack Porter, et al. Estimation in the Regression Discontinuity Model, 2003.
[14] James J. Heckman, et al. Econometric Evaluation of Social Programs, Part I: Causal Models, Structural Models and Econometric Policy Evaluation, 2007.
[15] Matias D. Cattaneo, et al. Robust Data-Driven Inference in the Regression-Discontinuity Design, 2014.
[16] Yingying Dong, et al. Jumpy or Kinky? Regression Discontinuity without the Discontinuity, 2010.
[17] Jianqing Fan, et al. Local polynomial modelling and its applications, 1994.