class: center middle main-title section-title-7

# Regression discontinuity I

.class-info[

**Session 10**

.light[PMAP 8521: Program evaluation<br>
Andrew Young School of Policy Studies
]

]

---
name: outline
class: title title-inv-8

# Plan for today

--

.box-2.medium.sp-after-half[Arbitrary cutoffs and causal inference]

--

.box-4.medium.sp-after-half[Drawing lines and measuring gaps]

--

.box-6.medium.sp-after-half[Main RDD concerns]

---
name: arbitrary-cutoffs
class: center middle section-title section-title-2 animated fadeIn

# Arbitrary cutoffs<br>and causal inference

---
layout: true
class: title title-2

---

# Quasi-experiments again

--

.box-inv-2.sp-after[Instead of using carefully adjusted DAGs,<br>we can use *context* to isolate/identify the pathway between<br>treatment and outcome in observational data]

--

.box-inv-2[Diff-in-diff was one kind of quasi-experiment]

.box-2.sp-after[Treatment/control + before/after]

--

.box-inv-2[Regression discontinuity designs (RDD) are another]

.box-2[Arbitrary rules determine access to programs]

---

# Rules to access programs

.box-inv-2.medium[Lots of policies and programs are<br>based on arbitrary rules and thresholds]

--

.box-2[If you're above the threshold, you're in the program;<br>if you're below, you're not (or vice versa)]

---

# Key terms

--

.box-inv-2.medium[Running / forcing variable]

.box-2.sp-after[Index or measure that determines eligibility]

--

.box-inv-2.medium[Cutoff / cutpoint / threshold]

.box-2[Number that formally assigns access to program]

---
layout: false

<img src="10-slides_files/figure-html/rd-dag-1.png" width="100%" style="display: block; margin: auto;" />

---
layout: true
class: title title-2

---

# Discontinuities everywhere!

.pull-left-wide.small[
<table>
 <thead>
  <tr> <th style="text-align:center;"> Size </th> <th style="text-align:center;"> Annual </th> <th style="text-align:center;"> Monthly </th> <th style="text-align:center;"> 138% </th> <th style="text-align:center;"> 150% </th> <th style="text-align:center;"> 200% </th> </tr>
 </thead>
<tbody>
  <tr> <td style="text-align:center;"> 1 </td> <td style="text-align:center;"> $12,760 </td> <td style="text-align:center;"> $1,063 </td> <td style="text-align:center;"> $17,609 </td> <td style="text-align:center;"> $19,140 </td> <td style="text-align:center;"> $25,520 </td> </tr>
  <tr> <td style="text-align:center;"> 2 </td> <td style="text-align:center;"> $17,240 </td> <td style="text-align:center;"> $1,437 </td> <td style="text-align:center;"> $23,791 </td> <td style="text-align:center;"> $25,860 </td> <td style="text-align:center;"> $34,480 </td> </tr>
  <tr> <td style="text-align:center;"> 3 </td> <td style="text-align:center;"> $21,720 </td> <td style="text-align:center;"> $1,810 </td> <td style="text-align:center;"> $29,974 </td> <td style="text-align:center;"> $32,580 </td> <td style="text-align:center;"> $43,440 </td> </tr>
  <tr> <td style="text-align:center;"> 4 </td> <td style="text-align:center;"> $26,200 </td> <td style="text-align:center;"> $2,183 </td> <td style="text-align:center;"> $36,156 </td> <td style="text-align:center;"> $39,300 </td> <td style="text-align:center;"> $52,400 </td> </tr>
  <tr> <td style="text-align:center;"> 5 </td> <td style="text-align:center;"> $30,680 </td> <td style="text-align:center;"> $2,557 </td> <td style="text-align:center;"> $42,338 </td> <td style="text-align:center;"> $46,020 </td> <td style="text-align:center;"> $61,360 </td> </tr>
  <tr> <td style="text-align:center;"> 6 </td> <td style="text-align:center;"> $35,160 </td> <td style="text-align:center;"> $2,930 </td> <td style="text-align:center;"> $48,521 </td> <td style="text-align:center;"> $52,740 </td> <td style="text-align:center;"> $70,320
</td> </tr>
  <tr> <td style="text-align:center;"> 7 </td> <td style="text-align:center;"> $39,640 </td> <td style="text-align:center;"> $3,303 </td> <td style="text-align:center;"> $54,703 </td> <td style="text-align:center;"> $59,460 </td> <td style="text-align:center;"> $79,280 </td> </tr>
  <tr> <td style="text-align:center;"> 8 </td> <td style="text-align:center;"> $44,120 </td> <td style="text-align:center;"> $3,677 </td> <td style="text-align:center;"> $60,886 </td> <td style="text-align:center;"> $66,180 </td> <td style="text-align:center;"> $88,240 </td> </tr>
</tbody>
</table>
]

.pull-right-narrow[
.box-inv-2.smaller[**Medicaid**<br>138%*]

.box-inv-2.smaller[**ACA subsidies**<br>138–400%*]

.box-inv-2.smaller[**CHIP**<br>200%]

.box-inv-2.smaller[**SNAP/Free lunch**<br>130%]

.box-inv-2.smaller[**Reduced lunch**<br>130–185%]
]

---

# Hypothetical tutoring program

--

.box-inv-2.medium[Students take an entrance exam]

--

.box-inv-2.medium[Those who score 70 or lower<br>get a free tutor for the year]

--

.box-inv-2.medium[Students then take an exit exam<br>at the end of the year]

---
layout: false

<img src="10-slides_files/figure-html/tutoring-running-1.png" width="100%" style="display: block; margin: auto;" />

---
class: title title-2

# Causal inference intuition

.box-inv-2.medium[The people right before and right after the threshold are essentially the same]

---

<img src="10-slides_files/figure-html/tutoring-running-threshold-1.png" width="100%" style="display: block; margin: auto;" />

---

<img src="10-slides_files/figure-html/tutoring-running-threshold-zoomed-1.png" width="100%" style="display: block; margin: auto;" />

---
class: title title-2

# Causal inference intuition

--

.box-inv-2.medium[The people right before and right after the threshold are essentially the same]

--

.box-2.medium[Pseudo treatment and control groups!]

--

.box-inv-2.medium[Compare outcomes for those<br>right before/after, calculate difference]

---

<img src="10-slides_files/figure-html/tutoring-outcome-1.png" width="100%" style="display: block; margin: auto;" />

---

<img src="10-slides_files/figure-html/tutoring-outcome-lines-1.png" width="100%" style="display: block; margin: auto;" />

---

<img src="10-slides_files/figure-html/tutoring-outcome-delta-1.png" width="100%" style="display: block; margin: auto;" />

---

<img src="10-slides_files/figure-html/tutoring-outcome-delta-zoomed-1.png" width="100%" style="display: block; margin: auto;" />

---
layout: true
class: title title-2

---

# Geographic discontinuities

.center[
<figure>
  <img src="img/10/timezones-1.png" alt="Holbein time zones" title="Holbein time zones" width="100%">
</figure>
]

---

# Geographic discontinuities

.pull-left-wide[
<figure>
  <img src="img/10/timezones-2.png" alt="Holbein time zones" title="Holbein time zones" width="100%">
</figure>
]

.pull-right-narrow[
.box-inv-2[Lower turnout in counties on the eastern side of the boundary]

.box-inv-2[Election schedules cause fluctuations in turnout]
]

---

# Time discontinuities

.pull-left-wide[
<figure>
  <img src="img/10/hospitals-1.png" alt="Hospital stays title" title="Hospital stays title" width="90%">
</figure>
]

.pull-right-narrow[
.box-inv-2[California requires that insurance cover two days of post-partum hospitalization]

.box-inv-2[Does extra time in the hospital improve health outcomes?]
]

---

# Time discontinuities

.center[
<figure>
  <img src="img/10/hospitals-2.png" alt="Hospital stays duration" title="Hospital stays duration" width="100%">
</figure>
]

.box-inv-2[Delivering at 12:01 AM makes you stay longer in the hospital…]

---

# Time discontinuities

.pull-left-wide[
<figure>
  <img src="img/10/hospitals-3.png" alt="Hospital stays outcomes" title="Hospital stays outcomes" width="65%">
</figure>
]

.pull-right-narrow[
.box-inv-2[…but delivering at 12:01 AM has no effect on readmission rates or mortality rates]
]

---

# Test score discontinuities

.pull-left-wide[
<figure>
  <img src="img/10/flagship-1.png" alt="Flagship universities" title="Flagship universities" width="100%">
</figure>
]

.pull-right-narrow[
.box-inv-2[Does going to the main state university (e.g. UGA) make you earn more money?]

.box-inv-2[SAT scores are an arbitrary cutoff for accessing the university]
]

---

# Test score discontinuities

.pull-left[
<figure>
  <img src="img/10/flagship-2.png" alt="Flagship cutoff" title="Flagship cutoff" width="100%">
</figure>

.box-inv-2[Cutoff seems rule-based]
]

--

.pull-right[
<figure>
  <img src="img/10/flagship-3.png" alt="Flagship outcome" title="Flagship outcome" width="100%">
</figure>

.box-inv-2[Earnings are slightly higher]
]

---

# RDDs are all the rage

.box-inv-2.medium[People love these things!]

--

.box-2[They're intuitive, compelling, and highly graphical]

--

.pull-left[
<figure>
  <img src="img/10/rdd-p-hacking.png" alt="RDD p-hacking" title="RDD p-hacking" width="80%">
</figure>
]

.pull-right[
.box-2[RDD less susceptible to p-hacking and selective publication than DID or IV]
]

---
layout: false
name: lines-gaps
class: center middle section-title section-title-4 animated fadeIn

# Drawing lines<br>and measuring gaps

---
class: title title-4

# Main goal of RD

--

.box-inv-4.medium[Measure the gap in outcome for<br>people on both sides of the cutpoint]

--

.box-inv-4.medium[Gap = **δ** =<br>local average treatment effect (LATE)]

---

---
layout: true
class: title title-4

---

# Drawing lines

--

.box-inv-4.medium[The size of the gap depends on how<br>you draw the lines on each side of the cutoff]

--

.box-inv-4.medium.sp-after[The type of lines you choose can<br>change the estimate of δ, sometimes by a lot!]

--

.box-4.medium[There's no one right way to draw lines!]

---

# Line-drawing considerations

--

.box-inv-4.medium[Parametric vs.
nonparametric lines]

--

.box-inv-4.medium[Measuring the gap]

--

.box-inv-4.medium[Bandwidths]

--

.box-inv-4.medium[Kernels]

---

# Parametric lines

.box-inv-4.medium[Formulas with *parameters*]

--

.medium[
`$$y = mx + b$$`

`$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$$`
]

---
layout: false

.small[
`$$y = 10 + 4x$$`
]

<img src="10-slides_files/figure-html/params-plot-linear-1.png" width="100%" style="display: block; margin: auto;" />

---
class: title title-4

# Parametric lines

.box-inv-4.medium[Not just for straight lines!<br>Make curvy with exponents or trigonometry]

--

.medium[
`$$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3$$`

`$$y = \beta_0 + \beta_1 x + \beta_2 \sin(x)$$`
]

---

.small[
`$$y = 120 - 3x + 0.07x^2$$`
]

<img src="10-slides_files/figure-html/params-plot-square-1.png" width="100%" style="display: block; margin: auto;" />

---

.small[
`$$y = 300 - 25x + 0.65x^2 - 0.004x^3$$`
]

<img src="10-slides_files/figure-html/params-plot-poly-1.png" width="100%" style="display: block; margin: auto;" />

---

.small[
`$$y = 10 + 4x + 50 \times \sin (\frac{x}{4})$$`
]

<img src="10-slides_files/figure-html/params-plot-trig-1.png" width="100%" style="display: block; margin: auto;" />

---
class: title title-4

# Parametric lines

--

.box-inv-4.medium.sp-after[It's important to get the parameters right!]

--

.box-inv-4.medium[Line should fit the data pretty well]

---

<img src="10-slides_files/figure-html/params-plot-linear-squared-1.png" width="100%" style="display: block; margin: auto;" />

---

<img src="10-slides_files/figure-html/params-plot-linear-poly-1.png" width="100%" style="display: block; margin: auto;" />

---
class: title title-4

# Nonparametric lines

--

.box-inv-4.medium[Lines without parameters]

--

.box-4[Use the data to find the best line,<br>often with windows and moving averages]

--

.box-4[<span style="color: #F6D645;">Lo</span>cally <span style="color: #F6D645;">e</span>stimated/<span style="color: #F6D645;">we</span>ighted <span style="color: #F6D645;">s</span>catterplot <span style="color: #F6D645;">s</span>moothing (LOESS/LOWESS)<br>is a common method (but not the only one!)]

---

.small[
`$$y = \text{who knows?}$$`
]

<img src="10-slides_files/figure-html/params-plot-loess-1.png" width="100%" style="display: block; margin: auto;" />

---

.center[
<video controls>
  <source src="img/10/loess_window.mp4" type="video/mp4">
</video>
]

---

<img src="10-slides_files/figure-html/params-plot-loess-lines-1.png" width="100%" style="display: block; margin: auto;" />

---
layout: true
class: title title-4

---

# Measuring gap with parametric lines

.center[
<figure>
  <img src="10-slides_files/figure-html/tutoring-outcome-lines-1.png" alt="Parametric gap" title="Parametric gap" width="85%">
</figure>
]

---

# Measuring gap with parametric lines

.box-inv-4[Easiest way: center the running variable around the threshold]

.small[
<table>
 <thead>
  <tr> <th style="text-align:center;"> id </th> <th style="text-align:center;"> exit_exam </th> <th style="text-align:center;"> entrance_exam </th> <th style="text-align:center;"> entrance_centered </th> <th style="text-align:center;"> tutoring </th> </tr>
 </thead>
<tbody>
  <tr> <td style="text-align:center;"> 1 </td> <td style="text-align:center;"> 78 </td> <td style="text-align:center;"> 92 </td> <td style="text-align:center;"> 22
</td> <td style="text-align:center;"> FALSE </td> </tr>
  <tr> <td style="text-align:center;"> 2 </td> <td style="text-align:center;"> 58 </td> <td style="text-align:center;"> 73 </td> <td style="text-align:center;"> 3 </td> <td style="text-align:center;"> FALSE </td> </tr>
  <tr> <td style="text-align:center;"> 3 </td> <td style="text-align:center;"> 62 </td> <td style="text-align:center;"> 54 </td> <td style="text-align:center;"> -16 </td> <td style="text-align:center;"> TRUE </td> </tr>
  <tr> <td style="text-align:center;"> 4 </td> <td style="text-align:center;"> 67 </td> <td style="text-align:center;"> 98 </td> <td style="text-align:center;"> 28 </td> <td style="text-align:center;"> FALSE </td> </tr>
  <tr> <td style="text-align:center;"> 5 </td> <td style="text-align:center;"> 54 </td> <td style="text-align:center;"> 70 </td> <td style="text-align:center;"> 0 </td> <td style="text-align:center;"> TRUE </td> </tr>
</tbody>
</table>
]

.small[
`$$y = \beta_0 + \beta_1 \text{Running variable (centered)} + \beta_2 \text{Indicator for treatment}$$`
]

---

# Measuring gap with parametric lines

.center[
<figure>
  <img src="10-slides_files/figure-html/tutoring-outcome-lines-1.png" alt="Parametric gap" title="Parametric gap" width="35%">
</figure>
]

.left-code[
```r
library(dplyr)

# Center the running variable at the cutoff
program_data <- tutoring %>% 
  mutate(entrance_centered = entrance_exam - 70)

# The tutoringTRUE coefficient is the gap at the cutoff
model1 <- lm(exit_exam ~ entrance_centered + tutoring,
             data = program_data)
```
]

.right-code[
```r
tidy(model1)  # tidy() comes from the broom package
```

```
## # A tibble: 3 × 3
##   term              estimate std.error
##   <chr>                <dbl>     <dbl>
## 1 (Intercept)         59.3      0.440 
## 2 entrance_centered    0.514    0.0268
## 3 tutoringTRUE        11.0      0.802 
```
]

---

# Measuring gap with nonparametric lines

.center[
<img src="10-slides_files/figure-html/tutoring-outcome-loess-1.png" width="80%" style="display: block; margin: auto;" />
]

.box-inv-4[Can't use regression; use `rdrobust` R package]

---

# Measuring gap with nonparametric lines

.center[
<figure>
  <img
src="10-slides_files/figure-html/tutoring-outcome-loess-1.png" alt="Nonparametric gap" title="Nonparametric gap" width="40%">
</figure>
]

.small-code[
```r
library(rdrobust)

# summary() prints the table of estimates
rdrobust(y = tutoring$exit_exam, x = tutoring$entrance_exam, c = 70) %>% 
  summary()
```
]

.small-code[
```
## =============================================================================
##         Method     Coef. Std. Err.         z     P>|z|      [ 95% C.I. ]       
## =============================================================================
##   Conventional    -9.992     1.708    -5.852     0.000   [-13.339 , -6.646]    
##         Robust         -         -    -4.992     0.000   [-14.244 , -6.212]    
## =============================================================================
```
]

---

# Bandwidths

--

.box-inv-4.medium[All you really care about is the<br>area right around the cutoff]

.box-4.sp-after[Observations far away don't matter<br>because they're not comparable]

--

.box-inv-4.medium[Bandwidth = window around cutoff]

---
layout: false

<img src="10-slides_files/figure-html/bandwidth-plots-1.png" width="100%" style="display: block; margin: auto;" />

---
layout: true
class: title title-4

---

# Bandwidths

--

.box-inv-4.medium.sp-after[Algorithms exist to choose optimal width]

--

.box-inv-4.medium[Also use common sense]

.box-4.sp-after[Maybe ±5 for the entrance exam?]

--

.box-inv-4.medium[For robustness, check what happens<br>if you double and halve the bandwidth]

---

# Kernels

--

.box-inv-4.medium[Because we care the most about<br>observations right by the cutoff,<br>give more distant ones less weight]

--

.box-inv-4.medium[Kernel = method for assigning importance to<br>observations based on distance to the cutoff]

---
layout: false

<img src="10-slides_files/figure-html/kernel-examples-1.png" width="100%" style="display: block; margin: auto;" />

---

<img src="10-slides_files/figure-html/kernel-weighted-points-1.png" width="100%" style="display: block; margin: auto;" />

---
class: title title-4

# Try everything!

--

.box-inv-4.medium[Your estimate of δ depends on all these:]

--

.box-inv-4[Line type (parametric vs. nonparametric)]

.center[
.float-left[.box-inv-4[Bandwidth (wide vs. narrow)] .box-inv-4[Kernel weighting]]
]

--

.box-4.medium[Try lots of different combinations!]

---

<img src="10-slides_files/figure-html/params-plot-parametric-effects-1.png" width="100%" style="display: block; margin: auto;" />

---

<img src="10-slides_files/figure-html/params-plot-bw-effects-1.png" width="100%" style="display: block; margin: auto;" />

---
layout: false
name: main-concerns
class: center middle section-title section-title-6 animated fadeIn

# Main RDD concerns

---
layout: true
class: title title-6

---

# It's greedy!

.box-inv-6.medium[You need *lots* of data,<br>since you're throwing most of it away]

.center[
<figure>
  <img src="10-slides_files/figure-html/bandwidth-plots-1.png" alt="Different bandwidths" title="Different bandwidths" width="60%">
</figure>
]

---

# It's limited in scope!

.box-inv-6.medium[You're only measuring the ATE<br>for people in the bandwidth]

--

.box-6.medium[Local Average Treatment Effect (LATE)]

---

# It's limited in scope!

.box-inv-6.medium[You can't make population-level<br>claims with a LATE]

--

.box-inv-6.smaller[*(But can you really do that with RCTs or diff-in-diff?)*]

--

.box-6.medium["The realistic conclusion to draw is that<br>all quantitative empirical results<br>that we encounter are 'local'"]

.box-6.small[Angrist and Pischke, *Mostly Harmless Econometrics*, pp. 23–24]

---

# Graphics are neat!

<img src="10-slides_files/figure-html/too-graphical-plot-1-1.png" width="100%" style="display: block; margin: auto;" />

---

# Which gaps are significant?

<img src="10-slides_files/figure-html/too-graphical-plot-2-1.png" width="100%" style="display: block; margin: auto;" />

---

# All of them!

<img src="10-slides_files/figure-html/too-graphical-plot-3-1.png" width="100%" style="display: block; margin: auto;" />

---

# Don't rely *only* on graphics

.pull-left[
.box-inv-6.medium[Super clear breaks are uncommon]

.box-inv-6.medium[Make graphs,<br>but also find the<br>actual δ value]
]

.pull-right[
<img src="10-slides_files/figure-html/too-graphical-plot-3-single-1.png" width="100%" style="display: block; margin: auto;" />
]

---

# Manipulation!

--

.box-inv-6.medium[People might know about the cutoff<br>and change their behavior]

--

.box-6[People might fudge numbers or work to<br>cross the threshold to get in/out of program]

--

.box-6[If so, those right next to the cutoff are<br>no longer comparable treatment/control groups]

---
layout: false
class: bg-full
background-image: url("img/10/marathons.png")

???

Data from <> and <>

---

.center[
<figure>
  <img src="img/10/basketball.png" alt="NBA shot locations, 2014-15" title="NBA shot locations, 2014-15" width="60%">
</figure>
]

???

<>

---
layout: true
class: title title-6

---

# Manipulation!

.box-inv-6.medium[Check with a McCrary density test]

.box-6.small[`rddensity::rdplotdensity()` in R]

<img src="10-slides_files/figure-html/manipulation-1.png" width="100%" style="display: block; margin: auto;" />

---

# Noncompliance!

--

.box-inv-6.medium[People on the margin of the cutoff<br>might end up in/out of the program]

--

.box-6.sp-after[The ACA, subsidies, Medicaid, and 138% of the poverty line]

--

.box-inv-6.medium[Sharp vs.
fuzzy discontinuities]

---

# Sharp discontinuity

.box-inv-6[Perfect compliance]

<img src="10-slides_files/figure-html/tutoring-sharp-1.png" width="100%" style="display: block; margin: auto;" />

---

# Fuzzy discontinuity

.box-inv-6[Imperfect compliance]

<img src="10-slides_files/figure-html/tutoring-fuzzy-1.png" width="100%" style="display: block; margin: auto;" />

---

# Fuzzy discontinuities

.box-inv-6.medium[Address noncompliance with<br>instrumental variables<br>(more on this later!)]

--

.box-6.sp-after[Use an instrument for which side<br>of the cutoff people should be on]

--

.box-inv-6[Effect is only for compliers near the cutoff<br>(complier LATE; doubly local effect)]
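
---

# Fuzzy discontinuities

.box-inv-6[A rough sketch of the instrument idea, using simulated data]

.small[Everything here is an assumption for illustration: the simulated data, the ±10 bandwidth, the 80%/10% take-up rates, the true effect of 10 points, and names like `below_cutoff` and `complier_late`. None of it comes from the tutoring data in earlier slides. With a single instrument, dividing the jump in the outcome by the jump in take-up gives the same point estimate as 2SLS, but it does not give correct standard errors.]

```r
# Hypothetical fuzzy RD sketch: simulated data, made-up names and numbers
set.seed(1234)
n <- 5000

entrance_exam <- runif(n, min = 40, max = 100)
entrance_centered <- entrance_exam - 70

# Instrument: which side of the cutoff people *should* be on
below_cutoff <- entrance_exam <= 70

# Imperfect compliance (assumed): 80% take-up below the cutoff, 10% above
tutoring <- rbinom(n, size = 1, prob = ifelse(below_cutoff, 0.8, 0.1))

# Assumed true effect of tutoring: 10 points on the exit exam
exit_exam <- 60 + 0.5 * entrance_centered + 10 * tutoring + rnorm(n, sd = 5)

fuzzy_data <- data.frame(exit_exam, entrance_centered, below_cutoff, tutoring)

# Keep only observations within an assumed bandwidth of ±10
bw_data <- subset(fuzzy_data, abs(entrance_centered) <= 10)

# First stage: jump in the probability of tutoring at the cutoff
first_stage <- lm(tutoring ~ entrance_centered + below_cutoff, data = bw_data)

# Reduced form: jump in the outcome at the cutoff
reduced_form <- lm(exit_exam ~ entrance_centered + below_cutoff, data = bw_data)

jump_takeup <- unname(coef(first_stage)["below_cutoffTRUE"])
jump_outcome <- unname(coef(reduced_form)["below_cutoffTRUE"])

# Complier LATE near the cutoff = outcome jump / take-up jump
complier_late <- jump_outcome / jump_takeup
complier_late
```

.small[In real work, a dedicated IV or RD routine (for instance the `fuzzy` argument of `rdrobust()`) would likely be the better choice, since it also handles the standard errors.]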