
Photo by KATRIN BOLOVTSOVA
As discussed in the previous post, RCTs might not be practical in some occasions.
Hence, this is where other techniques can be used to study the causal effect.
In this post, we will be looking at how does difference-in-differences work.
(DS4PS) This method allows us to analyze the effect by taking into consideration:
how the group mean changes before and after a policy intervention and
compare this change with the mean over time of a similar group which did not undergo the treatment (i.e., control group)
The formula of difference-in-differences can be written as follows: \[Y = \beta_0 + \beta_1 * treatment + \beta_2 * post + \beta_3 * treatment * post + \epsilon\]
Where Y is the outcome variable
The key is to check whether the interaction term is statistically significant. If the term is significant, it would indicate that the before and after treatment effect is indeed not zero.
This method is rather easy to understand.
Also, we need not to care about the effect of confounding variables while using this method.
To use such a method, we need to check the following assumptions:
We can check this by using visual inspection.
This can also be checked by running a difference-in-difference analysis on the earlier period.
The interaction term should not be statistically significant, otherwise it would indicate that the parallel trend assumption is being violated.
One issue to take note of is when the groups receive treatment as this could distort the estimate.
To resolve this issue, we could use Goodman-Bacon decomposition to study, which I won’t be exploring this method in this analysis.
Below are some best practices when using difference-in-difference approach (Fernandez 2024):
Be sure outcome trends did not influence the allocation of the treatment/intervention
Acquire more data points before and after to test parallel trend assumption
Use linear probability model to help with interpretability
Be sure to examine composition of population in treatment/intervention and control groups before and after intervention
Use robust standard errors to account for autocorrelation between pre/post in same individual
Perform sub-analysis to see if intervention had similar/different effect on components of the outcome
In this post, I will be using the following packages to perform the analysis.
pacman::p_load(tidyverse, haven, broom)
I will be using the dataset shared by Professor Andrew Heiss on his teaching website. I have also referenced the materials and problem set he posted on the course website.
Refer to this link for the dataset.
eitc <- read_stata("data/eitc.dta") %>%
mutate(children_cat = case_when(
children == 0 ~ "0",
children == 1 ~ "1",
children >= 2 ~ "2+"
))
This dataset comes from US Census’s Current Population Survey. The objective of the analysis is to find out whether the number of employment for women increased after the implementation of Earned Income Tax Credit in 1993.
Below are the definitions of the columns in the dataset:
state: The woman’s state of residence. The numbers are Census/CPS
state numbers: http://unionstats.gsu.edu/State_Code.htmyear: The tax yearurate: The unemployment rate in the woman’s state of residencechildren: The number of children the woman hasnonwhite: Binary variable indicating if the woman is not white (1
= Hispanic/Black)finc: The woman’s family income in 1997 dollarsearn: The woman’s personal income in 1997 dollarsage: The woman’s ageed: The number of years of education the woman hasunearn: The woman’s family income minus her personal income, in
thousands of 1997 dollarseitc <-
eitc %>%
mutate(ind_any_kids = if_else(children_cat == "0", "no", "yes")
,ind_after_1993 = if_else(year <= 1993, "no", "yes"))
eitc %>%
ggplot(aes(children_cat, work)) +
stat_summary(geom = "pointrange", fun.data = "mean_se", fun.args = list(mult = 1.96))

First, I will calculate the average employment by year and whether the women have any kids.
eitc_by_year_kids <-
eitc %>%
group_by(year, ind_any_kids) %>%
summarise(count_total = n()
,work_total = sum(work)) %>%
ungroup() %>%
mutate(work_avg = work_total/count_total)
Then, I will visualize the result by using ggplot function.
ggplot(eitc_by_year_kids, aes(year, work_avg, color = ind_any_kids)) +
geom_line() +
theme_minimal() +
geom_vline(xintercept = 1994) +
annotate("text",x = 1992.9,y = 0.45,label = "<---- Before Treatment") +
annotate("text",x = 1995,y = 0.45,label = "After Treatment ---->") +
xlab("Year") +
ylab("Average Employment") +
labs(title = "Average Employment by Year")

It seems like the pre-treatment trends between the two groups appear to be the same.
This is important as this would allow us to isolate the effect of the tax credit in the later step.
Another method to check the parallel trend pattern is to build a model based on earlier period data.
If the parallel trend assumption holds, then the interaction term would not be statistically significant.
With that, I will build a model for a period between 1991 and 1993.
eitc_fake_treatment <- eitc %>%
filter(year < 1994) %>%
mutate(after_1991 = year >= 1992
,ind_year_1991 = if_else(year <= 1991, "yes", "no"))
model_fit_fake_treat <-
lm(work ~ ind_any_kids + ind_year_1991 + ind_any_kids * ind_year_1991
,data = eitc_fake_treatment)
tidy(model_fit_fake_treat)
# A tibble: 4 x 5
term estimate std.error stati~1 p.value
<chr> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) 0.571 0.0110 52.1 0
2 ind_any_kidsyes -0.133 0.0145 -9.19 5.05e-20
3 ind_year_1991yes 0.0117 0.0185 0.631 5.28e- 1
4 ind_any_kidsyes:ind_year_1991yes 0.0101 0.0244 0.415 6.78e- 1
# ... with abbreviated variable name 1: statistic
As shown in the result above, the interaction term is not statistical significant, suggesting that the parallel term assumption is appropriate.
Nevertheless, let’s start by estimating the causal effect.
Next, I will compute the average employment under each group.
eitc_manul_calc <-
eitc %>%
group_by(ind_any_kids, ind_after_1993) %>%
summarise(work_avg = mean(work))
eitc_manul_calc
# A tibble: 4 x 3
# Groups: ind_any_kids [2]
ind_any_kids ind_after_1993 work_avg
<chr> <chr> <dbl>
1 no no 0.575
2 no yes 0.573
3 yes no 0.446
4 yes yes 0.491
Then, I will estimate the causal effect.
eitc_manul_calc[1,3] - eitc_manul_calc[3,3] - (eitc_manul_calc[2,3] - eitc_manul_calc[4,3])
work_avg
1 0.04687313
Instead of calculating manually, we could also fit a regression line to compute the causal effect.
model_fit_simple <-
lm(work ~ ind_after_1993 + ind_any_kids + ind_after_1993 * ind_any_kids
,data = eitc)
tidy(model_fit_simple)
# A tibble: 4 x 5
term estimate std.er~1 stati~2 p.value
<chr> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) 0.575 0.00885 65.1 0
2 ind_after_1993yes -0.00207 0.0129 -0.160 8.73e- 1
3 ind_any_kidsyes -0.129 0.0117 -11.1 1.84e-28
4 ind_after_1993yes:ind_any_kidsyes 0.0469 0.0172 2.73 6.31e- 3
# ... with abbreviated variable names 1: std.error, 2: statistic
Ta-da! Now we have computed the causal effect.
Note that the estimated causal effect is the same as the computed figure in the earlier step.
There is statistical evidence that the tax credits increases the average employment for women with kids.
We could take one step further to include other variables in the formula so that we could control their respective effects.
model_fit_complex <-
lm(work ~ ind_after_1993 + ind_any_kids + ind_after_1993 * ind_any_kids + unearn + children + nonwhite + poly(age, 2) + poly(ed, 2)
,data = eitc)
tidy(model_fit_complex)
# A tibble: 11 x 5
term estimate std.e~1 stati~2 p.value
<chr> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) 0.699 1.00e-2 69.6 0
2 ind_after_1993yes -0.00868 1.23e-2 -0.704 4.82e- 1
3 ind_any_kidsyes -0.0210 1.47e-2 -1.43 1.52e- 1
4 unearn -0.0178 5.76e-4 -30.8 5.26e-202
5 children -0.0518 4.52e-3 -11.5 2.93e- 30
6 nonwhite -0.0628 8.48e-3 -7.40 1.40e- 13
7 poly(age, 2)1 3.22 5.07e-1 6.36 2.13e- 10
8 poly(age, 2)2 -4.11 4.90e-1 -8.39 5.23e- 17
9 poly(ed, 2)1 4.58 4.85e-1 9.43 4.74e- 21
10 poly(ed, 2)2 1.56 4.79e-1 3.26 1.11e- 3
11 ind_after_1993yes:ind_any_kidsy~ 0.0581 1.64e-2 3.55 3.81e- 4
# ... with abbreviated variable names 1: std.error, 2: statistic
After controlling for other effects, the treatment effect is about 0.058 now. The causal effect of tax credits remains statistically significant.
As shown in the data, the subject could have more than 1 kid.
Therefore, we could split the “having kids indicator” into two so that we could estimate the causal effect of having one kid and more than one kid separately.
eitc <-
eitc %>%
mutate(ind_one_kid = if_else(children_cat == "1", "yes", "no")
,ind_two_plus_kids = if_else(children_cat == "2+", "yes", "no"))
model_fit_two_simple <-
lm(work ~ ind_after_1993 + ind_one_kid + ind_two_plus_kids + ind_after_1993 * ind_one_kid + ind_after_1993 * ind_two_plus_kids
,data = eitc)
tidy(model_fit_two_simple)
# A tibble: 6 x 5
term estimate std.e~1 stati~2 p.value
<chr> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) 0.575 0.00881 65.3 0
2 ind_after_1993yes -0.00207 0.0129 -0.161 8.72e- 1
3 ind_one_kidyes -0.0519 0.0150 -3.45 5.56e- 4
4 ind_two_plus_kidsyes -0.179 0.0131 -13.6 4.08e-42
5 ind_after_1993yes:ind_one_kidyes 0.0326 0.0221 1.48 1.40e- 1
6 ind_after_1993yes:ind_two_plus_ki~ 0.0553 0.0193 2.86 4.19e- 3
# ... with abbreviated variable names 1: std.error, 2: statistic
It seems like the treatment effect is greater for women with two or more kids.
Also, the causal effect is only significant for the women with more than 1 kid.
That’s all for the day!
Thanks for reading the post until the end.
Feel free to contact me through email or LinkedIn if you have any suggestions on future topics to share.
Refer to this link for the blog disclaimer.
Till next time, happy learning!

Photo by Karolina Grabowska