Estimating Interactions with Post-double Selection

In this vignette, we demonstrate how to use the inters package to conduct post-double selection for interactions with linear models. We use the remittances data of Escribà-Folch, Meseguer, and Wright (2018) to illustrate the method, as shown in Blackwell and Olson (2021). The goal of this study was to evaluate how remittances affect political protest differently in democracies and non-democracies.

To begin, we load the data and run two alternative models. The first is a simple single-interaction model that includes the treatment (remittances, remit), the moderator (a binary variable for autocracy, dict), an interaction between these two, and a series of control variables. We use the feols function from the fixest package to handle country and period fixed effects along with clustering at the country level.

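library(inters)  # provides the remit data and post_ds_interaction()
library(fixest)  # provides feols() and coeftable()

data(remit)
# Single-interaction model with period and country fixed effects
single <- feols(Protest ~ remit*dict + l1gdp + l1pop + l1nbr5 + l12gr + l1migr
               + elec3 | period + cowcode, data = remit)
coeftable(single, cluster = ~ caseid)[c("remit", "remit:dict"),]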

Next, we compare this single-interaction model to a model that fully interacts the moderator with the entire set of controls, including the fixed effects. Blackwell and Olson (2021) call this the fully moderated model.

fully <- feols(Protest ~ dict * (remit + l1gdp + l1pop + l1nbr5 + l12gr + l1migr +
                                   elec3 + factor(period) + factor(cowcode)),
               data = remit)

coeftable(fully, cluster = ~ caseid)[c("remit", "dict:remit"),]
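
To build intuition for what the fully moderated model does, the following minimal sketch uses simulated data (not the remittances data) and base R only. All variable names here (x, z, d, y) are hypothetical. It illustrates a general property of least squares: interacting a binary moderator with every term in the model yields the same point estimates as fitting the model separately within each moderator group.

```r
# Simulated data: x is the treatment, z a control, d a binary moderator.
# These names and coefficient values are illustrative assumptions only.
set.seed(42)
n <- 400
d <- rbinom(n, 1, 0.5)
x <- rnorm(n)
z <- rnorm(n)
y <- 1 + 0.5 * x + 0.3 * x * d + z - 2 * z * d + rnorm(n)
dat <- data.frame(y, x, z, d)

# Fully moderated model: the moderator interacts with every regressor
fully <- lm(y ~ d * (x + z), data = dat)

# Separate regressions within each moderator group
fit0 <- lm(y ~ x + z, data = dat, subset = d == 0)
fit1 <- lm(y ~ x + z, data = dat, subset = d == 1)

# Effect of x when d = 0: the main effect in the fully moderated model
c(fully = unname(coef(fully)["x"]), split = unname(coef(fit0)["x"]))

# Effect of x when d = 1: main effect plus the interaction
c(fully = unname(coef(fully)["x"] + coef(fully)["d:x"]),
  split = unname(coef(fit1)["x"]))
```

The equivalence holds for point estimates; standard errors can differ between the pooled and split fits depending on how the variance is estimated.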

Finally, we compare both of these approaches to the post-double-selection estimator of Belloni, Chernozhukov, and Hansen (2014). This estimator uses the lasso to select variables that are important for predicting the outcome, the treatment, or the treatment-moderator interaction, and then runs a standard least squares regression on the variables selected in any of those lasso steps. The post_ds_interaction function implements this procedure and takes character strings naming the relevant variables. It can also handle clustered data, which changes how the penalty parameter is calculated in the lasso steps.

controls <- c("l1gdp", "l1pop", "l1nbr5", "l12gr", "l1migr", "elec3")
post_ds_out <- post_ds_interaction(data = remit, treat = "remit", moderator = "dict",
                                outcome = "Protest", control_vars = controls,
                                panel_vars = c("cowcode", "period"),
                                cluster = "caseid")
lmtest::coeftest(post_ds_out, vcov = post_ds_out$clustervcv)[c("remit", "remit_dict"),]
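
The logic of the procedure can be sketched by hand on simulated data. This is a simplified illustration, not the implementation in post_ds_interaction: it assumes the glmnet package is installed, chooses the lasso penalty by cross-validation rather than by the theory-driven rule that the package adjusts for clustering, and ignores fixed effects and clustering entirely. The helper lasso_select and all variable names are hypothetical.

```r
library(glmnet)

# Simulated data (illustrative assumption): 20 candidate controls,
# a binary moderator, and a continuous treatment.
set.seed(1)
n <- 500
p <- 20
X <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("x", 1:p)))
mod <- rbinom(n, 1, 0.5)
treat <- 0.5 * X[, 1] + rnorm(n)
y <- treat + 0.5 * treat * mod + X[, 1] + X[, 2] * mod + rnorm(n)

# Candidate set: controls plus their interactions with the moderator
XV <- X * mod
colnames(XV) <- paste0(colnames(X), "_mod")
Z <- cbind(X, XV)

# Hypothetical helper: names of candidates with nonzero lasso coefficients
lasso_select <- function(target, Z) {
  cv <- cv.glmnet(Z, target)
  b <- as.matrix(coef(cv, s = "lambda.min"))[-1, 1]  # drop the intercept
  names(b)[b != 0]
}

# One lasso each for the outcome, the treatment, and the interaction;
# keep the union of everything selected in any step
keep <- Reduce(union, list(lasso_select(y, Z),
                           lasso_select(treat, Z),
                           lasso_select(treat * mod, Z)))

# Final step: ordinary least squares on the selected variables
final <- data.frame(y = y, treat = treat, mod = mod,
                    treat_mod = treat * mod, Z[, keep, drop = FALSE])
fit <- lm(y ~ ., data = final)
coef(fit)[c("treat", "treat_mod")]
```

Taking the union of the selected sets is what protects against omitting a control that matters for the treatment or the interaction but only weakly predicts the outcome.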

With these results in hand, we can compare the different methods to see that the fully moderated and post-double-selection approaches both provide similar point estimates, with the post-double-selection estimator having slightly less uncertainty. The single-interaction model, on the other hand, leads to a dramatically different conclusion.

References

Belloni, Alexandre, Victor Chernozhukov, and Christian Hansen. 2014. “Inference on Treatment Effects After Selection Among High-Dimensional Controls.” The Review of Economic Studies 81 (2): 608–50. https://doi.org/10.1093/restud/rdt044.

Blackwell, Matthew, and Michael Olson. 2021. “Reducing Model Misspecification and Bias in the Estimation of Interactions.” Political Analysis. https://doi.org/10.1017/pan.2021.19.

Escribà-Folch, Abel, Covadonga Meseguer, and Joseph Wright. 2018. “Remittances and Protest in Dictatorships.” American Journal of Political Science 62 (4): 889–904. https://doi.org/10.1111/ajps.12382.

2023-01-10