Different Variance-Covariance Estimators
library(capybara)
A very quick verification of Ross (2004) is to obtain the coefficients for the OLS model from table 1:
felm(ltrade ~ bothin + onein + gsp + ldist + lrgdp + lrgdppc + regional +
custrict + comlang + border + landl + island + lareap + comcol +
curcol + colony + comctry | year, data = ross2004)
#> Formula: ltrade ~ bothin + onein + gsp + ldist + lrgdp + lrgdppc + regional +
#> custrict + comlang + border + landl + island + lareap + comcol +
#> curcol + colony + comctry | year
#> <environment: 0x555bd9aaa400>
#>
#> Estimates:
#>
#> | | Estimate | Std. Error | z value | Pr(>|z|) |
#> |----------|----------|------------|-----------|-----------|
#> | bothin | -0.0423 | 0.0159 | -2.6540 | 0.0080 ** |
#> | onein | -0.0583 | 0.0154 | -3.7722 | 0.0002 ** |
#> | gsp | 0.8585 | 0.0111 | 77.0829 | 0.0000 ** |
#> | ldist | -1.1190 | 0.0061 | -183.0863 | 0.0000 ** |
#> | lrgdp | 0.9159 | 0.0026 | 355.9087 | 0.0000 ** |
#> | lrgdppc | 0.3214 | 0.0039 | 83.2804 | 0.0000 ** |
#> | regional | 1.1988 | 0.0360 | 33.2956 | 0.0000 ** |
#> | custrict | 1.1181 | 0.0374 | 29.8774 | 0.0000 ** |
#> | comlang | 0.3125 | 0.0110 | 28.3085 | 0.0000 ** |
#> | border | 0.5257 | 0.0266 | 19.7328 | 0.0000 ** |
#> | landl | -0.2706 | 0.0093 | -29.0310 | 0.0000 ** |
#> | island | 0.0419 | 0.0095 | 4.4335 | 0.0000 ** |
#> | lareap | -0.0967 | 0.0020 | -47.5692 | 0.0000 ** |
#> | comcol | 0.5846 | 0.0162 | 36.1094 | 0.0000 ** |
#> | curcol | 1.0751 | 0.1067 | 10.0763 | 0.0000 ** |
#> | colony | 1.1638 | 0.0312 | 37.3391 | 0.0000 ** |
#> | comctry | -0.0163 | 0.2623 | -0.0622 | 0.9504 |
#>
#> Significance codes: ** p < 0.01; * p < 0.05; + p < 0.10
#>
#> R-squared : 0.648
#> Adj. R-squared: 0.6479
#>
#> Number of observations: Full 234597; Missing 0; Perfect classification 0
This does not involve many fixed effects. However, capybara will be particularly useful to obtain different standard errors for the same functional form, which is the main focus of table 4B in Cameron & Miller (2014).
Table 4B shows the following clustering methods that we can replicate
with the third part of the model formula
(e.g. y ~ x1 + x2 + ... | fe1 + fe2 | cl1 + cl2) and using
the vcov argument to select how the clustering variables
are used:
vcov value |
Description |
|---|---|
"iid" |
Default OLS (i.i.d. errors) |
"hetero" |
Heteroskedastic-robust (HC0) |
"cluster" |
One-way cluster sandwich |
"m-estimator" |
One-way M-estimator sandwich |
"dyadic" |
Dyadic-robust (Cameron & Miller, 2014) |
Capybara provides an update() method to easily modify
the formula for each model, as we need to change the clustering
variables for each model, which avoids error-prone copy-pasting of the
full formula every time.
fml <- ltrade ~ bothin + onein + gsp + ldist + lrgdp + lrgdppc + regional +
custrict + comlang + border + landl + island + lareap + comcol +
curcol + colony + comctry | year
# IID (no cluster part in formula)
fit_iid <- felm(
fml,
data = ross2004, vcov = "iid"
)
# Heteroskedastic-robust (HC0)
fit_hetero <- felm(
fml,
data = ross2004, vcov = "hetero"
)
# One-way: cluster on country-pair (g,h)
fit_pairs <- felm(
update(fml, . ~ . | . | pair),
data = ross2004, vcov = "cluster"
)
# One-way: cluster on country 1 (g)
fit_ctry1 <- felm(
update(fml, . ~ . | . | ctry1),
data = ross2004, vcov = "cluster"
)
# One-way: cluster on country 2 (h)
fit_ctry2 <- felm(
update(fml, . ~ . | . | ctry2),
data = ross2004, vcov = "cluster"
)
# Two-way: cluster on (g) and (h) simultaneously
fit_2way <- felm(
update(fml, . ~ . | . | ctry1 + ctry2),
data = ross2004, vcov = "cluster"
)
# Dyadic-robust: Cameron-Miller (2014) sandwich with cross-dyad correlations
fit_dyadic <- felm(
update(fml, . ~ . | . | ctry1 + ctry2),
data = ross2004, vcov = "dyadic"
)
With the above models, we can replicate Table 4B from Cameron &
Miller (2014) using summary_table(), another convenience
function in capybara to display multiple models side-by-side. This
completely avoids calling summary() on each model and then
conducting post-processing to extract the relevant information for the
table.
summary_table(
fit_iid, fit_hetero, fit_pairs, fit_ctry1, fit_ctry2, fit_2way, fit_dyadic,
model_names = c("IID", "Hetero", "Pairs", "Country 1", "Country 2", "2-Way", "Dyadic")
)
#> | Variable | IID | Hetero | Pairs | Country 1 | Country 2 | 2-Way | Dyadic |
#> |------------------|--------------------|--------------------------|--------------------|--------------------|--------------------|--------------------|--------------------|
#> | bothin | -0.042** | -0.042* | -0.042 | -0.042 | -0.042 | -0.042 | -0.042 |
#> | | (0.016) | (0.018) | (0.053) | (0.122) | (0.107) | (0.153) | (0.186) |
#> | onein | -0.058** | -0.058** | -0.058 | -0.058 | -0.058 | -0.058 | -0.058 |
#> | | (0.015) | (0.018) | (0.049) | (0.095) | (0.071) | (0.108) | (0.126) |
#> | gsp | 0.859** | 0.859** | 0.859** | 0.859** | 0.859** | 0.859** | 0.859** |
#> | | (0.011) | (0.009) | (0.032) | (0.098) | (0.074) | (0.118) | (0.131) |
#> | ldist | -1.119** | -1.119** | -1.119** | -1.119** | -1.119** | -1.119** | -1.119** |
#> | | (0.006) | (0.006) | (0.022) | (0.051) | (0.050) | (0.068) | (0.078) |
#> | lrgdp | 0.916** | 0.916** | 0.916** | 0.916** | 0.916** | 0.916** | 0.916** |
#> | | (0.003) | (0.003) | (0.010) | (0.024) | (0.028) | (0.035) | (0.043) |
#> | lrgdppc | 0.321** | 0.321** | 0.321** | 0.321** | 0.321** | 0.321** | 0.321** |
#> | | (0.004) | (0.004) | (0.014) | (0.033) | (0.038) | (0.048) | (0.052) |
#> | regional | 1.199** | 1.199** | 1.199** | 1.199** | 1.199** | 1.199** | 1.199** |
#> | | (0.036) | (0.029) | (0.106) | (0.222) | (0.185) | (0.269) | (0.333) |
#> | custrict | 1.118** | 1.118** | 1.118** | 1.118** | 1.118** | 1.118** | 1.118** |
#> | | (0.037) | (0.035) | (0.122) | (0.165) | (0.176) | (0.208) | (0.235) |
#> | comlang | 0.313** | 0.313** | 0.313** | 0.313** | 0.313** | 0.313** | 0.313** |
#> | | (0.011) | (0.011) | (0.040) | (0.081) | (0.065) | (0.096) | (0.111) |
#> | border | 0.526** | 0.526** | 0.526** | 0.526** | 0.526** | 0.526** | 0.526* |
#> | | (0.027) | (0.026) | (0.111) | (0.148) | (0.150) | (0.179) | (0.207) |
#> | landl | -0.271** | -0.271** | -0.271** | -0.271** | -0.271** | -0.271** | -0.271* |
#> | | (0.009) | (0.010) | (0.031) | (0.069) | (0.079) | (0.100) | (0.110) |
#> | island | 0.042** | 0.042** | 0.042 | 0.042 | 0.042 | 0.042 | 0.042 |
#> | | (0.009) | (0.009) | (0.036) | (0.105) | (0.083) | (0.129) | (0.154) |
#> | lareap | -0.097** | -0.097** | -0.097** | -0.097** | -0.097** | -0.097** | -0.097* |
#> | | (0.002) | (0.002) | (0.008) | (0.025) | (0.025) | (0.035) | (0.043) |
#> | comcol | 0.585** | 0.585** | 0.585** | 0.585** | 0.585** | 0.585** | 0.585** |
#> | | (0.016) | (0.019) | (0.067) | (0.126) | (0.108) | (0.152) | (0.178) |
#> | curcol | 1.075** | 1.075** | 1.075** | 1.075* | 1.075** | 1.075* | 1.075* |
#> | | (0.107) | (0.070) | (0.235) | (0.462) | (0.256) | (0.473) | (0.480) |
#> | colony | 1.164** | 1.164** | 1.164** | 1.164** | 1.164** | 1.164** | 1.164** |
#> | | (0.031) | (0.023) | (0.117) | (0.193) | (0.115) | (0.191) | (0.209) |
#> | comctry | -0.016 | -0.016 | -0.016 | -0.016 | -0.016 | -0.016 | -0.016 |
#> | | (0.262) | (0.205) | (1.081) | (0.884) | (1.077) | (0.880) | (0.859) |
#> | | | | | | | | |
#> | Fixed effects | | | | | | | |
#> | year | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
#> | | | | | | | | |
#> | N | 234,597 | 234,597 | 234,597 | 234,597 | 234,597 | 234,597 | 234,597 |
#> | R-squared | 0.648 | 0.648 | 0.648 | 0.648 | 0.648 | 0.648 | 0.648 |
#> | SE type | IID | Heteroskedastic-robust | Cluster-robust | Cluster-robust | Cluster-robust | Cluster-robust | Dyadic-robust |
#>
#> Standard errors in parenthesis
#> Significance levels: ** p < 0.01; * p < 0.05; + p < 0.10