Poisson Pseudo-Maximum Likelihood (PPML) Model with Cluster-Robust Standard Errors
Source:vignettes/intro.Rmd
We will estimate a Poisson Pseudo-Maximum Likelihood (PPML) model using the data available in this package, with the aim of replicating the PPML results from Table 3 in Yotov et al. (2016).
This requires including exporter-time and importer-time fixed effects and clustering the standard errors by exporter-importer pair.
The PPML specification corresponds to:
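Written out to match the formula passed to fepoisson() in the model estimation step, the specification would look roughly as follows. The symbols are assumptions chosen for illustration (X for trade flows, i for exporter, j for importer, t for year, pi and chi for the exporter-time and importer-time fixed effects) and are not taken from the source:

```latex
X_{ij,t} = \exp\left[
  \beta_1 \log(DIST_{ij}) + \beta_2 CNTG_{ij} + \beta_3 LANG_{ij}
  + \beta_4 CLNY_{ij} + \beta_5 RTA_{ij,t}
  + \pi_{i,t} + \chi_{j,t}
\right] \times \varepsilon_{ij,t}
```

Here each regressor maps to one column in the model formula: log_dist, cntg, lang, clny and rta, with exp_year and imp_year playing the role of the two sets of fixed effects.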
We use dplyr to obtain the log of the distance. This model excludes domestic flows, so we also need to subset the data with dplyr.
Required packages:
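As a minimal sketch of the two dplyr steps described above, the following runs on a small made-up data frame rather than on trade_panel. The column names exporter, importer and dist, and all the numbers, are assumptions for illustration only; the dataset shipped with the package may already contain log_dist and may already exclude domestic flows.

```r
library(dplyr)
# library(capybara)  # provides fepoisson() and trade_panel; not needed for this toy example

# Hypothetical flows: two countries, including domestic (exporter == importer) rows
toy_flows <- tibble(
  exporter = c("CHL", "CHL", "PER", "PER"),
  importer = c("CHL", "PER", "CHL", "PER"),
  dist     = c(350, 2470, 2470, 420),
  trade    = c(900, 120, 95, 700)
)

toy_prepared <- toy_flows %>%
  mutate(log_dist = log(dist)) %>%   # log of the distance
  filter(exporter != importer)       # drop domestic flows

toy_prepared
```

Only the two international rows remain, each with a new log_dist column.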
We can use the fepoisson() function to obtain the estimated coefficients, and we add the fixed effects as | exp_year + imp_year in the formula.
Model estimation:
# Regressors go before the |, fixed effects after it
fit <- fepoisson(
  trade ~ log_dist + cntg + lang + clny + rta | exp_year + imp_year,
  data = trade_panel
)
summary(fit)
#> Formula: trade ~ log_dist + cntg + lang + clny + rta | exp_year + imp_year
#>
#> Family: Poisson
#>
#> Estimates:
#>
#> | | Estimate | Std. Error | z value | Pr(>|z|) |
#> |----------|----------|------------|------------|------------|
#> | log_dist | -0.8216 | 0.0004 | -2194.0442 | 0.0000 *** |
#> | cntg | 0.4155 | 0.0009 | 476.0613 | 0.0000 *** |
#> | lang | 0.2499 | 0.0008 | 296.8883 | 0.0000 *** |
#> | clny | -0.2054 | 0.0010 | -206.3476 | 0.0000 *** |
#> | rta | 0.1907 | 0.0010 | 191.0963 | 0.0000 *** |
#>
#> Significance codes: *** 99.9%; ** 99%; * 95%; . 90%
#>
#> Pseudo R-squared: 0.5749
#>
#> Number of observations: Full 28152; Missing 0; Perfect classification 0
#>
#> Number of Fisher Scoring iterations: 12
The coefficients are almost identical to those in Table 3 of Yotov et al. (2016), which were obtained with Stata. The small differences are attributable to the different fitting algorithms the two programs use. Capybara uses the demeaning algorithm proposed by Stammann (2018).
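To give an intuition for what demeaning means here, the sketch below removes two sets of fixed effects from a linear model by alternating projections, that is, by repeatedly subtracting group means until nothing changes. This is a simplified, unweighted, linear illustration in base R and not Capybara's implementation; the helper demean_two_way() and the simulated data are hypothetical.

```r
# Hypothetical helper: sweep out two grouping factors by alternating projections
demean_two_way <- function(v, f1, f2, tol = 1e-10, max_iter = 1000L) {
  for (iter in seq_len(max_iter)) {
    v_old <- v
    v <- v - ave(v, f1)  # subtract means within the first factor
    v <- v - ave(v, f2)  # subtract means within the second factor
    if (max(abs(v - v_old)) < tol) break
  }
  v
}

set.seed(1)
n  <- 200
f1 <- factor(sample(letters[1:5], n, replace = TRUE))
f2 <- factor(sample(LETTERS[1:4], n, replace = TRUE))
x  <- rnorm(n)
y  <- 2 * x + as.numeric(f1) + 0.5 * as.numeric(f2) + rnorm(n)

# Slope from the demeaned variables (no dummies needed)
y_dm <- demean_two_way(y, f1, f2)
x_dm <- demean_two_way(x, f1, f2)
b_demeaned <- coef(lm(y_dm ~ x_dm - 1))

# Slope from the equivalent regression with explicit dummy variables
b_dummies <- coef(lm(y ~ x + f1 + f2))["x"]

c(demeaned = unname(b_demeaned), dummies = unname(b_dummies))
```

Both approaches give the same slope up to numerical tolerance, but the demeaned version never builds the dummy-variable matrix, which is what makes the approach attractive when the fixed effects are high-dimensional.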
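# The cluster variable goes after a second | in the formula
fit <- fepoisson(
  trade ~ log_dist + cntg + lang + clny + rta | exp_year + imp_year | pair,
  data = trade_panel
)
# Request cluster-robust standard errors
summary(fit, type = "clustered")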
#> Formula: trade ~ log_dist + cntg + lang + clny + rta | exp_year + imp_year |
#> pair
#>
#> Family: Poisson
#>
#> Estimates:
#>
#> | | Estimate | Std. Error | z value | Pr(>|z|) |
#> |----------|----------|------------|----------|------------|
#> | log_dist | -0.8216 | 0.0258 | -31.8227 | 0.0000 *** |
#> | cntg | 0.4155 | 0.0673 | 6.1778 | 0.0000 *** |
#> | lang | 0.2499 | 0.0623 | 4.0077 | 0.0001 *** |
#> | clny | -0.2054 | 0.0914 | -2.2475 | 0.0246 * |
#> | rta | 0.1907 | 0.0554 | 3.4440 | 0.0006 *** |
#>
#> Significance codes: *** 99.9%; ** 99%; * 95%; . 90%
#>
#> Pseudo R-squared: 0.5749
#>
#> Number of observations: Full 28152; Missing 0; Perfect classification 0
#>
#> Number of Fisher Scoring iterations: 12
The result is similar, and the numerical difference comes from the variance-covariance matrix estimation method. Capybara's clustering algorithm is based on Cameron, Gelbach, and Miller (2011).
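To show why clustering changes the standard errors but not the coefficients, here is a minimal base R sketch of a one-way cluster-robust sandwich estimator for a Poisson model. It uses simulated data and glm(), applies no small-sample correction, and is not Capybara's implementation; all object names are hypothetical.

```r
set.seed(42)
n_clusters      <- 50
obs_per_cluster <- 10
cl <- rep(seq_len(n_clusters), each = obs_per_cluster)

# Simulated counts with a shock shared by all observations in a cluster
x <- rnorm(n_clusters * obs_per_cluster)
u <- rnorm(n_clusters, sd = 0.5)[cl]
y <- rpois(length(x), lambda = exp(0.5 + 0.3 * x + u))

fit_glm <- glm(y ~ x, family = poisson())
X  <- model.matrix(fit_glm)
mu <- fitted(fit_glm)

# "Bread": inverse of X' diag(mu) X, the usual model-based variance
bread <- solve(crossprod(X * mu, X))

# "Meat": outer product of the score contributions summed within each cluster
scores         <- X * (y - mu)
cluster_scores <- rowsum(scores, group = cl)
meat           <- crossprod(cluster_scores)

vcov_cluster <- bread %*% meat %*% bread

rbind(
  model_based = sqrt(diag(vcov(fit_glm))),
  clustered   = sqrt(diag(vcov_cluster))
)
```

The point estimates come from the same fit in both rows; only the variance-covariance matrix, and therefore the standard errors and z values, differ.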
References
Cameron, A Colin, Jonah B Gelbach, and Douglas L Miller. 2011. “Robust Inference with Multiway Clustering.” Journal of Business & Economic Statistics 29 (2): 238–49.
Stammann, Amrei. 2018. “Fast and Feasible Estimation of Generalized Linear Models with High-Dimensional K-Way Fixed Effects.” arXiv. https://doi.org/10.48550/arXiv.1707.01815.
Yotov, Yoto V, Roberta Piermartini, Mario Larch, and others. 2016. An Advanced Guide to Trade Policy Analysis: The Structural Gravity Model. WTO iLibrary.