Poisson Pseudo-Maximum Likelihood (PPML) Model with Cluster-Robust Standard Errors
We estimate a Poisson Pseudo-Maximum Likelihood (PPML) model using the data included in this package, with the aim of replicating the PPML results from Table 3 in Yotov et al. (2016).
This requires including exporter-time and importer-time fixed effects and clustering the standard errors by exporter-importer pair.
The PPML specification is: \[\begin{align} X_{ij,t} =& \:\exp\left[\beta_1 \log(DIST)_{i,j} + \beta_2 BORDER_{i,j} +\right.\\ & \:\left.\beta_3 COMLANG_{i,j} + \beta_4 COLONY_{i,j} + \pi_{i,t} + \chi_{j,t}\right] \times \varepsilon_{ij,t}. \end{align}\] Here \(\pi_{i,t}\) and \(\chi_{j,t}\) denote the exporter-time and importer-time fixed effects.
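A brief sketch may help to see why a Poisson estimator applies to this multiplicative model. It assumes that the error term has a conditional mean of one (an assumption stated here for illustration only), and it introduces its own notation: \(z_{ij,t}\) stacks the regressors and the fixed-effect dummies, and \(\theta\) collects their coefficients. Under that assumption, the conditional mean and the PPML first-order conditions are:
\[\begin{align} E\left[X_{ij,t} \mid z_{ij,t}\right] &= \exp\left(z_{ij,t}'\theta\right), \\ \sum_{i,j,t} \left[X_{ij,t} - \exp\left(z_{ij,t}'\hat{\theta}\right)\right] z_{ij,t} &= 0. \end{align}\]
Only the conditional mean needs to be correctly specified for these conditions to hold, so the trade flows themselves need not follow a Poisson distribution.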
Required packages:
library(capybara)
We can use the fepoisson() function to obtain the estimated coefficients, adding the fixed effects as | exp_year + imp_year in the formula.
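To illustrate what the fixed-effects part of the formula represents, the following minimal sketch fits the equivalent dummy-variable model with base R's glm() on simulated data. It is not capybara's method, which absorbs the fixed effects instead of estimating one coefficient per dummy. All names and values here (panel, exporter, importer, the true slope of -1) are hypothetical.

```r
# Simulated bilateral panel; every name and number below is hypothetical.
set.seed(123)
countries <- LETTERS[1:8]
years <- c(1989, 1994, 1999)
panel <- expand.grid(
  exporter = countries, importer = countries, year = years,
  stringsAsFactors = FALSE
)
panel <- panel[panel$exporter != panel$importer, ]
panel$ldist <- rnorm(nrow(panel))

# Exporter-year and importer-year identifiers, one level per country and year.
panel$exp_year <- paste(panel$exporter, panel$year, sep = "_")
panel$imp_year <- paste(panel$importer, panel$year, sep = "_")

# Draw one "true" effect per identifier level.
exp_fe <- rnorm(length(unique(panel$exp_year)), sd = 0.3)
names(exp_fe) <- unique(panel$exp_year)
imp_fe <- rnorm(length(unique(panel$imp_year)), sd = 0.3)
names(imp_fe) <- unique(panel$imp_year)

# Multiplicative model: exp(linear index) times a positive error term.
index <- -1 * panel$ldist +
  unname(exp_fe[panel$exp_year]) + unname(imp_fe[panel$imp_year])
panel$trade <- exp(index) * exp(rnorm(nrow(panel), sd = 0.05))

# Dummy-variable version of the model; quasipoisson accepts non-integer
# outcomes without warnings and yields the same slopes as poisson.
fit_dummies <- glm(
  trade ~ ldist + factor(exp_year) + factor(imp_year),
  family = quasipoisson(link = "log"),
  data = panel
)
slope <- coef(fit_dummies)[["ldist"]]
print(slope) # close to the simulated slope of -1
```

Some dummy coefficients are reported as NA because the exporter-year and importer-year dummies are collinear within each year; the slope on ldist is unaffected. With hundreds of fixed-effect levels, as in the real data, this dummy-variable approach becomes slow, which is the motivation for a dedicated fixed-effects estimator.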
Model estimation:
ross2004_subset <- ross2004[ross2004$year %in% seq(1989, 1999, 5), ]
ross2004_subset$trade <- exp(ross2004_subset$ltrade)
ross2004_subset$exp_year <- paste0(ross2004_subset$ctry1, ross2004_subset$year)
ross2004_subset$imp_year <- paste0(ross2004_subset$ctry2, ross2004_subset$year)
fit <- fepoisson(
trade ~ ldist + border + comlang + colony | exp_year + imp_year,
data = ross2004_subset
)
summary(fit)
Formula: trade ~ ldist + border + comlang + colony | exp_year + imp_year
Family: Poisson
Estimates:
| | Estimate | Std. Error | z value | Pr(>|z|) |
|---------|----------|------------|-------------|-----------|
| ldist | -0.9800 | 0.0000 | -90771.8020 | 0.0000 ** |
| border | 0.3200 | 0.0000 | 13707.7154 | 0.0000 ** |
| comlang | 0.2852 | 0.0000 | 12315.8981 | 0.0000 ** |
| colony | 0.3958 | 0.0000 | 12508.5396 | 0.0000 ** |
Significance codes: ** p < 0.01; * p < 0.05; + p < 0.10
Pseudo R-squared: 0.9591
Fixed effects:
exp_year: 457
imp_year: 457
Number of observations: Full 21450; Missing 0; Perfect classification 0
Number of Fisher Scoring iterations: 10
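The coefficients are almost identical to those in Table 3 of Yotov et al. (2016), which were obtained with Stata. The difference is attributable to the different fitting algorithms used by each software. Capybara uses the demeaning algorithm proposed by Stammann (2018).
To cluster the standard errors by exporter-importer pair, we add the pair variable as a third part of the formula and request clustered standard errors in summary():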
fit <- fepoisson(
trade ~ ldist + border + comlang + colony | exp_year + imp_year | pair,
data = ross2004_subset
)
summary(fit, type = "clustered")
Formula: trade ~ ldist + border + comlang + colony | exp_year + imp_year |
pair
Family: Poisson
Estimates:
| | Estimate | Std. Error | z value | Pr(>|z|) |
|---------|----------|------------|----------|-----------|
| ldist | -0.9800 | 0.0476 | -20.5747 | 0.0000 ** |
| border | 0.3200 | 0.1077 | 2.9719 | 0.0030 ** |
| comlang | 0.2852 | 0.0881 | 3.2362 | 0.0012 ** |
| colony | 0.3958 | 0.1032 | 3.8344 | 0.0001 ** |
Significance codes: ** p < 0.01; * p < 0.05; + p < 0.10
Pseudo R-squared: 0.9591
Fixed effects:
exp_year: 457
imp_year: 457
Number of observations: Full 21450; Missing 0; Perfect classification 0
Number of Fisher Scoring iterations: 10
The slopes are identical, but the standard errors differ from those in the previous example. Capybara's clustering algorithm is based on Cameron, Gelbach, and Miller (2011).
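The reason clustering changes only the standard errors can be sketched with a hand-computed one-way cluster-robust "sandwich" variance for a Poisson model. This is a minimal illustration on simulated data using base R's glm(), with no finite-sample correction; it is not capybara's implementation, and all names and values (pair_shock, the simulated slope of -0.5) are hypothetical.

```r
# Simulated data with a persistent pair-level shock; all names are hypothetical.
set.seed(42)
n_pairs <- 200
n_years <- 3
pair <- rep(seq_len(n_pairs), each = n_years)
ldist <- rep(rnorm(n_pairs), each = n_years)                # constant within pair
pair_shock <- rep(rnorm(n_pairs, sd = 0.5), each = n_years) # within-pair correlation
trade <- rpois(n_pairs * n_years, exp(1 - 0.5 * ldist + pair_shock))

fit_glm <- glm(trade ~ ldist, family = poisson())
X <- model.matrix(fit_glm)
mu_hat <- fitted(fit_glm)

# "Bread": inverse of X'WX with Poisson weights W = diag(mu_hat).
bread <- solve(crossprod(X * sqrt(mu_hat)))

# "Meat": score contributions summed within each cluster, then cross-multiplied.
scores <- X * (trade - mu_hat)
cluster_scores <- rowsum(scores, group = pair)
meat <- crossprod(cluster_scores)

vcov_cluster <- bread %*% meat %*% bread

se_naive <- sqrt(diag(vcov(fit_glm)))
se_cluster <- sqrt(diag(vcov_cluster))
print(cbind(estimate = coef(fit_glm), se_naive, se_cluster))
```

The point estimates come from the same first-order conditions in both cases, so only the variance estimate changes. In this simulation the clustered standard errors are larger because the pair-level shock makes observations within a pair correlated.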