Asymmetric Poisson Pseudo-Maximum Likelihood (APPML)
library(capybara)
Overview
Standard Poisson Pseudo-Maximum Likelihood (PPML) estimates the
conditional mean of the outcome variable.
fepoisson_asymmetric() extends this by fitting conditional
expectiles β a generalization that lets you examine different parts of
the conditional distribution with the same efficiency gains of
high-dimensional fixed effects.
The implementation follows Bergstrand et al.Β (2025): instead of
minimizing a symmetric loss, the estimator minimizes an
asymmetric weighted loss controlled by a single parameter
expectile (\(\tau \in (0,
1)\)):
- \(\tau = 0.5\) β standard PPML (symmetric, estimates the conditional mean).
- \(\tau < 0.5\) β places more weight on observations with negative residuals, pulling estimates toward the lower part of the distribution.
- \(\tau > 0.5\) β places more weight on observations with positive residuals, shifting estimates toward the upper part of the distribution.
The expectile is set through fit_control()
and defaults to 0.5, so if you donβt specify it,
fepoisson_asymmetric() will behave like standard PPML.
Estimating lower and upper expectiles
Shifting expectile away from 0.5 lets you trace how the
entire conditional distribution of mpg varies with
wt, after absorbing the cyl fixed effects.
ross2004_subset <- ross2004[ross2004$year %in% seq(1989, 1999, 5), ]
ross2004_subset$trade <- exp(ross2004_subset$ltrade)
ross2004_subset$exp_year <- paste0(ross2004_subset$ctry1, ross2004_subset$year)
ross2004_subset$imp_year <- paste0(ross2004_subset$ctry2, ross2004_subset$year)
# 10th expectile β sensitive to small values of mpg
fit10 <- fepoisson_asymmetric(
trade ~ ldist + border + comlang + colony | exp_year + imp_year,
data = ross2004_subset,
control = fit_control(expectile = 0.1)
)
summary(fit10)
Formula: trade ~ ldist + border + comlang + colony | exp_year + imp_year
Family: Poisson
Estimates:
| | Estimate | Std. Error | z value | Pr(>|z|) |
|---------|----------|------------|-------------|-----------|
| ldist | -1.0916 | 0.0000 | -39358.0438 | 0.0000 ** |
| border | 0.3419 | 0.0001 | 5729.9222 | 0.0000 ** |
| comlang | 0.3352 | 0.0001 | 5700.9311 | 0.0000 ** |
| colony | 0.3942 | 0.0001 | 5416.8894 | 0.0000 ** |
Significance codes: ** p < 0.01; * p < 0.05; + p < 0.10
Fixed effects:
exp_year: 457
imp_year: 457
Number of observations: Full 21450; Missing 0; Perfect classification 0
Number of Fisher Scoring iterations: 17
# 90th expectile β sensitive to large values of mpg
fit90 <- fepoisson_asymmetric(
trade ~ ldist + border + comlang + colony | exp_year + imp_year,
data = ross2004_subset,
control = fit_control(expectile = 0.9)
)
summary(fit90)
Formula: trade ~ ldist + border + comlang + colony | exp_year + imp_year
Family: Poisson
Estimates:
| | Estimate | Std. Error | z value | Pr(>|z|) |
|---------|----------|------------|-------------|-----------|
| ldist | -0.9096 | 0.0000 | -45409.0443 | 0.0000 ** |
| border | 0.3160 | 0.0000 | 7382.8698 | 0.0000 ** |
| comlang | 0.1897 | 0.0000 | 4789.5546 | 0.0000 ** |
| colony | 0.4657 | 0.0001 | 7910.0068 | 0.0000 ** |
Significance codes: ** p < 0.01; * p < 0.05; + p < 0.10
Fixed effects:
exp_year: 457
imp_year: 457
Number of observations: Full 21450; Missing 0; Perfect classification 0
Number of Fisher Scoring iterations: 18
Comparing coefficients across expectiles
A natural use-case is comparing how the effect of a regressor changes
across the distribution. The table below collects the coefficient on
wt at the three expectile levels:
summary_table(
fit10, fit90,
model_names = c("10th expectile", "90th expectile")
)
| Variable | 10th expectile | 90th expectile |
|------------------|--------------------|--------------------|
| ldist | -1.092** | -0.910** |
| | (0.000) | (0.000) |
| border | 0.342** | 0.316** |
| | (0.000) | (0.000) |
| comlang | 0.335** | 0.190** |
| | (0.000) | (0.000) |
| colony | 0.394** | 0.466** |
| | (0.000) | (0.000) |
| | | |
| Fixed effects | | |
| exp_year | Yes | Yes |
| imp_year | Yes | Yes |
| | | |
| N | 21,450 | 21,450 |
Standard errors in parenthesis
Significance levels: ** p < 0.01; * p < 0.05; + p < 0.10
The log-distance coefficient shrinks in magnitude from
fit10 to fit90, which indicates that trade
much less with each other at the top of the distribution compared to the
bottom of the distribution.
Convergence diagnostics
The outer APPML loop updates observation weights until the coefficient vector stops changing. You can inspect convergence through the returned list elements:
cat("Outer loop converged:", fit10$conv_outer, "\n")
cat("Outer iterations: ", fit10$iter_outer, "\n")
cat("Expectile used: ", fit10$expectile, "\n")
Outer loop converged: TRUE
Outer iterations: 4
Expectile used: 0.1
References
Bergstrand, Jeffrey H., Matthew W. Clance, and JMC Santos Silva. βThe tails of gravity: Using expectiles to quantify the trade-margins effects of economic integration agreements.β Journal of International Economics (2025): 104145.