2 Partial equilibrium trade policy analysis with structural gravity
2.1 Traditional Gravity Estimates
2.1.1 Preparing the data
If the reader has never used R before, please check chapters 1 to 25 from Wickham and Grolemund (2016).
If the reader has only fitted a few regressions in R, with little practice transforming and cleaning data beforehand, please check chapters 5 and 18 from Wickham and Grolemund (2016).
Please see the note on page 42 in Yotov et al. (2016). It is an important note, which tells us that we need to follow four steps, in this order:
- Filter observations for a range of years (1986, 1990, 1994, 1998, 2002 and 2006)
- Transform some variables to logarithm scale (trade and dist) and create new variables from those in the original dataset
- Remove cases where both the exporter and the importer are the same
- Drop observations where the trade flow is zero
Unlike Yotov et al. (2016), here we shall use a single dataset for all the applications and subset its columns depending on what we need. This decision keeps the tradepolicy R package as light as possible.
Before conducting any data filtering or regression, we need to load the required packages.
# dataset and summary functions
library(tradepolicy)
# data transformation
library(dplyr)
library(tidyr)
# regression
library(fixest)
Step 1, including subsetting columns for this application, is straightforward.
ch1_application1 <- agtpa_applications %>%
  select(exporter, importer, pair_id, year, trade, dist, cntg, lang, clny) %>%
  filter(year %in% seq(1986, 2006, 4))
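As a quick check of the year filter, the sketch below confirms that seq(1986, 2006, 4) yields the six years listed in step 1. The vector toy_years is hypothetical and only illustrates how %in% keeps the matching years.

```r
# Years kept by the filter: every fourth year from 1986 to 2006
keep_years <- seq(1986, 2006, 4)

# Hypothetical toy vector of years (not taken from agtpa_applications)
toy_years <- 1986:1992

# %in% is TRUE only for the years present in keep_years
kept <- toy_years[toy_years %in% keep_years]

print(keep_years)
print(kept)
```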
Step 2 can be divided into parts, starting with the log transformation of trade and distance.
ch1_application1 <- ch1_application1 %>%
  mutate(
    log_trade = log(trade),
    log_dist = log(dist)
  )
Continuing step 2, we can now create the variables y and e (Yit and Eit in the code comments), which are the trade totals by exporter-year and by importer-year.
ch1_application1 <- ch1_application1 %>%
  # Create Yit
  group_by(exporter, year) %>%
  mutate(
    y = sum(trade),
    log_y = log(y)
  ) %>%
  # Create Eit
  group_by(importer, year) %>%
  mutate(
    e = sum(trade),
    log_e = log(e)
  )
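The grouped totals can be checked by hand on a small example. The sketch below uses base R's ave() in place of group_by() and mutate(), so it runs without extra packages. The data frame toy and its values are made up for illustration.

```r
# Hypothetical toy trade flows (values are made up)
toy <- data.frame(
  exporter = c("A", "A", "B", "B"),
  importer = c("B", "C", "A", "C"),
  year     = c(1986, 1986, 1986, 1986),
  trade    = c(10, 30, 5, 15)
)

# y: total exports of each exporter in each year, repeated on every row
toy$y <- ave(toy$trade, toy$exporter, toy$year, FUN = sum)
# e: total imports of each importer in each year, repeated on every row
toy$e <- ave(toy$trade, toy$importer, toy$year, FUN = sum)

toy$log_y <- log(toy$y)
toy$log_e <- log(toy$e)
print(toy)
```

Like mutate() after group_by(), ave() returns one value per row, so the totals can be used directly as regressors.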
The OLS model with remoteness indexes needs an index for both exporters and importers, which we can create with grouping variables. We divide it into sub-steps: replicate the computation of total exports, then the remoteness index for exporters, and finally the total imports with the corresponding remoteness index for importers.
ch1_application1 <- ch1_application1 %>%
  # Replicate total_e
  group_by(exporter, year) %>%
  mutate(total_e = sum(e)) %>%
  group_by(year) %>%
  mutate(total_e = max(total_e)) %>%
  # Replicate rem_exp
  group_by(exporter, year) %>%
  mutate(
    remoteness_exp = sum(dist * total_e / e),
    log_remoteness_exp = log(remoteness_exp)
  ) %>%
  # Replicate total_y
  group_by(importer, year) %>%
  mutate(total_y = sum(y)) %>%
  group_by(year) %>%
  mutate(total_y = max(total_y)) %>%
  # Replicate rem_imp
  group_by(importer, year) %>%
  mutate(
    remoteness_imp = sum(dist / (y / total_y)),
    log_remoteness_imp = log(remoteness_imp)
  )
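Written as formulas, the two remoteness variables computed in the code above are roughly the following, where E_t and Y_t stand for the yearly totals stored in total_e and total_y. This is read directly off the code. The notation is an assumption and may differ from the one in Yotov et al. (2016).

```latex
\begin{aligned}
REM\_EXP_{i,t} &= \sum_{j} DIST_{i,j} \cdot \frac{E_t}{E_{j,t}}, \\
REM\_IMP_{j,t} &= \sum_{i} \frac{DIST_{i,j}}{Y_{i,t} / Y_t}.
\end{aligned}
```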
To create the variables for the OLS with Fixed Effects Model, we followed box #1 on page 44 from Yotov et al. (2016). We combine both exporter and importer variables with the year to create the fixed effects variables.
ch1_application1 <- ch1_application1 %>%
  # This merges the columns exporter/importer with year
  mutate(
    exp_year = paste0(exporter, year),
    imp_year = paste0(importer, year)
  )
The addition of exporter/importer time fixed effects concludes step 2, and now we need to perform step 3.
ch1_application1 <- ch1_application1 %>%
  filter(exporter != importer)
Some cases require conducting step 4, and we will be explicit about it when needed.
2.1.2 OLS estimation ignoring multilateral resistance terms
The general equation for this model is
Please see page 41 in Yotov et al. (2016) for full detail of each variable.
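Read off the regression formula fitted in this section, the specification can be sketched as follows. The coefficient labels and the error term are notational assumptions, and page 41 of Yotov et al. (2016) remains the reference for the exact notation.

```latex
\log X_{ij,t} = \beta_0 + \beta_1 \log DIST_{i,j} + \beta_2 CNTG_{i,j}
  + \beta_3 LANG_{i,j} + \beta_4 CLNY_{i,j}
  + \beta_5 \log Y_{i,t} + \beta_6 \log E_{j,t} + \varepsilon_{ij,t}
```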
The model for this case is straightforward. We need to apply step 4 from the previous section to drop cases where trade is zero.
fit_ols <- feols(
  log_trade ~ log_dist + cntg + lang + clny + log_y + log_e,
  data = filter(ch1_application1, trade > 0)
)
summary(fit_ols)
OLS estimation, Dep. Var.: log_trade
Observations: 25,689
Standard-errors: IID
Estimate Std. Error t value Pr(>|t|)
(Intercept) -11.283080 0.151732 -74.36173 < 2.2e-16 ***
log_dist -1.001607 0.014159 -70.74094 < 2.2e-16 ***
cntg 0.573805 0.074427 7.70961 1.3076e-14 ***
lang 0.801548 0.033748 23.75115 < 2.2e-16 ***
clny 0.734853 0.070387 10.44025 < 2.2e-16 ***
log_y 1.190236 0.005402 220.32012 < 2.2e-16 ***
log_e 0.907588 0.005577 162.72688 < 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
RMSE: 1.74274 Adj. R2: 0.758469
The function we used, feols(), does not keep a copy of its training data by default, in addition to fitting models with fixed effects faster. This is not the case in base R, where glm() outputs include this data, increasing the model’s size. This does not affect the model’s predictions and can be changed as the user needs (Zumel 2014).
The model is almost ready. We only need to stick to the methodology from Yotov et al. (2016) and cluster the standard errors by country pair (see the note on page 42; it is imperative).
fit_ols <- feols(
  log_trade ~ log_dist + cntg + lang + clny + log_y + log_e,
  data = filter(ch1_application1, trade > 0),
  cluster = ~pair_id
)
summary(fit_ols)
OLS estimation, Dep. Var.: log_trade
Observations: 25,689
Standard-errors: Clustered (pair_id)
Estimate Std. Error t value Pr(>|t|)
(Intercept) -11.283080 0.295827 -38.14076 < 2.2e-16 ***
log_dist -1.001607 0.027340 -36.63526 < 2.2e-16 ***
cntg 0.573805 0.184710 3.10652 1.9158e-03 **
lang 0.801548 0.082102 9.76286 < 2.2e-16 ***
clny 0.734853 0.144193 5.09632 3.7405e-07 ***
log_y 1.190236 0.009456 125.87160 < 2.2e-16 ***
log_e 0.907588 0.009910 91.58459 < 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
RMSE: 1.74274 Adj. R2: 0.758469
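The coefficients are identical in both summaries and only the standard errors change. The following base R sketch illustrates why, using a plain cluster-robust "sandwich" estimator on simulated data. The data, the helper cluster_se() and the absence of any small-sample adjustment are assumptions of this sketch, so its numbers are not expected to match what fixest reports.

```r
set.seed(1)

# Hypothetical toy panel: 50 pairs observed 6 times each (simulated data)
n_pairs <- 50
n_obs   <- 6
pair_id <- rep(seq_len(n_pairs), each = n_obs)
x       <- rnorm(n_pairs * n_obs)
# A pair-level shock makes the errors correlated within each pair
shock   <- rep(rnorm(n_pairs), each = n_obs)
y       <- 1 + 2 * x + shock + rnorm(n_pairs * n_obs)

fit   <- lm(y ~ x)
X     <- model.matrix(fit)
u     <- residuals(fit)
bread <- solve(crossprod(X))

# Hypothetical helper: sum the score rows x_i * u_i within each cluster,
# then build the sandwich (X'X)^-1 * meat * (X'X)^-1
cluster_se <- function(cluster) {
  scores <- rowsum(X * u, group = cluster)
  meat   <- crossprod(scores)
  sqrt(diag(bread %*% meat %*% bread))
}

se_iid       <- sqrt(diag(vcov(fit)))
se_clustered <- cluster_se(pair_id)
print(rbind(iid = se_iid, clustered = se_clustered))
```

The point estimates come from the same least-squares fit in both cases. Clustering only replaces the variance formula, which is why the Estimate column is unchanged.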
The tradepolicy package provides functions that produce more informative summaries. Please read the documentation of the package and look for the tp_summary_app_1() function. It summarises the model in the same way as reported in the book by providing:
- Clustered standard errors.
- Number of observations.
- R-squared (if applicable).
- Presence (or absence) of exporter-time and importer-time fixed effects.
- RESET test p-value.
These statistical results are returned as a list to keep it simple, which we can see for the model in the same format as reported in the book.
tp_summary_app_1(
formula = log_trade ~ log_dist + cntg + lang + clny + log_y + log_e,
data = filter(ch1_application1, trade > 0),
method = "ols"
)
|term | estimate| std.error| statistic| p.value|
|:-----------|--------:|---------:|---------:|-------:|
|(Intercept) | -11.283| 0.296| -38.141| 0.000|
|log_dist | -1.002| 0.027| -36.635| 0.000|
|cntg | 0.574| 0.185| 3.107| 0.002|
|lang | 0.802| 0.082| 9.763| 0.000|
|clny | 0.735| 0.144| 5.096| 0.000|
|log_y | 1.190| 0.009| 125.872| 0.000|
|log_e | 0.908| 0.010| 91.585| 0.000|
| nobs| rsquared|etfe |itfe | reset_pval|
|-----:|--------:|:-----|:-----|----------:|
| 25689| 0.759|FALSE |FALSE | 0|
Please notice that the summary hides the exporter/importer fixed effects.
2.1.3 OLS estimation controlling for multilateral resistance terms with remoteness indexes
The remoteness model adds variables to the OLS model. The general equation for this model is
In the equation above
Please see page 43 in Yotov et al. (2016) for full detail of each variable.
Our approach follows box #1 on page 43 from Yotov et al. (2016). Fitting the regression is straightforward. It is just about adding more regressors to what we did in the last section, and we can create a list with a summary for the model.
tp_summary_app_1(
  formula = log_trade ~ log_dist + cntg + lang + clny + log_y + log_e +
    log_remoteness_exp + log_remoteness_imp,
  data = filter(ch1_application1, trade > 0),
  method = "ols"
)
|term | estimate| std.error| statistic| p.value|
|:------------------|--------:|---------:|---------:|-------:|
|(Intercept) | -35.219| 1.986| -17.731| 0.000|
|log_dist | -1.185| 0.031| -37.892| 0.000|
|cntg | 0.247| 0.177| 1.394| 0.164|
|lang | 0.739| 0.078| 9.429| 0.000|
|clny | 0.842| 0.150| 5.607| 0.000|
|log_y | 1.164| 0.009| 122.819| 0.000|
|log_e | 0.903| 0.010| 91.102| 0.000|
|log_remoteness_exp | 0.972| 0.068| 14.251| 0.000|
|log_remoteness_imp | 0.274| 0.060| 4.578| 0.000|
| nobs| rsquared|etfe |itfe | reset_pval|
|-----:|--------:|:-----|:-----|----------:|
| 25689| 0.765|FALSE |FALSE | 0|
2.1.4 OLS estimation controlling for multilateral resistance terms with fixed effects
The general equation for this model is
Where the added terms, concerning the OLS model, are
We can quickly generate a list as we did with the previous models. The only difference is that the variables to the right of the “|” operator are the fixed effects. The fixest package, which the tradepolicy package uses internally, treats them differently for faster model fitting.
Please notice that the summaries intentionally do not show fixed effects, because there are cases where we have thousands of them.
tp_summary_app_1(
formula = log_trade ~ log_dist + cntg + lang + clny | exp_year + imp_year,
data = filter(ch1_application1, trade > 0),
method = "ols"
)
|term | estimate| std.error| statistic| p.value|
|:--------|--------:|---------:|---------:|-------:|
|log_dist | -1.216| 0.038| -31.841| 0.000|
|cntg | 0.223| 0.203| 1.100| 0.271|
|lang | 0.661| 0.082| 8.053| 0.000|
|clny | 0.670| 0.149| 4.487| 0.000|
| nobs| rsquared|etfe |itfe | reset_pval|
|-----:|--------:|:----|:----|----------:|
| 25689| 0.843|TRUE |TRUE | 0|
There is another difference when we compare feols() or fepois() against glm() in the presence of fixed effects, which we can explain with an example.
In the data used for the previous summary, we have
When we use feols(), or any of the functions in the fixest package, a formula of the form
If we do the same in base R, with glm(), the equivalent formula would be of the form
On the
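The contrast between absorbing fixed effects and expanding them into dummy columns can be sketched in base R. In the example below, the simulated data and the single grouping variable fe are assumptions. The demeaning step is only a one-way stand-in for what a package such as fixest does with several fixed effects, where the actual algorithm is more involved.

```r
set.seed(2)

# Hypothetical toy data: 5 groups standing in for exporter-year cells
fe <- rep(letters[1:5], each = 20)
x  <- rnorm(100)
y  <- 0.5 * x + rep(rnorm(5), each = 20) + rnorm(100)

# Dummy-variable approach: one estimated coefficient per group
fit_dummies <- lm(y ~ x + factor(fe))

# Absorbing approach: subtract the group means, then regress without dummies
x_within <- x - ave(x, fe)
y_within <- y - ave(y, fe)
fit_within <- lm(y_within ~ 0 + x_within)

print(coef(fit_dummies)["x"])
print(coef(fit_within)["x_within"])
# The dummy model estimates an intercept, the slope and 4 group dummies
print(length(coef(fit_dummies)))
```

Both approaches return the same slope, but the dummy model carries one extra coefficient per group. That becomes costly when there are thousands of exporter-year and importer-year cells.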
2.1.5 PPML estimation controlling for multilateral resistance terms with fixed effects
The general equation for this model is
The reason to compute this model, despite the lower speed compared to OLS, is that PPML is the only estimator perfectly consistent with the theoretical gravity model. By estimating with PPML, the fixed effects correspond precisely to the corresponding theoretical terms.
The data for this model is the same as for the fixed effects model, and one option in R is to use the fepois() function.
fit_ppml <- fepois(trade ~ log_dist + cntg + lang + clny | exp_year + imp_year,
  data = ch1_application1,
  cluster = ~pair_id
)
As mentioned for feols(), fepois() differs from glm() objects in the same ways.
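One practical consequence of estimating in levels is that zero trade flows stay in the sample, which is why the PPML call uses the data without the trade > 0 filter. The base R sketch below illustrates this with glm() and a quasi-Poisson family on made-up data. It has no fixed effects or clustering, so it is only an illustration of the estimator, not a replacement for fepois().

```r
# Hypothetical toy flows, including one zero (values are made up)
toy <- data.frame(
  trade    = c(120, 80, 0, 45, 10, 200, 3, 60),
  log_dist = log(c(500, 800, 9000, 1500, 4000, 300, 7000, 1200))
)

# Log-linear OLS has to drop the zero flow because log(0) is -Inf
fit_ols_toy <- lm(log(trade) ~ log_dist, data = subset(toy, trade > 0))

# Poisson pseudo-maximum likelihood keeps trade in levels, zeros included
fit_ppml_toy <- glm(trade ~ log_dist,
  family = quasipoisson(link = "log"),
  data = toy
)

print(nobs(fit_ols_toy))
print(nobs(fit_ppml_toy))
print(coef(fit_ppml_toy))
```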
If the reader decides to run this model and print the summary, they will notice that it does not report an R-squared.
Beware that software such as Stata requires additional libraries such as ppmlhdfe to report a correct R-squared. For this statistic, tp_summary_app_1() takes the rank correlation between actual and predicted trade flows.
We can obtain a detailed list as in the previous examples.
tp_summary_app_1(
formula = trade ~ log_dist + cntg + lang + clny | exp_year + imp_year,
data = ch1_application1,
method = "ppml"
)
|term | estimate| std.error| statistic| p.value|
|:--------|--------:|---------:|---------:|-------:|
|log_dist | -0.841| 0.032| -26.169| 0.000|
|cntg | 0.437| 0.084| 5.182| 0.000|
|lang | 0.247| 0.078| 3.185| 0.001|
|clny | -0.222| 0.118| -1.886| 0.059|
| nobs| rsquared|etfe |itfe | reset_pval|
|-----:|--------:|:----|:----|----------:|
| 28152| 0.586|TRUE |TRUE | 0.647|
2.2 The “distance puzzle” resolved
2.2.1 Preparing the data
Please see the note from page 47 in Yotov et al. (2016). We need to proceed with similar steps as in the previous section.
The distance puzzle proposes the gravity specification
The difference from the last section is that we now need to separate the distance variable into multiple columns that account for discrete-time effects. The pivot_wider() function from tidyr does this.
We need to remove cases where the exporter is the same as the importer and cases where trade is zero for the OLS model. For the PPML models, we need to mark rows where the exporter and the importer are the same, so we create the same-country column (smctry), which is also required to transform the distance variables as shown in box #1 on page 48 of Yotov et al. (2016).
In order to avoid creating two very similar datasets, we shall create one dataset to cover both OLS and PPML.
ch1_application2 <- agtpa_applications %>%
  select(exporter, importer, pair_id, year, trade, dist, cntg, lang, clny) %>%
  # this filter covers both OLS and PPML
  filter(year %in% seq(1986, 2006, 4)) %>%
  mutate(
    # variables for both OLS and PPML
    exp_year = paste0(exporter, year),
    imp_year = paste0(importer, year),
    year = paste0("log_dist_", year),
    log_trade = log(trade),
    log_dist = log(dist),
    smctry = ifelse(importer != exporter, 0, 1),
    # PPML specific variables
    log_dist_intra = log_dist * smctry,
    intra_pair = ifelse(exporter == importer, exporter, "inter")
  ) %>%
  pivot_wider(names_from = year, values_from = log_dist, values_fill = 0) %>%
  mutate(across(log_dist_1986:log_dist_2006, function(x) x * (1 - smctry)))
The across() function is a shortcut that avoids repetition. The following example shows the equivalent long form for reference; we do not run it.
ch1_application2 %>%
  mutate(
    log_dist_1986 = log_dist_1986 * (1 - smctry),
    log_dist_1990 = log_dist_1990 * (1 - smctry),
    # repeat log_dist_T many times for T = 1994, 1998, ...
    log_dist_2006 = log_dist_2006 * (1 - smctry)
  )
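To see what the year-specific distance columns contain, the following base R sketch builds them with a loop on a made-up data frame instead of pivot_wider() and across(). The rows and values in toy are hypothetical, and only two years are used to keep the output short.

```r
# Hypothetical toy rows: one international pair and one domestic pair, two years
toy <- data.frame(
  exporter = c("A", "A", "A", "A"),
  importer = c("B", "B", "A", "A"),
  year     = c(1986, 1990, 1986, 1990),
  dist     = c(1000, 1000, 50, 50)
)
toy$log_dist <- log(toy$dist)
toy$smctry   <- as.numeric(toy$exporter == toy$importer)

# One column per year: log distance in that year and zero otherwise,
# also set to zero for domestic (same-country) rows
for (yr in sort(unique(toy$year))) {
  col <- paste0("log_dist_", yr)
  toy[[col]] <- toy$log_dist * (toy$year == yr) * (1 - toy$smctry)
}
print(toy)
```

Each international row ends up with its log distance in exactly one of the year columns, which lets the regression estimate a separate distance coefficient per year.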
Note that the OLS model requires filtering when we specify the model, because we skipped filtering the cases where trade is equal to zero and where the importer and the exporter are the same. Because the solution for the “distance puzzle” implies different transformations and filters for the OLS and PPML cases, one possibility is to filter inside the calls to the summary functions.
2.2.2 OLS solution for the “distance puzzle”
The gravity specification, which includes
With the data from above, the model specification is straightforward.
tp_summary_app_2(
  formula = log_trade ~ log_dist_1986 + log_dist_1990 + log_dist_1994 +
    log_dist_1998 + log_dist_2002 + log_dist_2006 + cntg + lang + clny |
    exp_year + imp_year,
  data = filter(ch1_application2, importer != exporter, trade > 0),
  method = "ols"
)
|term | estimate| std.error| statistic| p.value|
|:-------------|--------:|---------:|---------:|-------:|
|log_dist_1986 | -1.168| 0.044| -26.776| 0.000|
|log_dist_1990 | -1.155| 0.042| -27.295| 0.000|
|log_dist_1994 | -1.211| 0.046| -26.504| 0.000|
|log_dist_1998 | -1.248| 0.043| -29.179| 0.000|
|log_dist_2002 | -1.241| 0.044| -28.143| 0.000|
|log_dist_2006 | -1.261| 0.044| -28.853| 0.000|
|cntg | 0.223| 0.203| 1.100| 0.271|
|lang | 0.661| 0.082| 8.056| 0.000|
|clny | 0.670| 0.149| 4.487| 0.000|
| nobs| pct_chg_log_dist| pcld_std_err| pcld_std_err_pval|intr |csfe |
|-----:|----------------:|------------:|-----------------:|:-----|:-----|
| 25689| 7.95| 3.698| 0.032|FALSE |FALSE |
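The pct_chg_log_dist column appears to be the percentage change of the 2006 distance coefficient relative to the 1986 one. This reading is inferred from the column name and the reported values, not taken from the package documentation. The sketch below recomputes it from the rounded OLS coefficients in the table above.

```r
# Rounded distance coefficients copied from the OLS table above
b_1986 <- -1.168
b_2006 <- -1.261

# Percentage change of the 2006 coefficient relative to the 1986 coefficient
pct_chg <- 100 * (b_2006 - b_1986) / b_1986
print(round(pct_chg, 2))
```

The result is close to the reported 7.95. The small gap is consistent with the inputs here being rounded to three decimals.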
2.2.3 PPML solution for the “distance puzzle”
This model is very similar to the PPML model from the previous section. We can directly fit the model.
tp_summary_app_2(
  formula = trade ~ 0 + log_dist_1986 + log_dist_1990 + log_dist_1994 +
    log_dist_1998 + log_dist_2002 + log_dist_2006 + cntg + lang + clny |
    exp_year + imp_year,
  data = filter(ch1_application2, importer != exporter),
  method = "ppml"
)
|term | estimate| std.error| statistic| p.value|
|:-------------|--------:|---------:|---------:|-------:|
|log_dist_1986 | -0.859| 0.038| -22.849| 0.000|
|log_dist_1990 | -0.834| 0.038| -21.805| 0.000|
|log_dist_1994 | -0.835| 0.036| -23.436| 0.000|
|log_dist_1998 | -0.847| 0.036| -23.591| 0.000|
|log_dist_2002 | -0.848| 0.032| -26.407| 0.000|
|log_dist_2006 | -0.836| 0.032| -26.342| 0.000|
|cntg | 0.437| 0.084| 5.179| 0.000|
|lang | 0.248| 0.078| 3.185| 0.001|
|clny | -0.222| 0.118| -1.883| 0.060|
| nobs| pct_chg_log_dist| pcld_std_err| pcld_std_err_pval|intr |csfe |
|-----:|----------------:|------------:|-----------------:|:-----|:-----|
| 28152| -2.75| 3.004| 0.36|FALSE |FALSE |
2.2.4 Internal distance solution for the “distance puzzle”
This model requires us to add the internal distance variable to the PPML model and not filter the rows where the exporter and the importer are the same.
tp_summary_app_2(
  formula = trade ~ 0 + log_dist_1986 + log_dist_1990 + log_dist_1994 +
    log_dist_1998 + log_dist_2002 + log_dist_2006 + cntg + lang + clny +
    log_dist_intra | exp_year + imp_year,
  data = ch1_application2,
  method = "ppml"
)
|term | estimate| std.error| statistic| p.value|
|:--------------|--------:|---------:|---------:|-------:|
|log_dist_1986 | -0.980| 0.073| -13.404| 0.000|
|log_dist_1990 | -0.940| 0.074| -12.666| 0.000|
|log_dist_1994 | -0.915| 0.073| -12.515| 0.000|
|log_dist_1998 | -0.887| 0.072| -12.298| 0.000|
|log_dist_2002 | -0.884| 0.072| -12.330| 0.000|
|log_dist_2006 | -0.872| 0.072| -12.053| 0.000|
|cntg | 0.371| 0.142| 2.621| 0.009|
|lang | 0.337| 0.171| 1.976| 0.048|
|clny | 0.019| 0.159| 0.121| 0.904|
|log_dist_intra | -0.488| 0.102| -4.779| 0.000|
| nobs| pct_chg_log_dist| pcld_std_err| pcld_std_err_pval|intr |csfe |
|-----:|----------------:|------------:|-----------------:|:----|:-----|
| 28566| -10.965| 1.058| 0|TRUE |FALSE |
2.2.5 Internal distance and home bias solution for the “distance puzzle”
This model requires us to add the same country variable to the internal distance model and repeat the rest of the steps from the last section.
tp_summary_app_2(
  formula = trade ~ log_dist_1986 + log_dist_1990 + log_dist_1994 +
    log_dist_1998 + log_dist_2002 + log_dist_2006 + cntg + lang + clny +
    log_dist_intra + smctry | exp_year + imp_year,
  data = ch1_application2,
  method = "ppml"
)
|term | estimate| std.error| statistic| p.value|
|:--------------|--------:|---------:|---------:|-------:|
|log_dist_1986 | -0.857| 0.064| -13.476| 0.000|
|log_dist_1990 | -0.819| 0.064| -12.748| 0.000|
|log_dist_1994 | -0.796| 0.064| -12.349| 0.000|
|log_dist_1998 | -0.770| 0.064| -12.040| 0.000|
|log_dist_2002 | -0.767| 0.064| -12.060| 0.000|
|log_dist_2006 | -0.754| 0.063| -11.950| 0.000|
|cntg | 0.574| 0.157| 3.645| 0.000|
|lang | 0.352| 0.139| 2.534| 0.011|
|clny | 0.027| 0.127| 0.212| 0.832|
|log_dist_intra | -0.602| 0.111| -5.437| 0.000|
|smctry | 1.689| 0.582| 2.901| 0.004|
| nobs| pct_chg_log_dist| pcld_std_err| pcld_std_err_pval|intr |csfe |
|-----:|----------------:|------------:|-----------------:|:----|:-----|
| 28566| -11.969| 1.173| 0|TRUE |FALSE |
2.2.6 Fixed effects solution for the “distance puzzle”
This model requires us to remove the internal distance and same country variables from the last model and include the internal pair variable to account for the intra-national fixed effects.
tp_summary_app_2(
  formula = trade ~ 0 + log_dist_1986 + log_dist_1990 + log_dist_1994 +
    log_dist_1998 + log_dist_2002 + log_dist_2006 + cntg + lang + clny +
    intra_pair | exp_year + imp_year,
  data = ch1_application2,
  method = "ppml"
)
The variable 'intra_pairZAF' has been removed because of collinearity (see $collin.var).
|term | estimate| std.error| statistic| p.value|
|:-------------|--------:|---------:|---------:|-------:|
|log_dist_1986 | -0.910| 0.033| -27.738| 0.000|
|log_dist_1990 | -0.879| 0.033| -26.953| 0.000|
|log_dist_1994 | -0.860| 0.032| -26.573| 0.000|
|log_dist_1998 | -0.833| 0.032| -25.889| 0.000|
|log_dist_2002 | -0.829| 0.033| -25.490| 0.000|
|log_dist_2006 | -0.811| 0.033| -24.916| 0.000|
|cntg | 0.442| 0.083| 5.329| 0.000|
|lang | 0.241| 0.077| 3.114| 0.002|
|clny | -0.220| 0.118| -1.861| 0.063|
| nobs| pct_chg_log_dist| pcld_std_err| pcld_std_err_pval|intr |csfe |
|-----:|----------------:|------------:|-----------------:|:----|:----|
| 28566| -10.931| 0.769| 0|TRUE |TRUE |
2.3 Regional trade agreements effects
2.3.1 Preparing the data
This model specification includes gravity covariates, including importer-time and exporter-time fixed effects, as in the equation
In comparison to the previous examples, we need to create additional variables to include fixed effects that account for the observations where the exporter and the importer are the same. These variables are internal border, internal dyad and internal borders for different years.
The direct way of obtaining the desired variables is similar to what we did in the previous sections.
ch1_application3 <- agtpa_applications %>%
  filter(year %in% seq(1986, 2006, 4)) %>%
  mutate(
    exp_year = paste0(exporter, year),
    imp_year = paste0(importer, year),
    year = paste0("intl_border_", year),
    log_trade = log(trade),
    log_dist = log(dist),
    intl_brdr = ifelse(exporter == importer, pair_id, "inter"),
    intl_brdr_2 = ifelse(exporter == importer, 0, 1),
    pair_id_2 = ifelse(exporter == importer, "0-intra", pair_id)
  ) %>%
  pivot_wider(names_from = year, values_from = intl_brdr_2, values_fill = 0)
Notice that we used “0-intra” and not just “intra” because the rest of the observations in the internal dyads are numbers 1, …, N. R orders the unique values alphabetically, so “0-intra” comes first and is taken as the reference level. Also, observe the order of the resulting table: the pivoting will put “0-intra” as the first row for the first exporter-importer dyad. This ordering makes the difference between the expected behaviour and other behaviours in the next chapter.
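The effect of the “0-” prefix on the ordering can be checked with factor(), which sorts the unique values to build its levels. The identifiers below are hypothetical. The sketch shows only the ordering and makes no claim about how fixest handles its fixed effects internally.

```r
# Hypothetical pair identifiers stored as character, with each candidate label
ids_with_prefix <- c("12", "3", "0-intra", "1")
ids_no_prefix   <- c("12", "3", "intra", "1")

# factor() sorts the unique values; the first level acts as the reference
levels_with_prefix <- levels(factor(ids_with_prefix))
levels_no_prefix   <- levels(factor(ids_no_prefix))

print(levels_with_prefix)
print(levels_no_prefix)
```

With the prefix, the intra-national label sorts before every numeric identifier. Without it, the label sorts after the digits and some other dyad would become the first level.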
In addition, we need to create the variable containing the trade sum to filter the cases where the sum by dyad is zero.
ch1_application3 <- ch1_application3 %>%
  group_by(pair_id) %>%
  mutate(sum_trade = sum(trade)) %>%
  ungroup()
2.3.2 OLS standard RTA estimates with international trade only
The gravity specification, which includes
With the data from above, the model specification is straightforward.
tp_summary_app_3(
  formula = log_trade ~ log_dist + cntg + lang + clny + rta | exp_year +
    imp_year,
  data = filter(ch1_application3, trade > 0, importer != exporter),
  method = "ols"
)
|term | estimate| std.error| statistic| p.value|
|:--------|--------:|---------:|---------:|-------:|
|log_dist | -1.216| 0.039| -31.180| 0.000|
|cntg | 0.223| 0.203| 1.099| 0.272|
|lang | 0.661| 0.082| 8.045| 0.000|
|clny | 0.670| 0.149| 4.487| 0.000|
|rta | -0.004| 0.054| -0.081| 0.935|
| nobs| total_rta_effect| trta_std_err| trta_std_err_pval|intr |
|-----:|----------------:|------------:|-----------------:|:-----|
| 25689| -0.004| 0.053| 0.934|FALSE |
2.3.3 PPML standard RTA estimates with international trade only
The model specification is very similar to OLS, and we only need to change the method specified in the function.
tp_summary_app_3(
formula = trade ~ log_dist + cntg + lang + clny + rta | exp_year + imp_year,
data = filter(ch1_application3, importer != exporter),
method = "ppml"
)
|term | estimate| std.error| statistic| p.value|
|:--------|--------:|---------:|---------:|-------:|
|log_dist | -0.822| 0.031| -26.125| 0.000|
|cntg | 0.416| 0.084| 4.944| 0.000|
|lang | 0.250| 0.078| 3.211| 0.001|
|clny | -0.205| 0.116| -1.770| 0.077|
|rta | 0.191| 0.067| 2.855| 0.004|
| nobs| total_rta_effect| trta_std_err| trta_std_err_pval|intr |
|-----:|----------------:|------------:|-----------------:|:-----|
| 28152| 0.191| 0.066| 0.004|FALSE |
2.3.4 Addressing potential domestic trade diversion
The model specification is almost the same as in the previous PPML model. We only need to add the international border variable and use the entire dataset instead of removing rows where the importer and the exporter are the same.
tp_summary_app_3(
  formula = trade ~ log_dist + cntg + lang + clny + rta | exp_year + imp_year +
    intl_brdr,
  data = ch1_application3,
  method = "ppml"
)
|term | estimate| std.error| statistic| p.value|
|:--------|--------:|---------:|---------:|-------:|
|log_dist | -0.800| 0.031| -26.025| 0.000|
|cntg | 0.393| 0.080| 4.906| 0.000|
|lang | 0.244| 0.078| 3.111| 0.002|
|clny | -0.182| 0.115| -1.581| 0.114|
|rta | 0.409| 0.070| 5.841| 0.000|
| nobs| total_rta_effect| trta_std_err| trta_std_err_pval|intr |
|-----:|----------------:|------------:|-----------------:|:-----|
| 28566| 0.409| 0.069| 0|FALSE |
2.3.5 Addressing potential endogeneity of RTAs
The model specification includes the RTA variable and the exporter-time, importer-time and internal dyad fixed effects to account for domestic trade.
tp_summary_app_3(
formula = trade ~ rta | exp_year + imp_year + pair_id_2,
data = filter(ch1_application3, sum_trade > 0),
method = "ppml"
)
|term | estimate| std.error| statistic| p.value|
|:----|--------:|---------:|---------:|-------:|
|rta | 0.557| 0.108| 5.138| 0|
| nobs| total_rta_effect| trta_std_err| trta_std_err_pval|intr |
|-----:|----------------:|------------:|-----------------:|:-----|
| 28482| 0.557| 0.102| 0|FALSE |
2.3.6 Testing for potential “reverse causality” between trade and RTAs
We need to modify the previous model to include the four-year lead of the RTA variable (rta_lead4) and keep the observations where the trade sum is larger than zero.
tp_summary_app_3(
formula = trade ~ rta + rta_lead4 | exp_year + imp_year + pair_id_2,
data = filter(ch1_application3, sum_trade > 0),
method = "ppml"
)
|term | estimate| std.error| statistic| p.value|
|:---------|--------:|---------:|---------:|-------:|
|rta | 0.520| 0.091| 5.709| 0.000|
|rta_lead4 | 0.077| 0.098| 0.792| 0.428|
| nobs| total_rta_effect| trta_std_err| trta_std_err_pval|intr |
|-----:|----------------:|------------:|-----------------:|:-----|
| 28482| 0.597| 0.138| 0|FALSE |
2.3.7 Addressing potential non-linear and phasing-in effects of RTAs
Instead of the lead of the RTA variable, as in the previous model, we now include the lagged RTA variables (rta_lag4, rta_lag8 and rta_lag12).
tp_summary_app_3(
  formula = trade ~ rta + rta_lag4 + rta_lag8 + rta_lag12 | exp_year +
    imp_year + pair_id_2,
  data = filter(ch1_application3, sum_trade > 0),
  method = "ppml"
)
|term | estimate| std.error| statistic| p.value|
|:---------|--------:|---------:|---------:|-------:|
|rta | 0.291| 0.095| 3.081| 0.002|
|rta_lag4 | 0.414| 0.071| 5.798| 0.000|
|rta_lag8 | 0.169| 0.046| 3.688| 0.000|
|rta_lag12 | 0.119| 0.032| 3.729| 0.000|
| nobs| total_rta_effect| trta_std_err| trta_std_err_pval|intr |
|-----:|----------------:|------------:|-----------------:|:-----|
| 28482| 0.993| 0.094| 0|FALSE |
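The total_rta_effect column appears to be the sum of the contemporaneous and lagged RTA coefficients. This reading is inferred from the reported numbers, not taken from the package documentation. The sketch below checks it with the rounded coefficients from the table above. The standard error of the total would also need the covariances between the coefficients, which are not reproduced here.

```r
# Rounded coefficients copied from the table above
rta_coefs <- c(rta = 0.291, rta_lag4 = 0.414, rta_lag8 = 0.169, rta_lag12 = 0.119)

# Sum of the contemporaneous and lagged RTA coefficients
total_rta_effect <- sum(rta_coefs)
print(total_rta_effect)
```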
2.3.8 Addressing globalization effects
In addition to the lagged RTAs from the previous model, we include the international border variables for different years.
tp_summary_app_3(
  formula = trade ~ rta + rta_lag4 + rta_lag8 + rta_lag12 + intl_border_1986 +
    intl_border_1990 + intl_border_1994 + intl_border_1998 + intl_border_2002 |
    exp_year + imp_year + pair_id_2,
  data = filter(ch1_application3, sum_trade > 0),
  method = "ppml"
)
|term | estimate| std.error| statistic| p.value|
|:----------------|--------:|---------:|---------:|-------:|
|rta | 0.116| 0.092| 1.258| 0.209|
|rta_lag4 | 0.288| 0.065| 4.399| 0.000|
|rta_lag8 | 0.069| 0.051| 1.356| 0.175|
|rta_lag12 | 0.002| 0.031| 0.076| 0.939|
|intl_border_1986 | -0.706| 0.051| -13.917| 0.000|
|intl_border_1990 | -0.480| 0.046| -10.541| 0.000|
|intl_border_1994 | -0.367| 0.035| -10.327| 0.000|
|intl_border_1998 | -0.158| 0.025| -6.405| 0.000|
|intl_border_2002 | -0.141| 0.018| -7.867| 0.000|
| nobs| total_rta_effect| trta_std_err| trta_std_err_pval|intr |
|-----:|----------------:|------------:|-----------------:|:-----|
| 28482| 0.475| 0.109| 0|FALSE |