Chapter 5. Increasing Returns and the Gravity Equation
Empirical exercise
In this exercise, you are asked to reproduce the empirical results shown in Table 5.2. Four datasets are available: dist.csv, which contains distances; gdp_ce_93.csv, which contains GDP in the exporting location in 1993; gdp_ci_93.csv, which contains GDP in the importing location in 1993; and trade_93.csv, which contains trade in 1993. To complete the exercise, store these files in the directory Chapter-5 and then run the STATA program data_trans.do, which converts them to STATA files with the same names. The trade data are already expressed in U.S. dollars, but the GDP data are in Canadian dollars, so GDP is converted at the exchange rate 1 Canadian dollar = 0.775134 U.S. dollars.
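As a quick illustration of the conversion step, the sketch below applies the exchange rate to a single made-up GDP figure. The value `gdp_cad` is hypothetical and is not taken from the data files.

```r
# Exchange rate stated in the exercise: 1 Canadian dollar = 0.775134 U.S. dollars
cad_to_usd <- 0.775134

# Hypothetical regional GDP in millions of Canadian dollars (illustrative only)
gdp_cad <- 100000

# The same GDP expressed in millions of U.S. dollars
gdp_usd <- gdp_cad * cad_to_usd
gdp_usd
```

The same multiplication is applied to every row of gdp_ce and gdp_ci before the regressions are run.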
Documentation
US-Canada data for Anderson and van Wincoop (2002)
There are a total of 63 US-Canada regions (states, District of Columbia, provinces and territories). They are listed below. The regressions, however, are based on the same 40 states and provinces as in McCallum (these are indicated with a star below).
Code  State/Province
1     Alabama*
2     Alaska
3     Arizona*
4     Arkansas
5     California*
6     Colorado
7     Connecticut
8     Delaware
9     Florida*
10    Georgia*
11    Hawaii
12    Idaho*
13    Illinois*
14    Indiana*
15    Iowa
16    Kansas
17    Kentucky*
18    Louisiana*
19    Maine*
20    Maryland*
21    Massachusetts*
22    Michigan*
23    Minnesota*
24    Mississippi
25    Missouri*
26    Montana*
27    Nebraska
28    Nevada
29    New Hampshire*
30    New Jersey*
31    New Mexico
32    New York*
33    North Carolina*
34    North Dakota*
35    Ohio*
36    Oklahoma
37    Oregon
38    Pennsylvania*
39    Rhode Island
40    South Carolina
41    South Dakota
42    Tennessee*
43    Texas*
44    Utah
45    Vermont*
46    Virginia*
47    Washington*
48    West Virginia
49    Wisconsin*
50    Wyoming
51    Dist. of Col.
52    Alberta*
53    British Columbia*
54    Manitoba*
55    New Brunswick*
56    Newfoundland*
57    NW Territories
58    Nova Scotia*
59    Ontario*
60    Prince Edward Island*
61    Quebec*
62    Saskatchewan*
63    Yukon Territory
Data files:
dist.csv: Contains distances between the 40 regions listed above. The distances are in kilometers and are between the capitals of the regions.
gdp_ce_93.csv and gdp_ci_93.csv: Contain nominal GDP in millions of Canadian dollars in 1993 for the 40 regions above.
trade_93.csv: Contains 1993 trade data between the 40 regions listed above, in US dollars. The indicator variables l_ex (exporter) and l_im (importer) equal 1 if the region is a US state and 2 if it is a Canadian province.
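The exporter and importer indicators are what the regressions use to build the intra-Canada and intra-US dummies. The sketch below shows that construction on a four-row toy table. The column names `l_ex` and `l_im` follow the code in this section, while the rows themselves and the `cross_border` column are illustrative assumptions, not values from trade_93.csv.

```r
# Toy exporter/importer indicators (1 = US state, 2 = Canadian province);
# these rows are made up for illustration
toy <- data.frame(
  l_ex = c(1, 1, 2, 2),
  l_im = c(1, 2, 1, 2)
)

# Dummy for trade between two Canadian provinces
toy$d_ca <- as.integer(toy$l_ex == 2 & toy$l_im == 2)

# Dummy for trade between two US states
toy$d_us <- as.integer(toy$l_ex == 1 & toy$l_im == 1)

# Cross-border flows are the rows where both dummies are zero
toy$cross_border <- as.integer(toy$d_ca == 0 & toy$d_us == 0)

toy
```

Only the first row is intra-US and only the last is intra-Canada; the two middle rows are cross-border flows, which serve as the reference group in the regressions.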
Exercise 1
Run the program gravity_1.do to replicate the gravity equations in columns (1)-(3) of Table 5.2.
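For reference, the log-linear specification that the regressions in gravity_1.do estimate can be written roughly as follows. The symbols are my own notation, chosen for illustration, and are not copied from Table 5.2.

```latex
\ln X_{ij} = \alpha + \beta_1 \ln Y_i + \beta_2 \ln Y_j + \rho \ln d_{ij}
           + \gamma_{CA}\,\delta^{CA}_{ij} + \gamma_{US}\,\delta^{US}_{ij} + \varepsilon_{ij}
```

Here X_ij corresponds to vx, Y_i to gdp_ce, Y_j to gdp_ci, d_ij to dist, and the two indicator terms to d_ca and d_us. The Canadian-perspective regression drops US-US observations and the d_us term, the U.S.-perspective regression drops Canada-Canada observations and the d_ca term, and the pooled regression keeps both.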
Feenstra’s code
Data transformation:
* Input dataset into STATA and save as STATA file *

insheet using Z:\home\pacha\github\advanced-international-trade\first-edition\Chapter-5\dist.csv
sort c_e c_i
save Z:\home\pacha\github\advanced-international-trade\first-edition\Chapter-5\dist, replace
clear

insheet using Z:\home\pacha\github\advanced-international-trade\first-edition\Chapter-5\trade_93.csv
sort c_e c_i
merge c_e c_i using Z:\home\pacha\github\advanced-international-trade\first-edition\Chapter-5\dist
drop _merge
sort c_e c_i
save Z:\home\pacha\github\advanced-international-trade\first-edition\Chapter-5\trade_93, replace
clear

insheet using Z:\home\pacha\github\advanced-international-trade\first-edition\Chapter-5\gdp_ce_93.csv
gen gce=gdp_ce*0.775134
drop gdp_ce
ren gce gdp_ce
sort c_e
save Z:\home\pacha\github\advanced-international-trade\first-edition\Chapter-5\gdp_ce_93, replace
clear

insheet using Z:\home\pacha\github\advanced-international-trade\first-edition\Chapter-5\gdp_ci_93.csv
gen gci=gdp_ci*0.775134
drop gdp_ci
ren gci gdp_ci
sort c_i
save Z:\home\pacha\github\advanced-international-trade\first-edition\Chapter-5\gdp_ci_93, replace
clear
Models:
capture log close
log using Z:\home\pacha\github\advanced-international-trade\first-edition\Chapter-5\gravity_1.log, replace

set matsize 100

use Z:\home\pacha\github\advanced-international-trade\first-edition\Chapter-5\trade_93, clear
sort c_e
merge c_e using Z:\home\pacha\github\advanced-international-trade\first-edition\Chapter-5\gdp_ce_93
drop _merge
sort c_i
merge c_i using Z:\home\pacha\github\advanced-international-trade\first-edition\Chapter-5\gdp_ci_93
drop _merge

drop if vx==0
drop if dist==0

gen lnvx=log(vx)
gen lndist=log(dist)
gen lngdp_ce=log(gdp_ce)
gen lngdp_ci=log(gdp_ci)

* Estimate Gravity Equation from the Canadian Perspective *

preserve
gen d_ca=0
replace d_ca=1 if (l_ex==2) & (l_im==2)
drop if (l_ex==1) & (l_im==1)
regress lnvx lngdp_ce lngdp_ci lndist d_ca
restore

* Estimate Gravity Equation from the U.S. Perspective *

preserve
gen d_us=0
replace d_us=1 if (l_ex==1) & (l_im==1)
drop if (l_ex==2) & (l_im==2)
regress lnvx lngdp_ce lngdp_ci lndist d_us
restore

* Estimate Gravity Equation by Pooling All Data *

preserve
gen d_ca=0
gen d_us=0
replace d_ca=1 if (l_ex==2) & (l_im==2)
replace d_us=1 if (l_ex==1) & (l_im==1)
regress lnvx lngdp_ce lngdp_ci lndist d_ca d_us
vce
restore

clear
log close
Output:
. capture log close

. log using Z:\home\pacha\github\advanced-international-trade\first-edition\Chapte
> r-5\gravity_1.log, replace
(note: file Z:\home\pacha\github\advanced-international-trade\first-edition\Chapte
> r-5\gravity_1.log not found)
----------------------------------------------------------------------------------
      name:  <unnamed>
       log:  Z:\home\pacha\github\advanced-international-trade\first-edition\Chapt
> er-5\gravity_1.log
  log type:  text
 opened on:  19 Jun 2024, 13:34:35

. 
. set matsize 100

Current memory allocation

                    current                                 memory usage
    settable          value     description                 (1M = 1024k)
    --------------------------------------------------------------------
    set maxvar         5000     max. variables allowed           1.909M
    set memory          50M     max. data space                 50.000M
    set matsize         100     max. RHS vars in models          0.085M
                                                            -----------
                                                                51.994M

. 
. use Z:\home\pacha\github\advanced-international-trade\first-edition\Chapter-5\tr
> ade_93,clear

. sort c_e

. merge c_e using Z:\home\pacha\github\advanced-international-trade\first-edition\
> Chapter-5\gdp_ce_93
(note: you are using old merge syntax; see [R] merge for new syntax)
variable c_e does not uniquely identify observations in the master data

. drop _merge

. sort c_i

. merge c_i using Z:\home\pacha\github\advanced-international-trade\first-edition\
> Chapter-5\gdp_ci_93
(note: you are using old merge syntax; see [R] merge for new syntax)
variable c_i does not uniquely identify observations in the master data

. drop _merge

. drop if vx==0
(49 observations deleted)

. drop if dist==0
(40 observations deleted)

. 
. gen lnvx=log(vx)

. gen lndist=log(dist)

. gen lngdp_ce=log(gdp_ce)

. gen lngdp_ci=log(gdp_ci)

. 
. * Estimate Gravity Equation from the Canadian Perspective *
. 
. preserve

. gen d_ca=0

. replace d_ca=1 if (l_ex==2) & (l_im==2)
(90 real changes made)

. drop if (l_ex==1) & (l_im==1)
(832 observations deleted)

. 
. regress lnvx lngdp_ce lngdp_ci lndist d_ca

      Source |       SS       df       MS              Number of obs =     679
-------------+------------------------------           F(  4,   674) =  540.02
       Model |  3020.52204     4  755.130511           Prob > F      =  0.0000
    Residual |  942.471913   674  1.39832628           R-squared     =  0.7622
-------------+------------------------------           Adj R-squared =  0.7608
       Total |  3962.99396   678  5.84512383           Root MSE      =  1.1825

------------------------------------------------------------------------------
        lnvx |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    lngdp_ce |   1.218705   .0331581    36.75   0.000       1.1536    1.283811
    lngdp_ci |   .9797792   .0325254    30.12   0.000     .9159159    1.043642
      lndist |  -1.353149   .0690128   -19.61   0.000    -1.488655   -1.217643
        d_ca |   2.802034   .1416955    19.78   0.000     2.523816    3.080251
       _cons |   3.742672   .7721966     4.85   0.000     2.226472    5.258873
------------------------------------------------------------------------------

. restore

. 
. * Estimate Gravity Equation from the U.S. Perspective *
. 
. preserve

. gen d_us=0

. replace d_us=1 if (l_ex==1) & (l_im==1)
(832 real changes made)

. drop if (l_ex==2) & (l_im==2)
(90 observations deleted)

. 
. regress lnvx lngdp_ce lngdp_ci lndist d_us

      Source |       SS       df       MS              Number of obs =    1421
-------------+------------------------------           F(  4,  1416) = 2052.61
       Model |  7089.25392     4  1772.31348           Prob > F      =  0.0000
    Residual |  1222.63635  1416  .863443752           R-squared     =  0.8529
-------------+------------------------------           Adj R-squared =  0.8525
       Total |  8311.89028  1420  5.85344386           Root MSE      =  .92922

------------------------------------------------------------------------------
        lnvx |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    lngdp_ce |   1.128429    .020453    55.17   0.000     1.088308     1.16855
    lngdp_ci |   .9820314    .020396    48.15   0.000     .9420218    1.022041
      lndist |  -1.081888    .035227   -30.71   0.000    -1.150991   -1.012785
        d_us |   .4059649   .0578667     7.02   0.000     .2924511    .5194786
       _cons |   2.659586   .4492747     5.92   0.000      1.77827    3.540901
------------------------------------------------------------------------------

. restore

. 
. * Estimate Gravity Equation by Pooling All Data *
. 
. preserve

. gen d_ca=0

. gen d_us=0

. replace d_ca=1 if (l_ex==2) & (l_im==2)
(90 real changes made)

. replace d_us=1 if (l_ex==1) & (l_im==1)
(832 real changes made)

. 
. regress lnvx lngdp_ce lngdp_ci lndist d_ca d_us

      Source |       SS       df       MS              Number of obs =    1511
-------------+------------------------------           F(  5,  1505) = 1732.75
       Model |  7499.70876     5  1499.94175           Prob > F      =  0.0000
    Residual |  1302.79013  1505  .865641282           R-squared     =  0.8520
-------------+------------------------------           Adj R-squared =  0.8515
       Total |  8802.49889  1510  5.82946946           Root MSE      =   .9304

------------------------------------------------------------------------------
        lnvx |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    lngdp_ce |   1.132974   .0196797    57.57   0.000     1.094371    1.171577
    lngdp_ci |   .9742161   .0196294    49.63   0.000     .9357122     1.01272
      lndist |  -1.110705   .0337347   -32.92   0.000    -1.176877   -1.044533
        d_ca |   2.751708   .1086755    25.32   0.000     2.538536    2.964879
        d_us |   .3982716   .0574423     6.93   0.000     .2855962    .5109471
       _cons |   2.911512   .4267171     6.82   0.000     2.074488    3.748535
------------------------------------------------------------------------------

. vce

Covariance matrix of coefficients of regress model

        e(V) |   lngdp_ce    lngdp_ci      lndist        d_ca        d_us
-------------+------------------------------------------------------------
    lngdp_ce |  .00038729
    lngdp_ci |  .00008279   .00038531
      lndist |  .00001868   .00001752   .00113803
        d_ca |  .00041241   .00040103   .00017043   .01181037
        d_us | -.00037488  -.00039315   .00039698   .00085625   .00329962
       _cons | -.00524428  -.00520461   -.0089485  -.01157481   .00387661

        e(V) |      _cons
-------------+------------
       _cons |  .18208752

. restore

. 
. clear

. 
. log close
      name:  <unnamed>
       log:  Z:\home\pacha\github\advanced-international-trade\first-edition\Chapt
> er-5\gravity_1.log
  log type:  text
 closed on:  19 Jun 2024, 13:34:38
----------------------------------------------------------------------------------

. 
. 
. 
. 
end of do-file
My code
# Packages ----

library(archive)
library(readr)
library(janitor)
library(dplyr)

# Extract ----

fzip <- "first-edition/Chapter-5.zip"
dout <- gsub("\\.zip$", "", fzip)

if (!dir.exists(dout)) {
  archive_extract(fzip, dir = dout)
}

# Read and transform ----

fout <- paste0(dout, "/trade_93.rds")

if (!file.exists(fout)) {
  # trade_93 <- read_dta(paste0(dout, "/trade_93.dta"))
  # gdp_ce_93 <- read_dta(paste0(dout, "/gdp_ce_93.dta"))
  # gdp_ci_93 <- read_dta(paste0(dout, "/gdp_ci_93.dta"))

  # instead of reading the DTA files, I will read the CSV files and transform

  dist <- read_csv(paste0(dout, "/dist.csv")) %>%
    clean_names() %>%
    arrange(c_e, c_i)

  trade_93 <- read_csv(paste0(dout, "/trade_93.csv")) %>%
    clean_names() %>%
    arrange(c_e, c_i)

  trade_93 <- trade_93 %>%
    left_join(dist, by = c("c_e", "c_i"))

  rm(dist)

  gdp_ce_93 <- read_csv(paste0(dout, "/gdp_ce_93.csv")) %>%
    clean_names() %>%
    mutate(gdp_ce = gdp_ce * 0.775134) %>%
    arrange(c_e)

  gdp_ci_93 <- read_csv(paste0(dout, "/gdp_ci_93.csv")) %>%
    clean_names() %>%
    mutate(gdp_ci = gdp_ci * 0.775134) %>%
    arrange(c_i)

  trade_93 <- trade_93 %>%
    left_join(gdp_ce_93, by = "c_e") %>%
    left_join(gdp_ci_93, by = "c_i") %>%
    filter(vx != 0, dist != 0) %>%
    mutate(
      lnvx = log(vx),
      lndist = log(dist),
      lngdp_ce = log(gdp_ce),
      lngdp_ci = log(gdp_ci)
    )

  saveRDS(trade_93, fout)
} else {
  trade_93 <- readRDS(fout)
}

# Estimate Gravity Equation from the Canadian Perspective ----

trade_93_2 <- trade_93 %>%
  mutate(d_ca = ifelse(l_ex == 2 & l_im == 2, 1, 0)) %>%
  filter(l_ex != 1 | l_im != 1)

fit_ca <- lm(lnvx ~ lngdp_ce + lngdp_ci + lndist + d_ca, data = trade_93_2)
summary(fit_ca)
Call:
lm(formula = lnvx ~ lngdp_ce + lngdp_ci + lndist + d_ca, data = trade_93_2)
Residuals:
Min 1Q Median 3Q Max
-5.9344 -0.6428 0.0174 0.6225 4.0379
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.74267 0.77220 4.847 1.56e-06 ***
lngdp_ce 1.21871 0.03316 36.754 < 2e-16 ***
lngdp_ci 0.97978 0.03253 30.124 < 2e-16 ***
lndist -1.35315 0.06901 -19.607 < 2e-16 ***
d_ca 2.80203 0.14170 19.775 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.183 on 674 degrees of freedom
Multiple R-squared: 0.7622, Adjusted R-squared: 0.7608
F-statistic: 540 on 4 and 674 DF, p-value: < 2.2e-16
# Estimate Gravity Equation from the U.S. Perspective ----

trade_93_3 <- trade_93 %>%
  mutate(d_us = ifelse(l_ex == 1 & l_im == 1, 1, 0)) %>%
  filter(l_ex != 2 | l_im != 2)

fit_us <- lm(lnvx ~ lngdp_ce + lngdp_ci + lndist + d_us, data = trade_93_3)
summary(fit_us)
Call:
lm(formula = lnvx ~ lngdp_ce + lngdp_ci + lndist + d_us, data = trade_93_3)
Residuals:
Min 1Q Median 3Q Max
-6.2863 -0.4620 -0.0077 0.4822 3.7858
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.65959 0.44927 5.920 4.04e-09 ***
lngdp_ce 1.12843 0.02045 55.172 < 2e-16 ***
lngdp_ci 0.98203 0.02040 48.148 < 2e-16 ***
lndist -1.08189 0.03523 -30.712 < 2e-16 ***
d_us 0.40597 0.05787 7.016 3.54e-12 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.9292 on 1416 degrees of freedom
Multiple R-squared: 0.8529, Adjusted R-squared: 0.8525
F-statistic: 2053 on 4 and 1416 DF, p-value: < 2.2e-16
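Because the dependent variable is in logs, exponentiating a dummy coefficient gives a rough multiplicative effect. The sketch below does this for the two border dummies, with the coefficients typed in by hand from the summaries above. The variable names are my own, and the reading as a "border effect" is the usual interpretation of such dummies rather than something stated in this section.

```r
# Border-dummy coefficients copied by hand from the lm() summaries above
b_ca <- 2.80203  # d_ca, Canadian perspective
b_us <- 0.40597  # d_us, U.S. perspective

# In a log-linear model, exp(coefficient) is the multiplicative effect
# of the dummy on trade, holding GDP and distance fixed
ratio_ca <- exp(b_ca)
ratio_us <- exp(b_us)

round(c(canada = ratio_ca, us = ratio_us), 2)
```

Under these estimates, trade between two Canadian provinces is roughly 16.5 times cross-border trade for given GDP and distance, while trade between two US states is roughly 1.5 times cross-border trade.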