Provides several bilateral distance measures, together with dummy variables indicating whether two countries are contiguous, share a common language, or have a colonial relationship.

There are two kinds of distance measures: simple distances, which need only one city per country, and weighted distances, which need data on the principal cities of each country. The simple distances are computed with the great circle formula from the latitude and longitude of each country's most populated city (`dist`) or of its official capital (`distcap`). These two variables incorporate internal distances based on areas provided in the `geo_cepii` dataset.

The two weighted distance measures use city-level data to account for the geographic distribution of population inside each country. The distance between two countries is built from the bilateral distances between their largest cities, each inter-city distance being weighted by the share of each city in its country's overall population. The formula is a generalized mean of city-to-city bilateral distances developed by Head and Mayer (2002), which has the arithmetic mean and the harmonic mean as special cases.
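For reference, the generalized mean can be written as below. This is a sketch of the formula as it is usually stated for the GeoDist database; the symbol names are assumptions chosen here, with `pop_k` the population of city `k` in country `i`, `pop_l` the population of city `l` in country `j`, and `d_kl` the distance between the two cities.

```latex
d_{ij} = \left( \sum_{k \in i} \sum_{l \in j}
         \frac{pop_k}{pop_i} \, \frac{pop_l}{pop_j} \, d_{kl}^{\theta}
         \right)^{1/\theta}
```

Setting theta = 1 gives the population-weighted arithmetic mean (`distw`), and theta = -1 gives the population-weighted harmonic mean (`distwces`).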

Format

A data frame with 50176 observations on the following 14 variables.

iso_o

Country of origin as ISO codes in three characters.

iso_d

Country of destination as ISO codes in three characters.

contig

Variable coded as 1 when the two countries are contiguous (share a border) and 0 otherwise.

comlang_off

Variable coded as 1 when the two countries share the same official language and 0 otherwise.

comlang_ethno

Variable coded as 1 when a language is spoken by at least 9% of the population in both countries.

colony

Variable coded as 1 when the country in `iso_o` was ever a colony of the country in `iso_d`.

comcol

Variable coded as 1 when the two countries shared the same colonizer after 1945.

curcol

Variable coded as 1 when the country in `iso_o` is currently a colony of the country in `iso_d`.

col45

Variable coded as 1 when the country in `iso_o` was a colony of the country in `iso_d` after 1945.

smctry

Variable coded as 1 when the two countries were or are the same country.

dist

Simple distance (most populated cities, km)

distcap

Simple distance (capitals, km)

distw

Weighted distance (pop-wt, km) with theta=1 (theta measures the sensitivity of trade flows to the bilateral distance d_kl between cities k and l)

distwces

Weighted distance (pop-wt, km) with theta=-1

Source

http://www.cepii.fr/CEPII/en/bdd_modele/download.asp?id=6

References

Mayer, T. & Zignago, S. (2011). Notes on CEPII's distances measures: the GeoDist Database. CEPII Working Paper 2011-25.

Head, K. & Mayer, T. (2002). Illusory Border Effects: Distance Mismeasurement Inflates Estimates of Home Bias in Trade. CEPII Working Paper 2002-01.

Examples

# filter countries that share borders
dist_cepii[dist_cepii$contig == 1, ]
#> # A tibble: 616 x 14
#>    iso_o iso_d contig comlang_off comlang_ethno colony comcol curcol col45
#>    <chr> <chr>  <dbl>       <dbl>         <dbl>  <dbl>  <dbl>  <dbl> <dbl>
#>  1 AFG   CHN        1           0             0      0      0      0     0
#>  2 AFG   IRN        1           1             1      0      0      0     0
#>  3 AFG   PAK        1           0             0      0      0      0     0
#>  4 AFG   TJK        1           0             0      0      0      0     0
#>  5 AFG   TKM        1           0             0      0      0      0     0
#>  6 AFG   UZB        1           0             1      0      0      0     0
#>  7 AGO   NAM        1           0             0      0      0      0     0
#>  8 AGO   ZAR        1           0             0      0      0      0     0
#>  9 AGO   ZMB        1           0             0      0      0      0     0
#> 10 ALB   GRC        1           0             0      0      0      0     0
#> # ... with 606 more rows, and 5 more variables: smctry <dbl>, dist <dbl>,
#> #   distcap <dbl>, distw <chr>, distwces <chr>
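
To illustrate how the two weighted measures differ, here is a minimal sketch of a population-weighted generalized mean on toy data. The function name `weighted_distance`, the distance matrix, and the population shares are all hypothetical and are not part of the dataset or of any package; the sketch only assumes that city-pair weights are the product of each city's population share.

```r
# Toy sketch (hypothetical data): generalized mean of city-to-city distances.
# d:     matrix of distances in km; rows = cities of country i, cols = cities of country j
# w_i:   population shares of the cities of country i (sum to 1)
# w_j:   population shares of the cities of country j (sum to 1)
# theta: exponent; 1 gives the arithmetic mean, -1 the harmonic mean
weighted_distance <- function(d, w_i, w_j, theta) {
  stopifnot(nrow(d) == length(w_i), ncol(d) == length(w_j), theta != 0)
  weights <- outer(w_i, w_j)  # weight attached to each city pair
  sum(weights * d^theta)^(1 / theta)
}

# Two hypothetical countries with two cities each, equal population shares
d <- matrix(c(100, 200,
              300, 400), nrow = 2, byrow = TRUE)
w_i <- c(0.5, 0.5)
w_j <- c(0.5, 0.5)

d_arith <- weighted_distance(d, w_i, w_j, theta = 1)   # analogous to distw
d_harm  <- weighted_distance(d, w_i, w_j, theta = -1)  # analogous to distwces
```

With these toy inputs the harmonic version is smaller than the arithmetic one, because the harmonic mean gives more weight to short city-to-city distances.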