Chapter 2 Packages and data
2.1 Packages
Required packages for this workshop:
library(haven) # read dta format (Stata)
library(janitor) # tidy column names
library(dplyr) # chained operations
library(sandwich) # robust covariance matrix estimators
library(lmtest) # econometric tests
library(broom) # tidy regression results
2.2 Data
We can read directly from Stata files:
gravity <- clean_names(read_dta("data/gravity-data.dta"))
Now we need to prepare interval data, keeping every fourth year from 1986 to 2006:
gravity2 <- gravity %>%
  filter(year %in% seq(1986, 2006, 4))
We also need to create and transform some variables that will be used later:
gravity2 <- gravity2 %>%
  mutate(
    log_trade = log(trade),
    log_dist = log(dist)
  ) %>%
  # output: sum of trade over all rows of each exporter-year
  group_by(exporter, year) %>%
  mutate(
    output = sum(trade),
    log_output = log(output)
  ) %>%
  # expenditure: sum of trade over all rows of each importer-year
  group_by(importer, year) %>%
  mutate(
    expenditure = sum(trade),
    log_expenditure = log(expenditure)
  ) %>%
  ungroup()
Before concluding data preparation, we need to create pair ID and symmetric
pair ID variables. IMPORTANT: Here we don’t need to build pair_id and
symm_id the way we would in Stata. Pasting the exporter and importer
identifiers together is enough, so the process is much simpler (but other
tasks will be harder!)
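gravity2 <- gravity2 %>%
  mutate(
    # directional pair ID: exporter first, importer second
    pair = paste(exporter, importer, sep = "_"),
    # order the two identifiers alphabetically so that both directions
    # of the same pair share one symmetric ID
    first = ifelse(exporter < importer, exporter, importer),
    second = ifelse(exporter < importer, importer, exporter),
    symm = paste(first, second, sep = "_")
  )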
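To see what the two identifiers look like, here is a minimal sketch on a made-up three-row table. The toy data frame and its country codes are illustrative assumptions and are not taken from the gravity dataset. The sketch builds symm with pmin() and pmax(), which compare character vectors element by element. For plain upper-case codes like these, that should give the same result as the two ifelse() calls.

```r
library(dplyr)

# Hypothetical toy data: both directions of one pair, plus a second pair
toy <- data.frame(
  exporter = c("ARG", "BRA", "CHL"),
  importer = c("BRA", "ARG", "ARG"),
  stringsAsFactors = FALSE
)

toy <- toy %>%
  mutate(
    # directional ID: ARG_BRA and BRA_ARG stay distinct
    pair = paste(exporter, importer, sep = "_"),
    # symmetric ID: alphabetical order, so both directions share one label
    symm = paste(pmin(exporter, importer), pmax(exporter, importer), sep = "_")
  )

toy
```

The first two rows get different pair values but the same symm value, which is the property that lets a symmetric pair effect treat both trade directions as a single unit.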