Capybara v1.8.0 is now available on CRAN
Capybara started as an Alpaca clone that uses cpp11armadillo to provide fast, small-footprint software for fitting GLMs with k-way fixed effects.
The software can estimate GLMs from the Exponential Family and also Negative Binomial models, using a demeaning/centering approach that offers a large speedup for models with a large number of fixed effects.
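To give an intuition for the demeaning idea, here is a minimal base R sketch for the simplest case: a linear model with two fixed effects. It is not Capybara's actual C++ code, and the helper name `demean_two_way` and the simulated data are hypothetical. It only shows that removing group means in turn gives the same slope as a regression with explicit dummy variables, without ever building the dummy columns.

```r
# Hypothetical helper: sweep out two sets of group means by
# alternating between them until the values stop changing.
demean_two_way <- function(x, f1, f2, tol = 1e-10, max_iter = 1000L) {
  for (iter in seq_len(max_iter)) {
    x_old <- x
    x <- x - ave(x, f1) # subtract the group means of the first fixed effect
    x <- x - ave(x, f2) # subtract the group means of the second fixed effect
    if (max(abs(x - x_old)) < tol) break
  }
  x
}

# Simulated data (an assumption for illustration, not a real dataset)
set.seed(123)
n <- 2000
f1 <- factor(sample(1:30, n, replace = TRUE))
f2 <- factor(sample(1:20, n, replace = TRUE))
x <- rnorm(n) + as.integer(f1) / 10
y <- 2 * x + as.integer(f1) / 5 - as.integer(f2) / 7 + rnorm(n)

# Slope from the demeaned variables: no dummy columns are created
y_dm <- demean_two_way(y, f1, f2)
x_dm <- demean_two_way(x, f1, f2)
beta_demeaned <- sum(x_dm * y_dm) / sum(x_dm^2)

# Reference slope from a regression with explicit dummies
beta_dummies <- unname(coef(lm(y ~ x + f1 + f2))["x"])

# Both approaches should agree up to numerical tolerance
all.equal(beta_demeaned, beta_dummies, tolerance = 1e-6)
```

With 30 and 20 groups the dummy approach is still cheap, but the design matrix grows with the number of fixed effect levels, while the demeaned vectors always have length `n`.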
Here is a small benchmark for the following specification using a model from An Advanced Guide to Trade Policy Analysis (AGTPA):
\[
\begin{aligned}
X_{ijt} = \exp&\left[\beta_1 \text{RTA}_{ij}^{t-12} + \beta_2 \text{RTA}_{ij}^{t-8} + \beta_3 \text{RTA}_{ij}^{t-4} + \beta_4 \text{RTA}_{ijt} + \right.\\
\:& \left. \pi^{\text{OR}} + \pi^{\text{DE}} + \pi^{\text{DO}} + \pi^{\text{IN86}} + \pi^{\text{IN90}} + \pi^{\text{IN94}} + \right.\\
\:& \left. \pi^{\text{IN98}} + \pi^{\text{IN02}} \right],
\end{aligned}
\]
where:
- \(X_{ijt}\): exports from country \(i\) to country \(j\) at year \(t\)
- \(\text{RTA}_{ijt}\): Regional Trade Agreement between countries \(i\) and \(j\) at time \(t\)
- \(\text{RTA}_{ij}^{t-k}\): RTA between countries \(i\) and \(j\) at time \(t-k\), for \(k = 4, 8, 12\)
- \(\pi^{\text{IN86}}, \pi^{\text{IN90}}, \pi^{\text{IN94}}, \pi^{\text{IN98}}, \pi^{\text{IN02}}\): dummy variables taking the value of one for international trade in the corresponding year (1986, 1990, 1994, 1998, and 2002), and zero otherwise.
- \(\pi^{\text{OR}}, \pi^{\text{DE}}, \pi^{\text{DO}}\): exporter-year, importer-year, and exporter-importer fixed effects
To obtain the model coefficients I used the following formula with fixed effects:
```r
form <- trade ~ rta + rta_lag4 + rta_lag8 + rta_lag12 +
  intl_border_1986 + intl_border_1990 + intl_border_1994 +
  intl_border_1998 + intl_border_2002 |
  exp_year + imp_year + pair_id_2
```

I used the same formula with Alpaca, Fixest and Capybara and the dataset from AGTPA, giving me the following time and memory results:
| Package | Median time (s) | Memory allocated (MB) |
|---|---|---|
| Alpaca | 7.17 | 573.0 |
| Fixest | 0.176 | 78.3 |
| Capybara | 0.612 | 24.4 |
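A comparison of this kind can be set up with the bench package, which reports the median time and allocated memory per expression. The sketch below is only an illustration: the panel is simulated and is not the AGTPA dataset, the names `d` and `form_small` are hypothetical, and the formula is a reduced version of the one above, so the numbers it prints will not match the table. The benchmark itself only runs if bench, alpaca, fixest and capybara are all installed.

```r
# Simulated stand-in panel (NOT the AGTPA data): 20 x 20 country pairs, 6 years
set.seed(42)
d <- expand.grid(exporter = 1:20, importer = 1:20, year = seq(1986, 2006, 4))
d <- d[d$exporter != d$importer, ] # drop domestic flows

# Fixed effect identifiers: exporter-year, importer-year, exporter-importer
d$exp_year <- interaction(d$exporter, d$year, drop = TRUE)
d$imp_year <- interaction(d$importer, d$year, drop = TRUE)
d$pair_id <- interaction(d$exporter, d$importer, drop = TRUE)

# Hypothetical regressor and Poisson outcome
d$rta <- rbinom(nrow(d), 1, 0.3)
d$trade <- rpois(nrow(d), lambda = exp(1 + 0.2 * d$rta))

# Reduced formula with the same "variables | fixed effects" layout
form_small <- trade ~ rta | exp_year + imp_year + pair_id

pkgs <- c("bench", "alpaca", "fixest", "capybara")
if (all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))) {
  res <- bench::mark(
    Alpaca = alpaca::feglm(form_small, data = d, family = poisson()),
    Fixest = fixest::fepois(form_small, data = d),
    Capybara = capybara::fepoisson(form_small, data = d),
    check = FALSE, # the three functions return different object classes
    iterations = 5
  )
  print(res[, c("expression", "median", "mem_alloc")])
}
```

Setting `check = FALSE` is needed because `bench::mark()` otherwise requires all expressions to return identical results, which three different model classes cannot do.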
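Capybara would not exist without Alpaca. In this benchmark it is about 3.5 times slower than Fixest (0.612 s versus 0.176 s) and about 12 times faster than Alpaca, while allocating the least memory of the three (24.4 MB). While Capybara can be improved, I am happy with its current memory efficiency.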
You can install the current Capybara stable version with:
```r
install.packages("capybara")
```

The official documentation is here.
I hope this is useful :)