Changelog

capybara 2.0.0

I rewrote the rank-revealing Cholesky method in a general form to contribute it back to Armadillo.
I found an error when using summary(lm/glm, type = "clustered") that substantially underestimated the standard errors. This is now fixed, and I merged the “clustered” and “sandwich” types into a single “sandwich” type for clarity and consistency, as both use a bread-meat-bread approach.
The InferenceGLM struct now stores the VCOV matrix and the standard errors to reuse computation and streamline the summary creation. Same for the InferenceLM struct.
All the inference (std. errors, R-squared, etc.) was moved to the C++ side, and the print() and summary() functions just format the output.
Allows formulas without fixed effects, such as y ~ x1 + x2 and y ~ x1 + x2 | 0 | cluster, and formulas without slopes, such as y ~ 0 | fe1 + fe2.
Allows offsets in GLMs.
The default is now predict(glm_object, type = "response"), unlike base R, where the default is type = "link".
Most of the R and C++ code was refactored to use memory efficiently.
Follows fixest-based normalization for fixed effects to match Stata results.
Provides the option to use control = list(centering = "berge") or list(centering = "stammann"). Both methods are equivalent but use different internal logic. Berge’s fixed-point approach is usually faster.
Adds parallelization over columns for an efficient centering regardless of the method used.
Most of the data processing was moved to C++ to ease portability (e.g., a Python version in the future).
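
As a rough illustration of the bread-meat-bread construction mentioned in the sandwich entry above, here is a minimal sketch for a single regressor without intercept. The function name clustered_variance is hypothetical, no small-sample correction is applied, and this is not the package's actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Cluster-robust variance of the slope in y = beta * x + u (no intercept).
// Hypothetical helper for illustration; no small-sample correction.
double clustered_variance(const std::vector<double>& x,
                          const std::vector<double>& y,
                          const std::vector<int>& cluster) {
  assert(x.size() == y.size() && x.size() == cluster.size() && !x.empty());

  // OLS slope: beta = sum(x * y) / sum(x^2).
  double sxx = 0.0, sxy = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    sxx += x[i] * x[i];
    sxy += x[i] * y[i];
  }
  assert(sxx > 0.0);
  const double beta = sxy / sxx;

  // Bread: (X'X)^{-1}, which is a scalar with one regressor.
  const double bread = 1.0 / sxx;

  // Meat: sum over clusters of the squared within-cluster score sum.
  std::map<int, double> score;
  for (std::size_t i = 0; i < x.size(); ++i) {
    score[cluster[i]] += x[i] * (y[i] - beta * x[i]);
  }
  double meat = 0.0;
  for (const auto& kv : score) {
    meat += kv.second * kv.second;
  }

  return bread * meat * bread;
}
```

When every observation is its own cluster, the same function reduces to the heteroskedasticity-robust (HC0) variance, which is one way to see why the two types could be merged.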

capybara 1.8.1

Link to published article and citation info.

capybara 1.8.0

Drops conjugate gradient acceleration and uses Irons-Tuck acceleration instead. It is slightly faster.
The benchmarks show a small overhead compared to fixest, with a much smaller memory footprint.

capybara 1.7.0

All the computation is done on the C++ side. R only does the data cleaning/wrangling.
Implements a rank-revealing Cholesky factorisation like fixest.
Returns estimated fixed effects by default (with an option not to).
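
The idea behind a rank-revealing Cholesky factorisation can be sketched as follows: when the pivot of a column falls below a tolerance, the column is flagged as collinear and skipped instead of aborting. The names CholResult and rank_revealing_chol and the tolerance 1e-10 are assumptions for illustration, not the package's code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

struct CholResult {
  Matrix L;                    // lower-triangular factor; excluded columns stay zero
  std::vector<bool> excluded;  // true for columns flagged as collinear
  std::size_t rank = 0;        // number of columns kept
};

// Cholesky of a symmetric positive semi-definite matrix A (e.g. X'X),
// skipping columns whose pivot is below tol. Hypothetical sketch.
CholResult rank_revealing_chol(const Matrix& A, double tol = 1e-10) {
  const std::size_t p = A.size();
  for (const auto& row : A) assert(row.size() == p);

  CholResult out;
  out.L.assign(p, std::vector<double>(p, 0.0));
  out.excluded.assign(p, false);

  for (std::size_t j = 0; j < p; ++j) {
    // Pivot: what remains of A[j][j] after the previous columns.
    double d = A[j][j];
    for (std::size_t k = 0; k < j; ++k) d -= out.L[j][k] * out.L[j][k];

    if (d < tol) {  // column j is (numerically) a combination of earlier ones
      out.excluded[j] = true;
      continue;
    }

    out.L[j][j] = std::sqrt(d);
    ++out.rank;
    for (std::size_t i = j + 1; i < p; ++i) {
      double s = A[i][j];
      for (std::size_t k = 0; k < j; ++k) s -= out.L[i][k] * out.L[j][k];
      out.L[i][j] = s / out.L[j][j];
    }
  }
  return out;
}
```

For X'X built from columns c1, c2, and c1 + c2, the third pivot is numerically zero, so the third column is flagged and the reported rank is 2.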

capybara 1.6.0

Handles collinearities in the model matrix by using a QR decomposition when Cholesky fails.
It can return NA coefficients when there is collinearity to match base R outputs.

capybara 1.4.0

Adds an extended battery of optional tests for the Poisson model.
Modular code for easier maintenance.

capybara 1.3.0

Explicitly avoids Intel MKL and falls back to OpenBLAS to avoid issues with non-reproducible results.
Uses OpenMP to parallelize the demeaning functions, which can lead to significant speedups in large datasets.
Uses Irons-Tuck acceleration for fast convergence in the demeaning functions.

capybara 1.2.0

Changes to fit and summary functions to report perfectly classified observations.
Dropped linear dependence checks, leaving the Cholesky decomposition to handle linear dependence.

capybara 1.1.0

The workhorse demeaning functions were rewritten for a more efficient implementation. This is based on ppmlhdfe and fixest code.
Loops were avoided and replaced with efficient matrix operations.

capybara 1.0.3

Implements some ideas from reghdfe/ppmlhdfe to improve the centering/demeaning functions.
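
For context, centering/demeaning with two fixed effects can be sketched as alternating sweeps that subtract group means until the means are numerically zero. The function names sweep and center_two_way, the tolerance, and the iteration cap are assumptions for illustration; the package's routines are more elaborate.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Subtracts the group means of v in place and returns the largest absolute
// mean that was removed. Group ids are assumed to be 0, ..., n_groups - 1.
double sweep(std::vector<double>& v, const std::vector<int>& group, int n_groups) {
  assert(v.size() == group.size() && n_groups > 0);
  std::vector<double> mean(n_groups, 0.0);
  std::vector<int> count(n_groups, 0);
  for (std::size_t i = 0; i < v.size(); ++i) {
    assert(group[i] >= 0 && group[i] < n_groups);
    mean[group[i]] += v[i];
    ++count[group[i]];
  }
  double max_shift = 0.0;
  for (int g = 0; g < n_groups; ++g) {
    if (count[g] > 0) {
      mean[g] /= count[g];
      max_shift = std::max(max_shift, std::fabs(mean[g]));
    }
  }
  for (std::size_t i = 0; i < v.size(); ++i) v[i] -= mean[group[i]];
  return max_shift;
}

// Alternates sweeps over two fixed effects until both remove (almost) nothing.
// Returns the number of iterations used.
int center_two_way(std::vector<double>& v,
                   const std::vector<int>& fe1, int n1,
                   const std::vector<int>& fe2, int n2,
                   double tol = 1e-10, int max_iter = 10000) {
  for (int iter = 1; iter <= max_iter; ++iter) {
    double shift = sweep(v, fe1, n1);
    shift = std::max(shift, sweep(v, fe2, n2));
    if (shift < tol) return iter;
  }
  return max_iter;
}
```

With balanced data a single pass suffices; with unbalanced groups the sweeps need several iterations, which is where acceleration schemes help.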

capybara 1.0.2

Small refactors for speed.

capybara 1.0.1

The examples now use smaller datasets to avoid CRAN timeouts with Clang-ASAN.

capybara 1.0.0

Implements a new approach to obtain the rank with a QR decomposition without loss of stability.
Adds different refactors to:
- Streamline the code
- Pass all large objects by reference
- Use BLAS/LAPACK instead of iteration for some operations
Uses a new configure file that works nicely with Intel MKL (i.e. the user does not need to export environment variables for the package to detect MKL).

capybara 0.9.6

Calculates the rank of matrix X based on singular value decomposition instead of QR decomposition. This is more efficient and numerically stable.

capybara 0.9.5

Fixes and expands the ‘weights’ argument in the fe*() functions to allow for different types of weights. The default is still NULL (i.e., all weights equal to 1). The argument now admits weights passed as weights = ~cyl, weights = mtcars$cyl, or w <- mtcars$cyl; weights = w.

capybara 0.9.4

Allows estimating models without fixed effects.

capybara 0.9.3

Fixes the tidy() method for linear models (felm class). It no longer requires loading the tibble package to work.
Adds a wrapper to present multiple models in a single table with the option to export to LaTeX.

capybara 0.9.2

Implements Irons and Tuck acceleration for fast convergence.
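
A minimal sketch of one Irons-Tuck step for a generic fixed-point map g is shown below. The function name irons_tuck_step and the use of std::function are assumptions for illustration; the update formula follows the standard Irons-Tuck extrapolation from x, g(x), and g(g(x)).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using Vec = std::vector<double>;

// One Irons-Tuck extrapolation step for the fixed-point problem x = g(x).
// Hypothetical helper, not the package's internal routine.
Vec irons_tuck_step(const Vec& x, const std::function<Vec(const Vec&)>& g) {
  const Vec gx = g(x);
  const Vec ggx = g(gx);
  assert(gx.size() == x.size() && ggx.size() == x.size());

  Vec delta_gx(x.size());
  double vprod = 0.0, ssq = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    delta_gx[i] = ggx[i] - gx[i];                         // first difference
    const double delta2 = delta_gx[i] - (gx[i] - x[i]);   // second difference
    vprod += delta_gx[i] * delta2;
    ssq += delta2 * delta2;
  }

  // No curvature information: fall back to the plain iterate g(g(x)).
  if (ssq == 0.0) return ggx;

  const double coef = vprod / ssq;
  Vec out(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    out[i] = ggx[i] - coef * delta_gx[i];
  }
  return out;
}
```

For a linear map such as g(x) = 0.5 x + 1, a single step lands on the fixed point x = 2, whereas plain iteration only halves the error each time.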

capybara 0.9.1

Fixes a minor uninitialized variable in the C++ code used for a conditional check.

capybara 0.9

First CRAN version
Refactored functions to avoid data copies:
- center variables
- crossprod
- GLM and LM fit
- get alpha
- group sums
- mu eta
- variance
iter_center_max and iter_inner_max can now be modified in feglm_control().

capybara 0.8.0

Dedicated functions for linear models to avoid the overhead of running the GLM function with a Gaussian link.

capybara 0.7.0

The predict method now accepts new data to predict the outcome.
Fully documented code and tests according to rOpenSci standards.

capybara 0.6.0

Moves all the heavy computation to C++ using Armadillo and exports the results to R. Previously, there were multiple data copies between R and C++ that added overhead to the computations.
Previous versions returned MX by default; now it has to be requested explicitly.
Adds code to extract the fixed effects from felm objects.

capybara 0.5.2

Uses an O(n log(n)) algorithm to compute the Kendall correlation for the pseudo-R2 in the Poisson model.
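
One common way to reach O(n log(n)) is to sort by x and count inversions in y with a merge sort. The sketch below assumes there are no ties (tau-a); the function names are hypothetical and tie handling, which a production implementation needs, is omitted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Counts inversions in a[lo, hi) with a merge sort; sorts that range in place.
long long count_inversions(std::vector<double>& a, std::vector<double>& buf,
                           std::size_t lo, std::size_t hi) {
  if (hi - lo < 2) return 0;
  const std::size_t mid = lo + (hi - lo) / 2;
  long long inv = count_inversions(a, buf, lo, mid) +
                  count_inversions(a, buf, mid, hi);
  std::size_t i = lo, j = mid, k = lo;
  while (i < mid && j < hi) {
    if (a[j] < a[i]) {
      // a[j] is smaller than every remaining element of the left half.
      inv += static_cast<long long>(mid - i);
      buf[k++] = a[j++];
    } else {
      buf[k++] = a[i++];
    }
  }
  while (i < mid) buf[k++] = a[i++];
  while (j < hi) buf[k++] = a[j++];
  std::copy(buf.begin() + static_cast<std::ptrdiff_t>(lo),
            buf.begin() + static_cast<std::ptrdiff_t>(hi),
            a.begin() + static_cast<std::ptrdiff_t>(lo));
  return inv;
}

// Kendall's tau-a, assuming no ties in x or y.
double kendall_tau(const std::vector<double>& x, const std::vector<double>& y) {
  const std::size_t n = x.size();
  assert(y.size() == n && n >= 2);

  // Order the observations by x, then read y in that order.
  std::vector<std::size_t> order(n);
  std::iota(order.begin(), order.end(), std::size_t{0});
  std::sort(order.begin(), order.end(),
            [&](std::size_t a, std::size_t b) { return x[a] < x[b]; });
  std::vector<double> ys(n), buf(n);
  for (std::size_t i = 0; i < n; ++i) ys[i] = y[order[i]];

  // Each inversion in ys is one discordant pair.
  const long long discordant = count_inversions(ys, buf, 0, n);
  const double pairs = static_cast<double>(n) * static_cast<double>(n - 1) / 2.0;
  return 1.0 - 2.0 * static_cast<double>(discordant) / pairs;
}
```

The sort and the merge-based inversion count are each O(n log(n)), compared with the O(n^2) cost of checking every pair directly.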

capybara 0.5.1

Uses arma::field consistently instead of std::vector<std::vector<>> for indices.
Linear algebra changes, such as using arma::inv instead of solving arma::qr for the inverse.
Replaces multiple for loops with dedicated Armadillo functions.

capybara 0.5.0

Avoids for loops in the C++ code, and instead uses Armadillo’s functions.
O(n) computations in C++ access data directly by using pointers.

capybara 0.4.6

Fixes notes from tidyselect regarding the use of all_of().
The C++ code follows a more consistent style.
The GH-Actions do not test gcc 4.8 anymore.

capybara 0.4.5

Ungroups the data to avoid issues with the model matrix

capybara 0.4

Uses R’s C API efficiently to add a few more memory optimizations

capybara 0.3.5

Uses Mat consistently for all matrix operations (avoids vectors)

capybara 0.3

Reduces the memory footprint by ~45% by moving some computation to Armadillo’s side

capybara 0.2

Includes pseudo R2 (same as Stata) for Poisson models

capybara 0.1

Initial CRAN submission.