Changelog

capybara 2.0.0

  • Capybara now offers different variance-covariance estimators that do not require calling summary() (this differs from Alpaca). I added a vignette replicating Cameron and Miller (2014) to show how to use 1-way, 2-way, and dyadic clustering.
  • Improved numerical robustness of convergence criteria across all fitting functions (fepoisson_asymmetric(), feglm_fit(), fenegbin_fit()) to handle FMA (Fused Multiply-Add) compiler optimizations on macOS and ARM. Convergence checks now use hybrid absolute + relative tolerance thresholds that scale appropriately, and bias correction now uses stronger ridge regularization for numerical stability. This eliminates intermittent convergence failures across different platforms and compiler configurations.
  • Besides vcov estimation, capybara now allows updating formulas, which is explained in the vcov vignette. TL;DR: you can use fml <- mpg ~ wt | am followed by felm(update(fml, . ~ . | cyl), data = mtcars) and variations of it (the same works for feglm() and for updating clusters).
  • I rewrote the rank-revealing Cholesky general method to contribute it back to Armadillo.
  • I found an error when using summary(lm/glm, type = "clustered") that substantially underestimated the standard errors. This is now fixed, and I merged the “clustered” and “sandwich” types into a single “sandwich” type for clarity and consistency, as both use a bread-meat-bread approach.
  • The InferenceGLM struct now stores the VCOV matrix and the standard errors to reuse computation and
    streamline the summary creation. The same applies to the InferenceLM struct.
  • All the inference (standard errors, R-squared, etc.) was moved to the C++ side, and the print() and summary() functions just format the output.
  • Allows formulas without fixed effects, such as y ~ x1 + x2 and y ~ x1 + x2 | 0 | cluster, and formulas without slopes, such as y ~ 0 | fe1 + fe2.
  • Allows offsets in GLMs.
  • All the computation is done on the C++ side, and model formulas now fail explicitly if there are functions inside them.
  • The default is now predict(glm_object, type = "response"), unlike base R, where the default is type = "link".
  • Most of the R and C++ code was refactored to use memory efficiently.
  • Follows fixest-based normalization for fixed effects to match Stata results.
  • Provides the option to use control = list(centering = "berge") or list(centering = "stammann"). Both methods are equivalent but use different internal logic. Berge’s fixed-point approach is usually faster.
  • Supports Probit and Logit regression.
  • Adds parallelization over columns for efficient centering regardless of the method used.
  • Adds a new function, fepoisson_asymmetric(), to compare coefficients across expectiles (e.g., 10%, 50%, 90%) by weighting positive and negative residuals differently. This is based on “The Tails of Gravity” (10.1016/j.jinteco.2025.104145). The argument expectile_glm_iter_max in fit_control() controls the number of inner GLM iterations per APPML step. Setting it to 1L updates the asymmetric weights at every Newton step instead of only after the inner GLM converges, which typically reduces the total number of iterations needed.
  • The summary_table() function now accepts positioning arguments for LaTeX.
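
The hybrid absolute + relative tolerance check mentioned in the 2.0.0 notes can be sketched in C++ as follows. This is a minimal illustration and not capybara's actual code: the helper name has_converged and the default tolerances are assumptions chosen for the example.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper (not part of capybara's API): returns true when two
// successive objective values agree within abs_tol + rel_tol * scale,
// where scale is the larger of the two magnitudes.
inline bool has_converged(double previous, double current,
                          double abs_tol = 1e-12, double rel_tol = 1e-8) {
  const double diff = std::fabs(current - previous);
  const double scale = std::max(std::fabs(previous), std::fabs(current));
  return diff <= abs_tol + rel_tol * scale;
}
```

The relative term lets the threshold grow with the magnitude of the objective, so tiny rounding differences (such as those introduced by FMA) on large values do not block convergence. The absolute term keeps the check meaningful when the objective is close to zero.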

capybara 1.8.1

  • Link to published article and citation info.

capybara 1.8.0

  • Drops conjugate gradient acceleration and uses Irons-Tuck acceleration instead. It is slightly faster.
  • The benchmarks show a small overhead compared to fixest, but with a much smaller memory footprint.

capybara 1.7.0

  • All the computation is done on the C++ side. R only does the data cleaning/wrangling.
  • Implements a rank-revealing Cholesky factorisation like fixest.
  • Returns estimated fixed effects by default (with an option not to).

capybara 1.6.0

  • Handles collinearities in the model matrix by using a QR decomposition when Cholesky fails.
  • It can return NA coefficients when there is collinearity to match base R outputs.

capybara 1.4.0

  • Adds an extended battery of optional tests for the Poisson model.
  • Modular code for easier maintenance.

capybara 1.3.0

  • Explicitly avoids Intel MKL and falls back to OpenBLAS to avoid issues with non-reproducible results.
  • Uses OpenMP to parallelize the demeaning functions, which can lead to significant speedups in large datasets.
  • Uses Irons-Tuck acceleration for fast convergence in the demeaning functions.

capybara 1.2.0

  • Changes to fit and summary functions to report perfectly classified observations.
  • Dropped linear dependence checks, leaving it to the Cholesky decomposition to handle it.

capybara 1.1.0

  • The workhorse demeaning functions were rewritten towards a more efficient implementation. This is based on ppmlhdfe and fixest code.
  • Loops were avoided and replaced with efficient matrix operations.
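
For readers unfamiliar with demeaning, the idea behind centering a variable on two fixed effects can be sketched with plain alternating projections. This is an illustrative sketch only, not the package's implementation: the function names sweep_group_means and demean_two_way, the zero-based group indices, and the default tolerance are all assumptions, and no acceleration is applied.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Subtract the group means of v in place and return the largest absolute
// mean that was removed (used below as a convergence measure).
double sweep_group_means(std::vector<double>& v,
                         const std::vector<std::size_t>& group,
                         std::size_t n_groups) {
  std::vector<double> mean(n_groups, 0.0);
  std::vector<double> count(n_groups, 0.0);
  for (std::size_t i = 0; i < v.size(); ++i) {
    mean[group[i]] += v[i];
    count[group[i]] += 1.0;
  }
  double max_shift = 0.0;
  for (std::size_t g = 0; g < n_groups; ++g) {
    if (count[g] > 0.0) {
      mean[g] /= count[g];
      max_shift = std::max(max_shift, std::fabs(mean[g]));
    }
  }
  for (std::size_t i = 0; i < v.size(); ++i) v[i] -= mean[group[i]];
  return max_shift;
}

// Alternate between the two fixed effects until both sets of group means
// are (numerically) zero or max_iter is reached.
std::vector<double> demean_two_way(std::vector<double> v,
                                   const std::vector<std::size_t>& fe1,
                                   std::size_t n1,
                                   const std::vector<std::size_t>& fe2,
                                   std::size_t n2,
                                   double tol = 1e-10, int max_iter = 1000) {
  for (int iter = 0; iter < max_iter; ++iter) {
    const double shift1 = sweep_group_means(v, fe1, n1);
    const double shift2 = sweep_group_means(v, fe2, n2);
    if (std::max(shift1, shift2) < tol) break;
  }
  return v;
}
```

On a balanced 2 x 2 panel the loop settles after one pass. Unbalanced panels generally need more sweeps, which is where acceleration schemes become useful.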

capybara 1.0.3

  • Implements some ideas from reghdfe/ppmlhdfe to improve the centering/demeaning functions.

capybara 1.0.2

  • Small refactors for speed.

capybara 1.0.1

  • The examples now use smaller datasets to avoid CRAN timeouts with Clang-ASAN.

capybara 1.0.0

  • Implements a new approach to obtain the rank with a QR decomposition without loss of stability.
  • Adds different refactors to:
    • Streamline the code
    • Pass all large objects by reference
    • Use BLAS/LAPACK instead of iteration for some operations
  • Uses a new configure file that works nicely with Intel MKL (i.e. the user does not need to export environment variables for the package to detect MKL).

capybara 0.9.6

  • Calculates the rank of matrix X based on singular value decomposition instead of QR decomposition. This is more efficient and numerically stable.

capybara 0.9.5

  • Fixes and expands the ‘weights’ argument in the fe*() functions to allow for different types of weights. The default is still NULL (i.e., all weights equal to 1). The argument now admits weights passed as weights = ~cyl, weights = mtcars$cyl, or w <- mtcars$cyl; weights = w.

capybara 0.9.4

  • Allows estimating models without fixed effects.

capybara 0.9.3

  • Fixes the tidy() method for linear models (felm class). It no longer requires loading the tibble package to work.
  • Adds a wrapper to present multiple models in a single table with the option to export to LaTeX.

capybara 0.9.2

  • Implements Irons and Tuck acceleration for fast convergence.

capybara 0.9.1

  • Fixes a minor uninitialized variable in the C++ code used for a conditional check.

capybara 0.9

  • First CRAN version

  • Refactored functions to avoid data copies:

    • center variables
    • crossprod
    • GLM and LM fit
    • get alpha
    • group sums
    • mu eta
    • variance
  • iter_center_max and iter_inner_max can now be modified in feglm_control().

capybara 0.8.0

  • Dedicated functions for linear models to avoid the overhead of running the GLM function with a Gaussian link.

capybara 0.7.0

  • The predict method now allows passing new data to predict the outcome.
  • Fully documented code and tests according to rOpenSci standards.

capybara 0.6.0

  • Moves all the heavy computation to C++ using Armadillo and exports the results to R. Previously, there were multiple data copies between R and C++ that added overhead to the computations.
  • Previous versions returned MX by default; now it has to be requested explicitly.
  • Adds code to extract the fixed effects from felm objects.

capybara 0.5.2

  • Uses an O(n log(n)) algorithm to compute the Kendall correlation for the pseudo-R2 in the Poisson model.

capybara 0.5.1

  • Uses arma::field consistently instead of std::vector<std::vector<>> for indices.
  • Linear algebra changes, such as using arma::inv instead of solving arma::qr for the inverse.
  • Replaces multiple for loops with dedicated Armadillo functions.

capybara 0.5.0

  • Avoids for loops in the C++ code, and instead uses Armadillo’s functions.
  • O(n) computations in C++ access data directly by using pointers.

capybara 0.4.6

  • Fixes notes from tidyselect regarding the use of all_of().
  • The C++ code follows a more consistent style.
  • The GH-Actions do not test gcc 4.8 anymore.

capybara 0.4.5

  • Ungroups the data to avoid issues with the model matrix

capybara 0.4

  • Uses R’s C API efficiently to add a few more memory optimizations

capybara 0.3.5

  • Uses Mat consistently for all matrix operations (avoids vectors)

capybara 0.3

  • Reduces the memory footprint by ~45% by moving some computation to Armadillo’s side

capybara 0.2

  • Includes pseudo R2 (same as Stata) for Poisson models

capybara 0.1

  • Initial CRAN submission.
