| feglm {capybara} | R Documentation |
GLM fitting with high-dimensional k-way fixed effects
Description
feglm fits generalized linear models with many high-dimensional fixed effects. Here, "fixed effect" means that the model includes one intercept for each level of each category.
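As a rough illustration of "one intercept for each level", the sketch below uses only base R's glm with explicit dummy variables on the built-in mtcars data; it does not call feglm. Using carb as a count response is an assumption made purely for illustration. feglm is intended for the case where such dummy columns would be too numerous to build explicitly.

```r
# Base R sketch: a Poisson model with one intercept per level of cyl.
# Dropping the global intercept (0 +) gives every level its own dummy column.
fit <- glm(carb ~ 0 + factor(cyl) + wt, data = mtcars, family = poisson(link = "log"))

# Three level-specific intercepts (cyl = 4, 6, 8) plus one slope for wt
coef(fit)
```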
Usage
feglm(
formula = NULL,
data = NULL,
family = gaussian(),
weights = NULL,
beta_start = NULL,
eta_start = NULL,
offset = NULL,
control = NULL
)
Arguments
formula |
an object of class "formula": a symbolic description of the model to be fitted. formula
must be of type response ~ slopes | fixed_effects | cluster.
|
data |
an object of class "data.frame" containing the variables in the model. The expected input is a
dataset with the variables specified in formula and a number of rows at least equal to the number of variables
in the model.
|
family |
a description of the error distribution and link function to be used in the model. Similar to glm.fit, this has to be the result
of a call to a family function. Default is gaussian(). See family for details of family
functions.
|
weights |
an optional string with the name of the prior weights variable in data.
|
beta_start |
an optional vector of starting values for the structural parameters in the linear predictor.
Default is \boldsymbol{\beta} = \mathbf{0}.
|
eta_start |
an optional vector of starting values for the linear predictor. |
offset |
an optional formula or numeric vector specifying an a priori known component to be included in the
linear predictor. If a formula, it should be of the form ~ log(variable).
|
control |
a named list of parameters for controlling the fitting process. See fit_control for details. |
Details
If feglm does not converge, this is often a sign of linear dependence between one or more regressors and a fixed effects category. In this case, carefully inspect your model specification.
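One way to look for this kind of linear dependence before fitting is to compare the rank of an explicit design matrix with its number of columns. The sketch below uses only base R and is not part of feglm; x_bad is a hypothetical regressor constructed as a function of cyl, so it is spanned by the cyl dummies. The approach is practical only when the fixed effects have few enough levels to build the dummy columns.

```r
d <- mtcars
# Hypothetical regressor: a function of cyl alone, hence collinear with the cyl fixed effect
d$x_bad <- 2 * d$cyl

# Explicit design matrix with one dummy column per level of cyl
X <- model.matrix(~ 0 + factor(cyl) + wt + x_bad, data = d)

# A rank below the number of columns signals linear dependence
rank_X <- qr(X)$rank
rank_X < ncol(X)

# Regressors with no variation inside any level of the category are the usual suspects
within_sd <- sapply(d[c("wt", "x_bad")], function(v) max(tapply(v, d$cyl, sd)))
names(within_sd)[within_sd == 0]
```

A regressor flagged this way cannot be separated from the fixed effect and is a candidate for removal from the slopes part of the formula.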
Value
A named list of class "feglm". The list contains the following fifteen elements:
coefficients |
a named vector of the estimated coefficients |
eta |
a vector of the linear predictor |
weights |
a vector of the weights used in the estimation |
hessian |
a matrix with the numerical second derivatives |
deviance |
the deviance of the model |
null_deviance |
the null deviance of the model |
conv |
a logical indicating whether the model converged |
iter |
the number of iterations needed to converge |
nobs |
a named vector with the number of observations used in the estimation, together with the numbers of dropped and perfectly predicted observations |
fe_levels |
a named vector with the number of levels in each fixed effect |
nms_fe |
a list with the names of the fixed effects variables |
formula |
the formula used in the model |
data |
the data used in the model after dropping non-contributing observations |
family |
the family used in the model |
control |
the control list used in the model |
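The elements above can be read with the usual list accessors. The following sketch reuses the first model from the Examples section and assumes the capybara package is installed (the block does nothing otherwise); the element names are those listed in this section.

```r
if (requireNamespace("capybara", quietly = TRUE)) {
  mod <- capybara::feglm(mpg ~ wt | cyl, mtcars, family = poisson(link = "log"))

  # Check convergence before interpreting the estimates
  if (isTRUE(mod$conv)) {
    print(mod$coefficients) # estimated coefficients
    print(mod$iter)         # number of iterations needed to converge
  }

  print(mod$nobs)      # observation counts
  print(mod$fe_levels) # number of levels in each fixed effect
}
```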
References
Gaure, S. (2013). "OLS with Multiple High Dimensional Category Variables". Computational Statistics and Data Analysis, 66.
Marschner, I. (2011). "glm2: Fitting generalized linear models with convergence problems". The R Journal, 3(2).
Stammann, A., F. Heiss, and D. McFadden (2016). "Estimating Fixed Effects Logit Models with Large Panel Data". Working paper.
Stammann, A. (2018). "Fast and Feasible Estimation of Generalized Linear Models with High-Dimensional k-Way Fixed Effects". ArXiv e-prints.
Examples
# Model without clustering - uses inverse Hessian for vcov
mod <- feglm(mpg ~ wt | cyl, mtcars, family = poisson(link = "log"))
summary(mod)
# Model with clustering - uses sandwich vcov automatically
mod <- feglm(mpg ~ wt | cyl | am, mtcars, family = poisson(link = "log"))
summary(mod)