Fitting Generalized Linear Models

Efficient Generalized Linear Model Weighted Fit ("eglm.wfit") fits generalized linear models in a way equivalent to "glm.fit" but in less time. The size of the time reduction depends on the design matrix and the family (or link).

eglm.wfit(
  x,
  y,
  weights = rep.int(1, nobs),
  start = NULL,
  etastart = NULL,
  mustart = NULL,
  offset = rep.int(0, nobs),
  family = gaussian(),
  control = list(),
  intercept = TRUE,
  singular.ok = TRUE,
  reduce = FALSE
)

Arguments

x, y: For eglm.wfit: x is a design matrix of dimension n * p, and y is a vector of observations of length n, or a matrix with n rows.
weights: an optional vector of weights to be used in the fitting process. Should be NULL or a numeric vector. If non-NULL, weighted least squares is used with the supplied weights w (that is, minimizing sum(w*e^2)); otherwise ordinary least squares is used.
start: starting values for the parameters in the linear predictor.
etastart: starting values for the linear predictor.
mustart: starting values for the vector of means.
offset: this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector or matrix of extents matching those of the response. One or more offset terms can be included in the formula instead or as well, and if more than one are specified their sum is used. See model.offset.
family: a description of the error distribution and link function to be used in the model. This can be a character string naming a family function, a family function or the result of a call to a family function. See family for details of family functions.
control: a list of parameters for controlling the fitting process. For eglm.wfit this is passed to glm.control.
intercept: logical value indicating whether an intercept should be included in the null model. Defaults to TRUE.
singular.ok: logical; if FALSE a singular fit is an error.
reduce: logical; if TRUE an alternate design matrix of dimension p * p is used for the fitting instead of the traditional n * p design matrix.
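
To illustrate what a p * p reduction can look like, the following sketch solves a weighted least-squares problem through the cross-products X'WX and X'Wy and compares the result with a QR fit on the full n * p matrix. It uses only base R. It is an assumption about the general idea behind reduce = TRUE, not a description of how eglm.wfit is implemented internally.

```r
# Illustrative sketch only: the package's own reduction may differ.
x <- cbind(1, mtcars$wt)        # n x p design matrix (n = 32, p = 2)
y <- mtcars$mpg                 # response vector of length n
w <- rep(1, nrow(x))            # prior weights (all equal to 1 here)

# Reduce the problem to p x p: X'WX and X'Wy.
# Multiplying a matrix by a length-n vector scales its rows.
xtwx <- crossprod(x * sqrt(w))  # p x p
xtwy <- crossprod(x * w, y)     # p x 1

# Solve the p x p normal equations.
beta_reduced <- drop(solve(xtwx, xtwy))

# Reference: QR-based weighted fit on the full n x p matrix.
beta_qr <- unname(lm.wfit(x, y, w)$coefficients)

all.equal(beta_reduced, beta_qr)
```

The cross-product form only ever factorizes a p * p matrix, which is where time can be saved when n is much larger than p. The trade-off is that forming X'WX squares the condition number, so a QR fit is usually numerically safer for ill-conditioned designs.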

Value

A list that contains the same elements as the output from "glm.fit", with the addition of the logical vector "good", which indicates which observations were used in the fitting process.

Details

eglm.wfit is a workhorse function: it is not normally called directly, but it can be more efficient when the response vector, design matrix and family have already been calculated. Use eglm in most cases.
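
As a sketch of the precomputed-inputs workflow described above, the design matrix, response and family can be built once with base R tools and then handed to a fitting function. The formula mpg ~ wt is only an illustrative choice. glm.fit serves as the reference fit, and the call to eglm.wfit is guarded because it assumes the package providing it is attached.

```r
# Build the inputs once from a formula (illustrative model: mpg ~ wt).
mf  <- model.frame(mpg ~ wt, data = mtcars)
x   <- model.matrix(attr(mf, "terms"), mf)  # n x p design matrix
y   <- model.response(mf)                   # response vector
fam <- gaussian()                           # family object

# Reference fit with base R.
ref <- glm.fit(x, y, family = fam)

# Assumption: eglm.wfit is available in the current session.
if (exists("eglm.wfit", mode = "function")) {
  fit <- eglm.wfit(x, y, family = fam)
  # The coefficients should agree with glm.fit up to numerical tolerance.
  print(all.equal(unname(fit$coefficients), unname(ref$coefficients)))
}
```

Because x, y and fam are computed outside the fitting call, the same objects can be reused for repeated fits, for example with different weights or offsets.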

Examples

x <- cbind(rep(1, nrow(mtcars)), mtcars$wt)
y <- mtcars$mpg
eglm.wfit(x, y)
#> $coefficients
#> [1] 37.285126 -5.344472
#> 
#> $residuals
#>  [1] -2.2826106 -0.9197704 -2.0859521  1.2973499 -0.2001440 -0.6932545
#>  [7] -3.9053627  4.1637381  2.3499593  0.2998560 -1.1001440  0.8668731
#> [13] -0.0502472 -1.8830236  1.1733496  2.1032876  5.9810744  6.8727113
#> [19]  1.7461954  6.4219792 -2.6110037 -2.9725862 -3.7268663 -3.4623553
#> [25]  2.4643670  0.3564263  0.1520430  1.2010593 -4.5431513 -2.7809399
#> [31] -3.2053627 -1.0274952
#> 
#> $fitted.values
#>  [1] 23.282611 21.919770 24.885952 20.102650 18.900144 18.793255 18.205363
#>  [8] 20.236262 20.450041 18.900144 18.900144 15.533127 17.350247 17.083024
#> [15]  9.226650  8.296712  8.718926 25.527289 28.653805 27.478021 24.111004
#> [22] 18.472586 18.926866 16.762355 16.735633 26.943574 25.847957 29.198941
#> [29] 20.343151 22.480940 18.205363 22.427495
#> 
#> $effects
#>                                                                               
#> -113.6497374  -29.1157217   -1.6613339    1.6313943    0.1111305   -0.3840041 
#>                                                                               
#>   -3.6072442    4.5003125    2.6905817    0.6111305   -0.7888695    1.1143917 
#>                                                                               
#>    0.2316793   -1.6061571    1.3014525    2.2137818    6.0995633    7.3094734 
#>                                                                               
#>    2.2421594    6.8956792   -2.2010595   -2.6694078   -3.4150859   -3.1915608 
#>                                                                               
#>    2.7346556    0.8200064    0.5948771    1.7073457   -4.2045529   -2.4018616 
#>         <NA>         <NA> 
#>   -2.9072442   -0.6494289 
#> 
#> $R
#>           [,1]      [,2]
#> [1,] -5.656854 -18.19951
#> [2,]  0.000000   5.44782
#> 
#> $rank
#> [1] 2
#> 
#> $qr
#> $qr
#>             [,1]          [,2]
#>  [1,] -5.6568542 -18.199514334
#>  [2,]  0.1767767   5.447820482
#>  [3,]  0.1767767   0.148230003
#>  [4,]  0.1767767  -0.016055881
#>  [5,]  0.1767767  -0.057356801
#>  [6,]  0.1767767  -0.061027994
#>  [7,]  0.1767767  -0.081219555
#>  [8,]  0.1767767  -0.011466889
#>  [9,]  0.1767767  -0.004124504
#> [10,]  0.1767767  -0.057356801
#> [11,]  0.1767767  -0.057356801
#> [12,]  0.1767767  -0.172999378
#> [13,]  0.1767767  -0.110589098
#> [14,]  0.1767767  -0.119767081
#> [15,]  0.1767767  -0.389599760
#> [16,]  0.1767767  -0.421539139
#> [17,]  0.1767767  -0.407037927
#> [18,]  0.1767767   0.170257160
#> [19,]  0.1767767   0.277639553
#> [20,]  0.1767767   0.237256431
#> [21,]  0.1767767   0.121613854
#> [22,]  0.1767767  -0.072041573
#> [23,]  0.1767767  -0.056439003
#> [24,]  0.1767767  -0.130780659
#> [25,]  0.1767767  -0.131698458
#> [26,]  0.1767767   0.218900467
#> [27,]  0.1767767   0.181270739
#> [28,]  0.1767767   0.296362637
#> [29,]  0.1767767  -0.007795696
#> [30,]  0.1767767   0.065628162
#> [31,]  0.1767767  -0.081219555
#> [32,]  0.1767767   0.063792566
#> 
#> $rank
#> [1] 2
#> 
#> $qraux
#> [1] 1.176777 1.046354
#> 
#> $pivot
#> [1] 1 2
#> 
#> $tol
#> [1] 1e-11
#> 
#> attr(,"class")
#> [1] "qr"
#> 
#> $family
#> 
#> Family: gaussian 
#> Link function: identity 
#> 
#> 
#> $linear.predictors
#>  [1] 23.282611 21.919770 24.885952 20.102650 18.900144 18.793255 18.205363
#>  [8] 20.236262 20.450041 18.900144 18.900144 15.533127 17.350247 17.083024
#> [15]  9.226650  8.296712  8.718926 25.527289 28.653805 27.478021 24.111004
#> [22] 18.472586 18.926866 16.762355 16.735633 26.943574 25.847957 29.198941
#> [29] 20.343151 22.480940 18.205363 22.427495
#> 
#> $deviance
#> [1] 278.3219
#> 
#> $aic
#> [1] 166.0294
#> 
#> $null.deviance
#> [1] 1126.047
#> 
#> $iter
#> [1] 2
#> 
#> $weights
#>  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> 
#> $prior.weights
#>  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> 
#> $df.residual
#> [1] 30
#> 
#> $df.null
#> [1] 31
#> 
#> $y
#>  [1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
#> [16] 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7
#> [31] 15.0 21.4
#> 
#> $converged
#> [1] TRUE
#> 
#> $boundary
#> [1] FALSE
#> 
#> $good
#>  [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [16] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [31] TRUE TRUE
#>