Negative Binomial Regression for Complex Samples (Surveys) #rstats

The survey-package from Thomas Lumley is a great toolkit when analyzing complex samples. It provides svyglm(), to fit generalised linear models to data from a complex survey design. svyglm() covers all families that are also provided by R’s glm() – however, the survey-package has no function to fit negative binomial models, which might be useful for overdispersed count models. Yet, the package provides a generic svymle() to fit user-specified likelihood estimations. In his book, Appendix E, Thomas Lumley describes how to write your own likelihood-function, passed to svymle(), to fit negative binomial models for complex samples. So I wrote a small „wrapper“ and implemented a function svyglm.nb() in my sjstats-package.

# ------------------------------------------
# This example reproduces the results from
# Lumley 2010, figure E.7 (Appendix E, p256)
# ------------------------------------------
library(sjstats)
library(survey)
data(nhanes_sample)

# create survey design
des <- svydesign(
  id = ~ SDMVPSU,
  strat = ~ SDMVSTRA,
  weights = ~ WTINT2YR,
  nest = TRUE,
  data = nhanes_sample
)

# fit negative binomial regression
fit <- svyglm.nb(total ~ factor(RIAGENDR) * (log(age) + factor(RIDRETH1)), des)

# print coefficients and standard errors
round(cbind(coef(fit), survey::SE(fit)), 2)

#>                                          [,1] [,2]
#> theta.(Intercept)                        0.81 0.05
#> eta.(Intercept)                          2.29 0.16
#> eta.factor(RIAGENDR)2                   -0.80 0.18
#> eta.log(age)                             1.07 0.23
#> eta.factor(RIDRETH1)2                    0.08 0.15
#> eta.factor(RIDRETH1)3                    0.09 0.18
#> eta.factor(RIDRETH1)4                    0.82 0.30
#> eta.factor(RIDRETH1)5                    0.06 0.38
#> eta.factor(RIAGENDR)2:log(age)          -1.22 0.27
#> eta.factor(RIAGENDR)2:factor(RIDRETH1)2 -0.18 0.26
#> eta.factor(RIAGENDR)2:factor(RIDRETH1)3  0.60 0.19
#> eta.factor(RIAGENDR)2:factor(RIDRETH1)4  0.06 0.37
#> eta.factor(RIAGENDR)2:factor(RIDRETH1)5  0.38 0.44

The functions returns an object of class svymle, so all methods provided by the survey-package for this class work – it’s just that there are only a few, and common methods like predict() are currently not implemented. Maybe, hopefully, future updates of the survey-package will include such features.

Advertisements

Kommentar verfassen

Trage deine Daten unten ein oder klicke ein Icon um dich einzuloggen:

WordPress.com-Logo

Du kommentierst mit Deinem WordPress.com-Konto. Abmelden / Ändern )

Twitter-Bild

Du kommentierst mit Deinem Twitter-Konto. Abmelden / Ändern )

Facebook-Foto

Du kommentierst mit Deinem Facebook-Konto. Abmelden / Ändern )

Google+ Foto

Du kommentierst mit Deinem Google+-Konto. Abmelden / Ändern )

Verbinde mit %s