Comparing multiple (g)lm in one graph #rstats

It’s been a while since a user of my plotting-functions asked whether it would be possible to compare multiple (generalized) linear models in one graph (see comment). While it is already possible to compare multiple models as table output, I now managed to build a function that plots several (g)lm-objects in a single ggplot-graph.

The following examples are take from my sjPlot package which is available on CRAN. Once you’ve installed the package, you can run one of the examples provided in the function’s documentation:

# prepare dummy variables for binary logistic regression
y1 <- ifelse(swiss$Fertility<median(swiss$Fertility), 0, 1)
y2 <- ifelse(swiss$Infant.Mortality<median(swiss$Infant.Mortality), 0, 1)
y3 <- ifelse(swiss$Agriculture<median(swiss$Agriculture), 0, 1)

# Now fit the models. Note that all models share the same predictors
# and only differ in their dependent variable (y1, y2 and y3)
fitOR1 <- glm(y1 ~ swiss$Education+swiss$Examination+swiss$Catholic,
              family=binomial(link="logit"))
fitOR2 <- glm(y2 ~ swiss$Education+swiss$Examination+swiss$Catholic,
              family=binomial(link="logit"))
fitOR3 <- glm(y3 ~ swiss$Education+swiss$Examination+swiss$Catholic,
              family=binomial(link="logit"))

# plot multiple models
sjp.glmm(fitOR1, fitOR2, fitOR3)

multiodds1

Thanks to the help of a stackoverflow user, I now know that the order of aes-parameters matters in case you have dodged positioning of geoms on a discrete scale. An example: I use following code in my function ggplot(finalodds, aes(y=OR, x=xpos, colour=grp, alpha=pa)) to apply different colours to each model and setting an alpha-level for geoms depending on the p-level. If the alpha-aes would appear before the colour-aes, the order of lines representing a model may be different for different x-values (see stackoverflow example).

Another more appealing example (not reproducable, since it relies on data from a current research project):
multiodds2

And finally an example where p-levels are represented by different shapes and non-significant odds have a lower alpha-level:
multiodds3

About these ads

11 Gedanken zu “Comparing multiple (g)lm in one graph #rstats

  1. […] article was first published on Strenge Jacke! » R, and kindly contributed to […]

  2. […] Comparing multiple (g)lm in one graph #rstats | Strenge Jacke! […]

  3. Danilo Freire

    Nice addition to the package! Does sjPlot also creates graphs for multilevel (lme4) or Bayesian models (MCMCpack, MCMCglmm, etc)? Thanks! :)

    • I’ve not worked with mixed effect models yet, but I’ll investigate further the packages you mentioned and see if and how lme4-objects may differ from(g) lm-objects (according to their structure). Then I may include such models as well.

  4. Owusu Michael

    Good work!, I like it

  5. I like the tool and contribution.

    I initially thought the plots would compare different GLMS (focusing on varying different link functions)
    with a fixed set of features. Say, a poisson regression, binomial regression, negative binomial regression; for some count dependent variable

    You’ve presented a GLM (single link) with nested subsets of features.

    I’ve stumbled on an article where the claim is p-values for submodels (after model selection) are inflated

    If you believe that, you might want to NOT include the ‘significance stars’

    I really like the transparent confidence intervals.

    • I’m not sure whether I got your point. For my purposes, the aim is not to compare models in order to select the best one, but to have a fixed set of predictors (e.g. social factors like social status, age, gender…) and analyze their impact on different responses which represent different, independent “research questions”. By comparing the models, you could say that a set of common social factors may have a higher impact on A, but do not explain B. A is associated with social factors, B is not, C partly etc.
      This is a common use case in our research, that’s why I wrote this function – is this what you meant with inflation of p-values for sub models?

    • I think, I got your point. In the forthcoming update, you can use various fitted glm’s with fixed features and differing link-functions, and show the results as html-table (function sjt.glm). You then can compare, for instance, AIC values to see which model fits best.

  6. Hi, so if I understand you correctly you’re doing “MANOVA” of sorts, with different response variables? But I was wondering whether your function could be adapted to compare nested models, and graphically see whether the remaining estimates changes when dropping variables? Anyhow, great work!

    • What this function does, is a simple “forest plot” (see this posting), where several fitted models are shown in one graph, instead of plotting one graph for each model. You can compare a common set of predictors (which is the same for each fitted model) with different response variables.
      Comparing models with stepwise inclusion / dropping of predictors is not possible yet, but I’m planning to write such a function as well. So, next thing would be comparing model with the same response, but different (or better: stepwise included / dropped) predictors.

  7. […] el gráfico anterior está extraído de aquí, la bitácora del autor del paquete sjPlot, con el que se ha generado el gráfico […]

Kommentar verfassen

Trage deine Daten unten ein oder klicke ein Icon um dich einzuloggen:

WordPress.com-Logo

Du kommentierst mit Deinem WordPress.com-Konto. Abmelden / Ändern )

Twitter-Bild

Du kommentierst mit Deinem Twitter-Konto. Abmelden / Ändern )

Facebook-Foto

Du kommentierst mit Deinem Facebook-Konto. Abmelden / Ändern )

Google+ photo

Du kommentierst mit Deinem Google+-Konto. Abmelden / Ändern )

Verbinde mit %s