Quickly create Codeplans of your (labelled) Data #rstats

The view_df() function from the sjPlot-package creates nice "codeplans" from your data sets, and also supports labelled data and tagged NA-values. This gives you a comprehensive, yet clear overview of your data set.

To demonstrate this function, we use a (labelled) data set from the European Social Survey. view_df() produces an HTML file that is displayed in the viewer pane when you use RStudio, or that can be opened in your web browser.
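As a minimal sketch of such a call: the example below does not use the European Social Survey data from the post, but assumes the labelled `efc` sample data set that ships with sjPlot. The arguments `show.frq` and `show.prc` are optional extras for frequencies and percentages.

```r
library(sjPlot)

# Assumption: the labelled sample data set `efc` bundled with sjPlot
# stands in for the European Social Survey data used in the post.
data(efc)

# Build the codeplan; show.frq / show.prc add frequencies and percentages
# of the values to the overview.
codeplan <- view_df(efc, show.frq = TRUE, show.prc = TRUE)

# Printing the object displays the HTML table (viewer pane in RStudio,
# otherwise the web browser).
codeplan
```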

R and labelled data: Using quasiquotation to add variable and value labels #rstats

Labelling data is typically a task for end-users and is done in their own scripts or functions rather than in packages. However, sometimes it can be useful for both end-users and package developers to have a flexible way to add variable and value labels to their data. In such cases, quasiquotation is helpful.

This vignette demonstrates how to use quasiquotation in sjlabelled to label your data.

Adding value labels to variables using quasiquotation

Usually, set_labels() can be used to add value labels to variables. The syntax of this function is easy to use, and set_labels() allows you to add value labels to multiple variables at once, if these variables share the same value labels.
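The plain (non-quasiquotation) use of set_labels() might look like the following sketch. The data frame `dat` and its columns `a` and `b` are made up for illustration; only set_labels() and get_labels() come from sjlabelled.

```r
library(sjlabelled)

# Hypothetical data: two variables that share the same three categories
dat <- data.frame(
  a = c(1, 2, 3, 1, 2),
  b = c(3, 3, 2, 1, 1)
)

# Add the same value labels to both variables in one call
dat <- set_labels(dat, a, b, labels = c("low", "mid", "high"))

# Inspect the value labels that were attached
get_labels(dat)
```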

ggeffects 0.8.0 now on CRAN: marginal effects for regression models #rstats

I’m happy to announce that version 0.8.0 of my ggeffects-package is on CRAN now. The update fixes some bugs from the previous version and comes with many new features and improvements. One major part of the latest version consists of fixes and improvements for mixed models, especially zero-inflated mixed models (fitted with the glmmTMB-package).

In this post, I want to demonstrate the different options to calculate and visualize marginal effects from mixed models.

Marginal Effects for (mixed effects) regression models #rstats

ggeffects (CRAN, website) is a package that computes marginal effects at the mean (MEMs) or representative values (MERs) for many different models, including mixed effects or Bayesian models. One of the advantages of the package is its easy-to-use interface: No matter if you fit a simple or complex model, with interactions or splines, the function call is always the same. This also holds true for the returned output, which is always a data frame with the same, consistent column names.

The latest package update introduced some new features I want to describe here: a revised print()-method as well as a new option to plot marginal effects at different levels of random effects in mixed models…

Marginal Effects for Regression Models in R #rstats #dataviz

Regression coefficients are typically presented in tables that are easy to understand. Sometimes, however, estimates are difficult to interpret. This is especially true for interaction or transformed terms (quadratic or cubic terms, polynomials, splines), in particular for more complex models. In such cases, coefficients are no longer directly interpretable and marginal effects are far easier to understand. Specifically, visualizing marginal effects gives an intuitive idea of how predictors and outcome are associated, even for complex models.

The ggeffects-package (Lüdecke 2018) aims at easily calculating marginal effects for a broad range of different regression models, ranging from classical models fitted with lm() or glm() to complex mixed models fitted with lme4 and glmmTMB or even Bayesian models from brms and rstanarm. The goal of the ggeffects-package is to provide a simple, user-friendly interface to calculate marginal effects, which is mainly achieved by one function: ggpredict(). Regardless of the type of regression model, the output is always the same: a data frame with a consistent structure.
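To illustrate the idea of one call for different model types, here is a small sketch. The models and the built-in mtcars data are chosen only for illustration and are not taken from the post.

```r
library(ggeffects)

# Two different model types fitted to the built-in mtcars data
# (illustrative models, not the ones used in the post)
m_lm  <- lm(mpg ~ wt + hp, data = mtcars)
m_glm <- glm(am ~ wt + hp, data = mtcars, family = binomial())

# The call is the same for both models: predictions across values of `wt`,
# with the remaining predictors held constant
pr_lm  <- ggpredict(m_lm, terms = "wt")
pr_glm <- ggpredict(m_glm, terms = "wt")

# Both results are data frames; compare their column names
colnames(pr_lm)
colnames(pr_glm)

# The returned object has a plot() method that draws the marginal effects
plot(pr_lm)
```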

R functions for Bayesian Model Statistics and Summaries #rstats #stan #brms

A new update of my sjstats-package just arrived on CRAN. This blog post demonstrates those functions of the sjstats-package that deal specifically with Bayesian models. The update contains some new and some revised functions to compute summary statistics of Bayesian models, which are now described in more detail.

  • hdi()
  • rope()
  • mcse()
  • n_eff()
  • tidy_stan()
  • equi_test()
  • mediation()
  • icc()
  • r2()

Before we start, we fit some models, including a mediation-object from the mediation-package, which we use for comparison with brms. The functions work with brmsfit, stanreg and stanfit-objects.

Data transformation in #tidyverse style: package sjmisc updated #rstats

I’m pleased to announce an update for the sjmisc-package, which was just released on CRAN. Here I want to point out two important changes in the package.

New default option for recoding and transformation functions

First, a small change in the code with major impact on the workflow, as it affects argument defaults and is likely to break your existing code – if you’re using sjmisc: The append-argument in recode and transformation functions like rec(), dicho(), split_var(), group_var(), center(), std(), recode_to(), row_sums(), row_count(), col_count() and row_means() now defaults to TRUE.

The reason behind this change is that, in my experience and workflow, when transforming or recoding variables, I typically want to add these new variables to an existing data frame by default. Especially in a pipe-workflow, when I start my scripts with importing and basic tidying of my data, I almost always want to append the recoded variables to my existing data, e.g.:

# Example with the following steps:
# 1. start from a labelled data set (`data` is a placeholder for your data frame)
# 2. drop unused labels
# 3. convert numeric into categorical, using labels as levels
# 4. center some variables
# 5. recode some other variables
library(sjlabelled) # drop_labels(), as_label()
library(sjmisc)     # center(), rec()
library(dplyr)      # %>%

data %>%
  drop_labels() %>%
  as_label(var1:var5) %>%
  center(var7, var9) %>%
  rec(var11, rec = "2=0;1=1;else=copy")
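The effect of the append-argument can be seen in a small sketch like the following. The data frame `dat` and its column `x` are made up for illustration; the recode pattern is the one from the example above.

```r
library(sjmisc)

# Hypothetical data frame with a single numeric variable
dat <- data.frame(x = c(1, 2, 2, 3, 1))

# Default (append = TRUE): the recoded variable is added to the
# existing data frame, using the default suffix "_r"
appended <- rec(dat, x, rec = "2=0;1=1;else=copy")
colnames(appended)

# append = FALSE restores the former behaviour: only the recoded
# variable is returned
only_new <- rec(dat, x, rec = "2=0;1=1;else=copy", append = FALSE)
colnames(only_new)
```

Existing scripts that relied on the old default can be kept working by setting append = FALSE explicitly.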

Bayesian Regression Modelling in R: Choosing informative priors in rstanarm #rstats

Yesterday, at this year's last meeting of the Hamburg R User Group, I had the pleasure of giving a talk about Bayesian modelling and choosing (informative) priors in the rstanarm-package.

You can download the slides of my talk here.

Thanks to the Stan team and Tristan for proofreading my slides prior (<- hoho) to the talk. Disclaimer: Still, I'm fully responsible for the content of the slides, and I'm to blame for any false statements or errors in the code…

"One function to rule them all" – visualization of regression models in #rstats w/ #sjPlot

I’m pleased to announce the latest update of my sjPlot-package on CRAN. Besides some bug fixes and minor new features, the major update is a new function, plot_model(), which is both an enhancement and replacement of sjp.lm(), sjp.glm(), sjp.lmer(), sjp.glmer() and sjp.int(). The latter functions will be deprecated in the next updates and removed at some point in the future.

plot_model() is a "generic" plot function that accepts many model-objects, like lm, glm, lme, lmerMod etc. It offers various plot types, like estimates/coefficient plots (aka forest or dot-whisker plots), marginal effect plots, plots of interaction terms, and some diagnostic plots.

In this blog post, I want to describe how to plot estimates as forest plots.
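A forest plot of estimates can be sketched as follows. The linear model on the built-in mtcars data is only an illustrative stand-in for the models discussed in the post.

```r
library(sjPlot)

# Illustrative model on built-in data (not the model from the post)
fit <- lm(mpg ~ wt + hp + factor(cyl), data = mtcars)

# Default plot type: estimates as a forest (dot-whisker) plot
p <- plot_model(fit)
p

# Optional tweaks: sort the estimates and print their values in the plot
plot_model(fit, sort.est = TRUE, show.values = TRUE)
```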
