Data visualization in social sciences – what’s new in the sjPlot-package? #rstats

My sjPlot package just reached version 2.0 and received many updates over the last couple of months. The focus was less on adding new functions; rather, I improved existing functions by adding smaller and bigger features that make working with the package easier and more reliable. In this blog post, I will report on some of the new features.

Consistent name style of arguments

Most notably, I tried to give all package functions a consistent naming style or pattern for arguments. In previous versions, mixing different name-styles was sometimes very confusing. For example, some functions used showNA, others na.rm or show.na. Or some functions used hideLegend, some showLegend and others again show.legend.

Now, all argument names are 1) lower case, 2) dot-separated for longer words and 3) grouped according to their function (for example, if you open the docs with ?sjt.lm, you’ll find all show. arguments, then all string. arguments and finally all digits. arguments). I know this means that you will most likely have to rewrite code that uses sjPlot function calls, but I think that, in the long run, this makes working with the sjPlot package easier.

Support for different model families and link functions

In previous package versions, functions related to generalized linear models (like sjp.glm or sjp.glmer) were hard-coded for binomial model families for most plot types. Some effect or prediction plots only worked for logistic regression, because predictions were based on plogis. Also, the automatic plot titles always included “probability”, even for count models.

Over the past package updates, and especially with the last major update, prediction and effect plots are now based on the link-inverse function of the model, so all common model families and link functions should now work with sjPlot.
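To illustrate why the link-inverse matters, here is a minimal base-R sketch (not sjPlot's internal code). It assumes a Poisson model fitted to the built-in InsectSprays data, which is not part of the original examples, and shows that family(fit)$linkinv() gives sensible predictions where plogis would not.

```r
# Poisson regression on a built-in data set (count outcome, log link)
fit <- glm(count ~ spray, data = InsectSprays, family = poisson())

# linear predictor on the link scale (here: log of the expected count)
eta <- predict(fit, type = "link")

# back-transform with the family's link-inverse function
mu <- family(fit)$linkinv(eta)

# plogis() would squeeze every prediction into (0, 1), which is wrong for counts
wrong <- plogis(eta)

# the link-inverse reproduces predict() on the response scale
same <- isTRUE(all.equal(unname(mu), unname(predict(fit, type = "response"))))
print(same)
```

Because the back-transformation is taken from the fitted model itself, the same two lines work unchanged for binomial, Poisson or Gamma families.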

Predictions and effect plots

In some cases, it is easier to interpret predicted probabilities, incidence rates or marginal effects than the related estimates (odds ratios, incidence rate ratios, beta). For linear models (sjp.lm), linear mixed models (sjp.lmer), generalized linear models (sjp.glm) and generalized linear mixed models (sjp.glmer), there are three different plot types to plot predicted values or marginal effects:

  1. type = "slope" (or type = "fe.slope" and type = "ri.slope" for mixed models) to plot unadjusted predicted values, i.e. the relation between model terms and response.
  2. type = "eff" to plot marginal effects, adjusted for all predictors.
  3. type = "pred" (and type = "pred.fe" for mixed models) to plot predicted values against response, for particular model terms.

The following examples are taken from the vignette of the sjp.glm-function.

1. Predicted values, unadjusted

The predicted values from this plot type are based on the intercept’s estimate and each specific term’s estimate. All other co-variates are set to zero (i.e. ignored), which corresponds to family(fit)$linkinv(eta = b0 + bi * xi) (where b0 is the intercept’s estimate, bi is the estimate of the term and xi is the value of that term).

[Figure: Predicted values, unadjusted]

A probability curve is plotted for each predictor, which indicates the probability of the event (indicated by the response) occurring for each value of the predictor (not adjusted for the remaining co-variates). In the above example, the first panel in the plot would be interpreted as follows: with increasing Barthel-Index (which means better functional / physical status), the probability that caring for a dependent person is negatively perceived decreases (in short: the less dependent the person I care for is, the less negative the impact of care).
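The formula for the unadjusted predictions, family(fit)$linkinv(b0 + bi * xi), can be sketched in base R as follows. This is only an illustration under assumptions: the built-in mtcars data and the model am ~ wt + hp stand in for the data used in the post, and the code is not what sjp.glm runs internally.

```r
# logistic regression on a built-in data set (stand-in for the post's data)
fit <- glm(am ~ wt + hp, data = mtcars, family = binomial())

b0 <- coef(fit)[["(Intercept)"]]  # intercept's estimate
bi <- coef(fit)[["wt"]]           # estimate of the term of interest
xi <- sort(unique(mtcars$wt))     # observed values of that term

# all other co-variates (here: hp) are ignored, i.e. treated as zero
pred_unadjusted <- family(fit)$linkinv(b0 + bi * xi)

# one predicted probability per observed value of the term
head(data.frame(wt = xi, predicted = pred_unadjusted))
```

Plotting predicted against wt gives the kind of curve shown in one panel of the plot.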

2. Effect plots

For marginal effects (predicted marginal probabilities or predicted marginal incidence rates), all remaining co-variates are set to their mean, so this plot type adjusts for co-variates. The results are based on the effects-package.

[Figure: Marginal effects, adjusted]

The effect plots can now also be non-faceted, and restricted to selected model terms only (using the facet.grid and vars arguments).
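The idea of holding the remaining co-variates at their mean can be sketched by hand in base R. This is a rough approximation for numeric co-variates only; the effects-package has its own handling of factors and interactions, and the mtcars model below is an assumption made for illustration.

```r
# logistic regression on a built-in data set (stand-in for the post's data)
fit <- glm(am ~ wt + hp, data = mtcars, family = binomial())

# grid over the focal term; the remaining co-variate is fixed at its mean
newdata <- data.frame(
  wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 25),
  hp = mean(mtcars$hp)
)

# predicted marginal probabilities on the response scale
newdata$predicted <- predict(fit, newdata = newdata, type = "response")

head(newdata)
```

In contrast to the unadjusted plot type, hp is not dropped here but contributes its average value to every prediction.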

3. Predicting values

The plot type for predicting values did not produce any useful results in former package versions, because it just called the predict function without any relationship to a predictor or to meaningful data. This plot type has now been completely revised. With type = "pred" (formerly "y.pc"), you can plot predicted values for the response, related to specific model predictors. The predicted values of the response are computed, which corresponds to predict(fit, type = "response"). This plot type requires the vars argument to select the specific terms that should be used for the x-axis and, optionally, as grouping factor. Hence, vars must be a character vector with the names of one or two model predictors.

[Figures: Predicting values]
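What this plot type computes can be approximated in base R: predict the response, then relate the predictions to one or two model predictors. The sketch below is an assumption-laden illustration, not sjPlot's implementation; it uses the built-in warpbreaks data, and the vars vector only mimics the role of the vars argument.

```r
# Poisson model on a built-in data set (stand-in for the post's data)
fit <- glm(breaks ~ wool + tension, data = warpbreaks, family = poisson())

dat <- warpbreaks
dat$predicted <- predict(fit, type = "response")

# first name: x-axis term, second name: grouping factor
vars <- c("tension", "wool")

# average predicted response for each combination of the selected terms
pred_by_vars <- aggregate(dat["predicted"], by = dat[vars], FUN = mean)

print(pred_by_vars)
```

Each row of pred_by_vars corresponds to one point in a plot of predicted values against tension, grouped by wool.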

Table functions for mixed models

The table functions were also revised, especially for mixed models. You now have more details in the random parts section of the table, which now also shows the variance components of the random parts, or (pseudo-)r2-values.

The tables are created as an HTML page and displayed in your IDE’s viewer or your web browser. You can see many examples on the package vignettes page. For the following example, I have taken a screenshot, because otherwise the blog’s style sheet would break the table layout. Anyway, this is an example of a quickly produced table:

[Screenshot: example table]

Closing remarks

A lot of improvements have been made to the sjPlot package during the past update(s). Above you see examples of the most obvious user-visible changes, but there were also lots of other smaller and bigger improvements. For example, plotting functions with different plot types, like sjp.glm, have many arguments; most of them applied only to specific plot types and were ignored by the others. Now, all plot types support most or all arguments, and the documentation should be clearer about what the functions and their arguments do.

I hope you’ll enjoy the sjPlot-package. Feel free to submit issues or suggestions to the dedicated GitHub-page.

You Underestimate the Power of the Dark Folgezettel

This post is a reply to Sascha’s post about Folgezettel. I was recently invited by the Niklas-Luhmann-Archiv research group to give an overview of my Zettelkasten and discuss aspects of the technical implementation of Luhmann’s Zettelkasten method. After that, I had the chance to look at the original Zettelkasten and see how Luhmann actually filed notes. It was an interesting insight into Luhmann’s working principle, which showed me that my approach to implementing the Zettelkasten is very similar to what Luhmann did. If you’re interested in this topic, I recommend looking at this presentation about Luhmann’s method (and the Niklas-Luhmann-Archiv-Website, of course).


New Zettelkasten version released

Recently, two updates of the Zettelkasten followed each other in quick succession. On the one hand, several minor but sometimes annoying bugs were fixed. On the other hand, many small and large improvements and new features were added.

One focus was on extending the Folgezettel feature. The details of these changes can be read here. Folgezettel are one of several central organizing principles of a Zettelkasten if you want to follow Luhmann’s way of working. This feature will therefore be developed further in the future. In particular, a separate “Folgezettel window”, similar to the desktop window, is planned to make working with Folgezettel even more effective.

So don’t miss out: download the current version of the Zettelkasten for free!

Introduction to #Luhmann’s #Zettelkasten thinking and its technical implementation

I gave a talk at the Trier Digital Humanities Autumn School 2015 on Luhmann’s way of working with his Zettelkasten, and how I implemented this technique in the electronic Zettelkasten.

The core principle of Luhmann’s way of managing his notes was a combination of selective tagging, manual links between notes, and sequences of short notes with arbitrary branching (“diversification”) of these note sequences (also described in this post).

To the best of my knowledge, there are hardly any tools (or even none?) that facilitate this workflow, except for the Zettelkasten. Please add a comment if you know of such tools, or if you have built your own workflow that imitates Luhmann’s Zettelkasten technique.

If you like, you can download the slides of my talk here: Introduction to Luhmanns Zettelkasten-Thinking (PDF-slides).

Luhmann’s way of working in the electronic Zettelkasten

At this point, I would like to pick up an older post and extend it with current ideas on the topic. It concerns a topical question about a particular form of “knowledge management”, if you will: how did Luhmann’s way of working with the Zettelkasten function, and what could a software solution look like in the digital age?


Publication: Patientenorientierung und vernetzte Versorgung

My doctoral examination procedure has finally been completed successfully, and I would like to take this as an occasion to promote my book. My thesis deals with the governance mechanisms of care networks (that is, cooperation among service providers in the health care system) and the question of how such care networks stabilize themselves and ensure quality of care. The subject is analyzed from a systems-theoretical and network-theoretical perspective, complemented by qualitative empirical analyses. I quote the blurb:


Patient orientation is becoming increasingly important and is regarded as an essential component of improving the quality of care. For service providers, the challenge lies in ensuring patient-oriented care with financially limited resources. Governance mechanisms in networked care must ensure that this is not neglected in favor of the pursuit of profit. This work investigates how care networks can be coordinated and how the participating organizations implement patient orientation.

Lüdecke D (2014) Patientenorientierung und vernetzte Versorgung. Eine qualitative Studie. Berlin, Münster: LIT-Verlag (Homepage)

Beautiful table-outputs: Summarizing mixed effects models #rstats

The current version 1.8.1 of my sjPlot package has two new functions to easily summarize mixed effects models as HTML tables: sjt.lmer and sjt.glmer. Both are very similar, so I focus on showing how to use sjt.lmer here.

# load required packages
library(sjPlot) # table functions
library(sjmisc) # sample data
library(lme4) # fitting models

Linear mixed models summaries as HTML table

The sjt.lmer function prints summaries of linear mixed models (fitted with the lmer function of the lme4-package) as nicely formatted html-tables. First, some sample models are fitted:

# load sample data
data(efc)
# prepare grouping variables
efc$grp <- as.factor(efc$e15relat)
levels(x = efc$grp) <- get_val_labels(efc$e15relat)
efc$care.level <- as.factor(rec(efc$n4pstu, "0=0;1=1;2=2;3:4=4"))
levels(x = efc$care.level) <- c("none", "I", "II", "III")

# data frame for fitted model
mydf <- data.frame(neg_c_7 = as.numeric(efc$neg_c_7),
                   sex = as.factor(efc$c161sex),
                   c12hour = as.numeric(efc$c12hour),
                   barthel = as.numeric(efc$barthtot),
                   education = as.factor(efc$c172code),
                   grp = efc$grp,
                   carelevel = efc$care.level)

# fit sample models
fit1 <- lmer(neg_c_7 ~ sex + c12hour + barthel + (1|grp), data = mydf)
fit2 <- lmer(neg_c_7 ~ sex + c12hour + education + barthel + (1|grp), data = mydf)
fit3 <- lmer(neg_c_7 ~ sex + c12hour + education + barthel +
              (1|grp) +
              (1|carelevel), data = mydf)

The simplest way of producing the table output is by passing the fitted models as arguments. By default, estimates (B), confidence intervals (CI) and p-values (p) are reported. The models are named Model 1 and Model 2. The resulting table is divided into three parts:

  • Fixed parts – the model’s fixed effects coefficients, including confidence intervals and p-values.
  • Random parts – the model’s group count (number of random intercepts) as well as the intraclass correlation coefficient (ICC).
  • Summary – Observations, AIC etc.
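The quantities in the random parts section can also be computed by hand with lme4. The sketch below uses one common definition of the ICC for a random-intercept model (between-group variance divided by total variance); that sjt.lmer computes it in exactly this way is an assumption, and the sleepstudy data bundled with lme4 stands in for the models fitted above.

```r
library(lme4)

# random-intercept model on lme4's bundled sleepstudy data
fit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# variance components as a data frame (one row per component)
vc <- as.data.frame(VarCorr(fit))
tau00  <- vc$vcov[vc$grp == "Subject"]   # between-group (random intercept) variance
sigma2 <- vc$vcov[vc$grp == "Residual"]  # within-group (residual) variance

# intraclass correlation coefficient
icc <- tau00 / (tau00 + sigma2)

# group count, i.e. number of random intercepts
n_groups <- ngrps(fit)

print(icc)
print(n_groups)
```

An ICC close to 1 means that most of the variance in the response lies between groups rather than within them.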


Trust in Design (Designvertrauen)

The Catjects Project

Tribal cultures had trust in magic, ancient high cultures in the gods, and modernity in technology. The next society has trust only in design. But what does “only” mean? Design enables both: an observation in dealing with the world, and an observation of the observers in dealing with the world. In this double function it takes the place of magic, the gods and technology without replacing them entirely. On the contrary, it takes over aspects of these earlier mechanisms of uncertainty absorption and goes beyond them only insofar as it treats certain aspects of the interconnection of humans, environment, technology and society more reflexively than may have been the case before.

For this is the thesis we pursue here. Every society needs a mechanism of uncertainty absorption; and design takes over this function in our society, the next one after modern society… Read more: pdf.

Position paper for the symposium…


sjmisc – package for working with (labelled) data #rstats

The sjmisc-package

My last posting was about reading and writing data between R and other statistical packages like SPSS, Stata or SAS. After that, I decided to bundle all functions that are not directly related to plotting or printing tables, into a new package called sjmisc.

Basically, this package covers three domains of functionality:

  • reading and writing data between other statistical packages (like SPSS) and R, based on the haven and foreign packages; hence, sjmisc also includes functions to work with labelled data.
  • frequently used statistical tests, or at least convenient wrappers for such test functions
  • frequently applied recoding and variable conversion tasks

In this posting, I want to give a short introduction to the labelling features.
