Quickly create Codeplans of your (labelled) Data #rstats

The view_df() function from the sjPlot-package creates nice “codeplans” from your data sets. It also supports labelled data and tagged NA-values, giving you a comprehensive yet clear overview of your data set.

To demonstrate this function, we use a (labelled) data set from the European Social Survey. view_df() produces an HTML file that is displayed in the viewer pane when you use RStudio, or that can be opened in your web browser.
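As a rough sketch of what such a call looks like: the European Social Survey data from the full post is not included here, so the example below assumes the efc sample data set bundled with sjPlot as a stand-in.

```r
library(sjPlot)

# Assumption: `efc` (sample data shipped with sjPlot) stands in for the
# European Social Survey data used in the post
data(efc)

# Build the codeplan and keep the result in an object; printing
# `codeplan` shows the HTML table in the viewer pane or web browser
codeplan <- view_df(efc)
```

Calling view_df(efc) directly at the console, without the assignment, displays the table right away.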

Continue reading “Quickly create Codeplans of your (labelled) Data #rstats”

R and labelled data: Using quasiquotation to add variable and value labels #rstats

Labelling data is typically a task for end-users and is done in their own scripts or functions rather than in packages. However, sometimes it can be useful for both end-users and package developers to have a flexible way to add variable and value labels to their data. In such cases, quasiquotation is helpful.

This vignette demonstrates how to use quasiquotation in sjlabelled to label your data.

Adding value labels to variables using quasiquotation

Usually, set_labels() can be used to add value labels to variables. The syntax of this function is easy to use, and set_labels() lets you add value labels to multiple variables at once if these variables share the same value labels.
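A minimal sketch of this standard usage, before any quasiquotation comes into play. The data frame dummy and its label texts are made up for illustration:

```r
library(sjlabelled)

# Hypothetical toy data: three variables on the same 1-4 scale
dummy <- data.frame(
  a = c(1, 2, 3, 4, 1, 2),
  b = c(4, 3, 2, 1, 4, 3),
  c = c(1, 1, 2, 3, 4, 4)
)

# Add the same value labels to `a` and `b` in one call; `c` stays unlabelled
dummy <- set_labels(
  dummy, a, b,
  labels = c("low", "little low", "little high", "high")
)

# Inspect the value labels of a single variable
get_labels(dummy$a)
```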

Continue reading “R and labelled data: Using quasiquotation to add variable and value labels #rstats”

Tagged NA values and labelled data #rstats

sjmisc-package: Working with labelled data

A major update of my sjmisc-package was just released on CRAN. A major change (see changelog for all changes) is the support of the latest release of the haven-package, a package to import and export SPSS, SAS or Stata files.

The sjmisc-package mainly addresses three domains:

  • reading and writing data between other statistical packages and R
  • functions to make working with labelled data easier
  • frequently applied recoding and variable transformation tasks, also with support for labelled data

In this post, I want to introduce the topic of labelled data and give some examples of what the sjmisc-package can do, with a special focus on tagged NA values.
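For readers new to the idea, here is a small sketch of what a tagged NA is, using the haven-package directly rather than sjmisc. The tags "a" and "z" are arbitrary examples; in survey data they might stand for reasons such as "refused" or "not applicable".

```r
library(haven)

# Two different kinds of missing values, distinguished by their tag
x <- c(1, 2, tagged_na("a"), tagged_na("z"))

# Base R still treats tagged NAs as ordinary missing values
is.na(x)

# na_tag() recovers the tag of each element (NA for non-missing values)
na_tag(x)
```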

Continue reading “Tagged NA values and labelled data #rstats”

Beautiful table outputs in R, part 2 #rstats #sjPlot

First of all, I’d like to thank my readers for all the feedback on my last post on beautiful outputs in R. I tried to consider all suggestions, updated the existing table-output-functions and added some new ones, which will be described in this post. The updated package is already available on CRAN.

This posting is divided into two major parts:

  1. the new functions are described, and
  2. the new features of all table-output-functions are introduced (including knitr-integration and office-import)

Read on …

No need for SPSS – beautiful output in R #rstats

Note: There’s a second part of this series here.

About one year ago, I seriously started migrating from SPSS to R. Though I’m still using SPSS (because I have to in some situations), I’m quite comfortable and happy with R now and learnt a lot in the past months. But since SPSS is still very widespread in the social sciences, I get asked every now and then whether I really needed to learn R, because SPSS meets all my needs…

Well, learning R had at least two major benefits for me: 1.) I could improve my statistical knowledge a lot, simply by using formulas, asking why certain R commands do not automatically give the same results as SPSS, reading R resources and papers etc. and 2.) the possibilities for data visualization are way better in R than in SPSS (though SPSS can do well, too…). Of course, there are many more reasons to use R.

Still, one thing I often miss in R is a beautiful output of simple statistics or maybe even advanced statistics. Not always as a plot or graph, but not as “cryptic” console output either. I’d like to have a simple table view, just like the SPSS output window (though the SPSS output is not “beautiful”). That’s why I started writing functions that put the results of certain statistics in HTML tables. These tables can be saved to disk or, even better for quick inspection, shown in a web browser or viewer pane (like the RStudio viewer pane).

All of the following functions are available in my sjPlot-package on CRAN.

Read on …

Easily plotting grouped bars with ggplot #rstats

This tutorial shows how to create diagrams with grouped bar charts or dot plots with ggplot. The groups can also be displayed as facet grids.

Importing the data from SPSS
All following examples are based on an imported SPSS data set. Refer to this posting for more details on how to do that and to my script page to download the scripts. This is important to know because the way the variable and value labels are accessed may depend on whether you use an imported SPSS dataset or not (i.e. you may have to change parameters to get the sample running).

You can, for instance, import your SPSS data like this, if you are using my script:

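# Import the SPSS data set (importSPSS() comes from my script)
efc <- importSPSS("GER_Services_FU_PV_dt.sav")
# Extract the variable labels and the value labels of the imported data
efc_vars <- getVariableLabels(efc)
efc_labels <- getValueLabels(efc)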

The R script
You can download the script from my script page. I will not describe the code in detail because the source code is (hopefully) well commented. Basically, the script just transforms the data from two variables (one count variable with categories and one grouping variable) to fit the ggplot requirements for plotting bar charts. You can use a lot of parameters to change the style of the output, e.g. you can plot bars or dots, dodged or stacked bars, change colors etc., and you don’t need to know how this works in ggplot. You simply pass your “preferred settings” as parameters.
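To give an idea of the kind of transformation involved, here is a minimal sketch. It is not the downloadable script itself: the toy data, the variable names (dependency, gender) and the choice of geom_col() are assumptions made for illustration.

```r
library(ggplot2)

# Hypothetical toy data: one categorical count variable and one grouping variable
dat <- data.frame(
  dependency = factor(c("low", "low", "high", "high", "high", "low", "high", "low")),
  gender = factor(c("male", "female", "female", "male", "female", "female", "male", "female"))
)

# Cross-tabulate both variables and reshape to long format:
# one row per combination of category and group, with its count
counts <- as.data.frame(table(dat$dependency, dat$gender))
names(counts) <- c("dependency", "gender", "n")

# Grouped (dodged) bars; position = "stack" would give stacked bars instead
p <- ggplot(counts, aes(x = dependency, y = n, fill = gender)) +
  geom_col(position = "dodge")
```

Printing p draws the chart. Adding facet_grid(. ~ gender) would show the groups as facets rather than as dodged bars.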

You can include the script via this single line:


Continue reading this post…

Simplify frequency plots with ggplot in R #rstats

Update March 5th
All downloads are now accessible from my script page!

This posting shows how to plot frequency plots using the ggplot-package in R. Compared to SPSS standard outputs, you will learn how to create appealing diagrams ready for use in your papers.

Frequency plots in SPSS
In SPSS, you can create frequencies of variables by using this short script:


which gives you the following overview:


If you add another line to your syntax script, you can plot either bar charts (/BARCHARTS) or histograms (/HIST), too:


which gives you the following results:



Creating graphs like the ones above may seem to take more effort in R, but it is actually almost easier, and the plots look better. The only preparation you need is a general function for plotting frequencies in R.

Continue reading this post…