Plotting Likert-Scales (net stacked distributions) with ggplot #rstats

Update: Thanks to Forrest for finding and fixing a bug. The scripts have been updated!

Update 2: The scripts have been updated again because item ordering was still buggy. I hope everything is fixed now. RStudio's new debugging feature was very helpful here: it keeps track of all variables and their contents and allows step-by-step execution of your code.

First of all, credit for this script must go to Ethan Brown, whose ideas for creating Likert-scale-like plots with ggplot form the core of the sjp.likert function in my package.

All I did was some visual tweaking, such as showing positive percentage values on both sides of the x-axis, adding value labels and so on… You can pass many different parameters to modify the graphical output. Please refer to my blog posts on R to get an impression of how to tweak the plot (and/or look into the script header, which includes a description of all parameters).

Now to some examples. First, install the package from CRAN and load it with library(sjPlot). Then run the following code:

library(sjPlot)

# Five simulated items with two answer categories each (500 answers per item)
likert_2 <- data.frame(as.factor(sample(1:2, 500, replace=TRUE, prob=c(0.3,0.7))),
                       as.factor(sample(1:2, 500, replace=TRUE, prob=c(0.6,0.4))),
                       as.factor(sample(1:2, 500, replace=TRUE, prob=c(0.25,0.75))),
                       as.factor(sample(1:2, 500, replace=TRUE, prob=c(0.9,0.1))),
                       as.factor(sample(1:2, 500, replace=TRUE, prob=c(0.35,0.65))))
# Legend labels for the categories and axis labels for the items
levels_2 <- list(c("Disagree", "Agree"))
items <- list(c("Q1", "Q2", "Q3", "Q4", "Q5"))
sjp.likert(likert_2, legendLabels=levels_2, axisLabels.x=items, orderBy="neg")

2-item Likert scale, ordered by "negative" categories.
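
To make the idea of ordering by "negative" categories more concrete, here is a minimal base-R sketch that ranks items by the share of answers in the lower half of the scale. This is only my reading of what such an ordering means, not the code sjp.likert runs internally; the names answers, n_categories, neg_share and item_order are made up for this illustration.

```r
# Sketch only: one way to rank items by their "negative" share.
# This is an assumption about the ordering, not sjPlot's own code.
set.seed(123)  # make the simulated answers reproducible

# Hypothetical 2-category items, simulated like likert_2 above
answers <- data.frame(
  Q1 = sample(1:2, 500, replace = TRUE, prob = c(0.3, 0.7)),
  Q2 = sample(1:2, 500, replace = TRUE, prob = c(0.6, 0.4)),
  Q3 = sample(1:2, 500, replace = TRUE, prob = c(0.25, 0.75)),
  Q4 = sample(1:2, 500, replace = TRUE, prob = c(0.9, 0.1)),
  Q5 = sample(1:2, 500, replace = TRUE, prob = c(0.35, 0.65))
)

n_categories <- 2

# Share of answers that fall into the lower ("negative") half of the scale
neg_share <- sapply(answers, function(x) mean(x <= n_categories / 2))

# Item names from highest to lowest negative share
item_order <- names(sort(neg_share, decreasing = TRUE))

print(round(neg_share, 2))
print(item_order)
```

With these probabilities, Q4 (about 90% "Disagree") should come first and Q2 (about 60%) second; the remaining three items are close together, so their order can change with the random draw.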

What you see above is a scale with two response categories, with the items ordered from the highest share of the "negative" category to the lowest. If you leave out the orderBy parameter, the plot keeps the original item order:

# Five simulated items with four answer categories each
likert_4 <- data.frame(as.factor(sample(1:4, 500, replace=TRUE, prob=c(0.2,0.3,0.1,0.4))),
                       as.factor(sample(1:4, 500, replace=TRUE, prob=c(0.5,0.25,0.15,0.1))),
                       as.factor(sample(1:4, 500, replace=TRUE, prob=c(0.25,0.1,0.4,0.25))),
                       as.factor(sample(1:4, 500, replace=TRUE, prob=c(0.1,0.4,0.4,0.1))),
                       as.factor(sample(1:4, 500, replace=TRUE, prob=c(0.35,0.25,0.15,0.25))))
levels_4 <- list(c("Strongly disagree", "Disagree", "Agree", "Strongly Agree"))
items <- list(c("Q1", "Q2", "Q3", "Q4", "Q5"))
# No orderBy argument: items are plotted in their original order
sjp.likert(likert_4, legendLabels=levels_4, axisLabels.x=items)

4-category Likert scale, in the original item order.

And finally, a plot with a different color set and items ordered from highest positive answer to lowest.

# Five simulated items with six answer categories each
likert_6 <- data.frame(as.factor(sample(1:6, 500, replace=TRUE, prob=c(0.2,0.1,0.1,0.3,0.2,0.1))),
                       as.factor(sample(1:6, 500, replace=TRUE, prob=c(0.15,0.15,0.3,0.1,0.1,0.2))),
                       as.factor(sample(1:6, 500, replace=TRUE, prob=c(0.2,0.25,0.05,0.2,0.2,0.2))),
                       as.factor(sample(1:6, 500, replace=TRUE, prob=c(0.2,0.1,0.1,0.4,0.1,0.1))),
                       as.factor(sample(1:6, 500, replace=TRUE, prob=c(0.1,0.4,0.1,0.3,0.05,0.15))))
levels_6 <- list(c("Very strongly disagree", "Strongly disagree", "Disagree", "Agree", "Strongly Agree", "Very strongly agree"))
items <- list(c("Q1", "Q2", "Q3", "Q4", "Q5"))
# Different color set, items ordered by "positive" categories
sjp.likert(likert_6, legendLabels=levels_6, barColor="brown", axisLabels.x=items, orderBy="pos")

6-category Likert scale with a different color set, ordered by "positive" categories.

If you need to plot stacked frequencies that have no "negative" and "positive" side but run in only one direction, you can use the sjp.stackfrq function instead. Assuming you still have the Likert data frames from the examples above, you can run the following code to plot stacked frequencies for scales that range from "low" to "high" rather than from "negative" to "positive".

levels_42 <- list(c("Independent", "Slightly dependent", "Dependent", "Severely dependent"))
levels_62 <- list(c("Independent", "Slightly dependent", "Dependent", "Very dependent", "Severely dependent", "Very severely dependent"))
sjp.stackfrq(likert_4, legendLabels=levels_42, axisLabels.x=items)
sjp.stackfrq(likert_6, legendLabels=levels_62, axisLabels.x=items)

This produces the following two plots:

Stacked frequencies of 4-category items.

Stacked frequencies of 6-category items.
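
If you want to check the numbers behind such a stacked bar, the percentages per item can be computed with base R alone. The following is a rough sketch under my own assumptions: answers, labels and pct are names invented for this example, and the code does not claim to reproduce what sjp.stackfrq does internally.

```r
# Sketch only: percentages per category and item, as a stacked bar would show them.
set.seed(123)  # make the simulated answers reproducible

# Two hypothetical 4-category items, simulated like the first columns of likert_4
answers <- data.frame(
  Q1 = sample(1:4, 500, replace = TRUE, prob = c(0.2, 0.3, 0.1, 0.4)),
  Q2 = sample(1:4, 500, replace = TRUE, prob = c(0.5, 0.25, 0.15, 0.1))
)

labels <- c("Independent", "Slightly dependent", "Dependent", "Severely dependent")

# One column per item, one row per category; each column sums to 100
pct <- sapply(answers, function(x) {
  counts <- table(factor(x, levels = 1:4, labels = labels))
  100 * prop.table(counts)
})

print(round(pct, 1))
```

Each column of pct holds the segment widths of one bar, from the lowest to the highest category.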

That’s it!


26 thoughts on “Plotting Likert-Scales (net stacked distributions) with ggplot #rstats”

  1. Hi. I really like the script. But two things:
    a) Is it possible to order the last example, like the one with the pos/neg binary data?
    b) I always get an error when I try to use axisLabels.x = items. It works fine when I omit that.

    1. Hi Roman, thanks for your feedback. I assume you have used my latest package version? If so, please refer to the examples, in your case ?sjp.stackfrq. In the meantime I have changed axisLabels.x to axisLabels.y, which will work (so simply replace "x" with "y").
      Regarding your first question: No, it's not yet possible to order the items, but I'll try to include this asap.

    2. Hi Roman, ordering items is included in package version 0.8, which was recently accepted and published on CRAN. I assume binaries will be available via CRAN at the beginning of next week.

  2. Thank you very much for your post. It is very informative and the examples are straightforward.

    However, I ran into a small issue. I have 5 questions and a Likert scale from 1 to 5 (very bad to very good). In one of the questions I had no response with 1, hence that column only has values from 2 to 5. Therefore, when it plots, the function gave the colours of 1-4 to the values 2-5 instead. I hope that makes sense. Can you please advise what can be done to improve such cases?

      1. I have the exact same issue. I have 7-point Likert items and with only 46 participants, many items do not have instances of all 7 possible responses. Any suggestions for working around the arbitrary change in colors?
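
A possible starting point for this colour shift, sketched in base R: when a factor is created without explicit levels, only the observed values become levels, so a missing category silently moves all later ones down by one. Declaring all levels keeps the empty category with a count of zero. Whether sjp.likert then assigns colours by level is an assumption that is not confirmed anywhere in this thread; the data below are made up.

```r
# Hypothetical 5-point item in which nobody chose category 1
x <- c(2, 3, 3, 4, 5, 5, 2, 4)

# Default factor(): only observed values become levels, so "2" is the first level
lv_default <- levels(factor(x))
print(lv_default)

# Explicit levels: category 1 is kept, with a count of zero
x_full <- factor(x, levels = 1:5)
counts <- table(x_full)
print(counts)
```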

  3. Hi, good afternoon. Thank you for your post, it was very useful for me. But I have a question: in one of the items that I am plotting, I am using a Likert scale from 0 to 4, but 0 has a very low frequency and its value label is overlapped by the rest of the plot. I would like to know how to fix this, please… I tried to reduce the valueLabelSize, but it does not work. Thank you for your time.

    1. Hi Maria, thanks for your feedback! I added the parameter "jitterValueLabels" to the functions "sjp.likert", "sjp.xtab" and "sjp.stackfrq" to avoid overlapping value labels on narrow bars. It will be included in the next package update…

    1. If you mean missing values, that’s no problem. Each item (question) needs to have the same length, so you have to declare the missing values as NA. Since the items are passed as data frame, with each column representing one item, you need to have equal row length anyway…
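
The padding described in this reply could look roughly like the following base-R sketch. The vectors group_a and group_b and the helper pad are hypothetical; the point is only that NA fills the shorter column and that table() leaves NA out by default, so the percentages refer to valid answers only. How the plotting functions treat NA is not shown here.

```r
# Two hypothetical groups with different numbers of answers
group_a <- c(1, 2, 2, 3, 4, 4, 4)
group_b <- c(2, 3, 3)

n_rows <- max(length(group_a), length(group_b))

# Hypothetical helper: fill a vector with NA up to n entries
pad <- function(x, n) c(x, rep(NA, n - length(x)))

# Equal-length columns; NA is not one of the declared levels
groups <- data.frame(
  A = factor(pad(group_a, n_rows), levels = 1:4),
  B = factor(pad(group_b, n_rows), levels = 1:4)
)

# Number of valid (non-missing) answers per group
valid_n <- colSums(!is.na(groups))

# table() ignores NA by default, so each column sums to 100
pct <- sapply(groups, function(x) 100 * prop.table(table(x)))

print(valid_n)
print(round(pct, 1))
```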

  4. Hi Daniel!

    I would like to ask: is it possible to plot a Likert graph with groups of different sample sizes? I am struggling with it. I created a data.frame with two groups and made the sizes equal by adding 'NA' to the group with the smaller sample size. The problem is that I then see 'NA' in the legend. Is there a solution for this?

