Plotting Likert-Scales (net stacked distributions) with ggplot #rstats

Daniel · 17 July 2013 / 6 March 2014 · 3 minutes

Update: Thanks to Forrest for finding and fixing a bug. The scripts have been updated!

Update 2: The scripts have been updated again because item ordering was still buggy. I hope everything is fixed now. The new debugging feature of RStudio was very helpful here: it keeps track of all variables and their contents and allows step-by-step execution of your code.

First of all, credit for this script must go to Ethan Brown, whose ideas for creating Likert-scale-like plots with ggplot form the core of the sjp.likert function in my package.

All I did was some visual tweaking like having positive percentage values on both sides of the x-axis, adding value labels and so on… You can pass a lot of different parameters to modify the graphical output. Please refer to my blog postings on R to get some impressions of how to tweak the plot (and/or look into the script header, which includes a description of all parameters).

Now to some examples. First, install the package from CRAN and load it with library(sjPlot). Then run the following code:

# Five items (Q1-Q5), each with two answer categories: 1 = Disagree, 2 = Agree
likert_2 <- data.frame(as.factor(sample(1:2, 500, replace=TRUE, prob=c(0.3,0.7))),
                       as.factor(sample(1:2, 500, replace=TRUE, prob=c(0.6,0.4))),
                       as.factor(sample(1:2, 500, replace=TRUE, prob=c(0.25,0.75))),
                       as.factor(sample(1:2, 500, replace=TRUE, prob=c(0.9,0.1))),
                       as.factor(sample(1:2, 500, replace=TRUE, prob=c(0.35,0.65))))
levels_2 <- list(c("Disagree", "Agree"))
items <- list(c("Q1", "Q2", "Q3", "Q4", "Q5"))
# orderBy="neg" sorts the items by their "negative" categories
sjp.likert(likert_2, legendLabels=levels_2, axisLabels.x=items, orderBy="neg")

2-category Likert scale, ordered by "negative" categories.

What you see above is a scale with two categories, with the items ordered from the highest share of "negative" answers to the lowest. If you leave out the orderBy parameter, the items are plotted in their original order:

# Five items with four answer categories each; no orderBy, so item order is kept
likert_4 <- data.frame(as.factor(sample(1:4, 500, replace=TRUE, prob=c(0.2,0.3,0.1,0.4))),
                       as.factor(sample(1:4, 500, replace=TRUE, prob=c(0.5,0.25,0.15,0.1))),
                       as.factor(sample(1:4, 500, replace=TRUE, prob=c(0.25,0.1,0.4,0.25))),
                       as.factor(sample(1:4, 500, replace=TRUE, prob=c(0.1,0.4,0.4,0.1))),
                       as.factor(sample(1:4, 500, replace=TRUE, prob=c(0.35,0.25,0.15,0.25))))
levels_4 <- list(c("Strongly disagree", "Disagree", "Agree", "Strongly Agree"))
items <- list(c("Q1", "Q2", "Q3", "Q4", "Q5"))
sjp.likert(likert_4, legendLabels=levels_4, axisLabels.x=items)

4-category Likert scale, items in their original order.

And finally, a plot with a different color set and items ordered from highest positive answer to lowest.

# Five items with six answer categories each
likert_6 <- data.frame(as.factor(sample(1:6, 500, replace=TRUE, prob=c(0.2,0.1,0.1,0.3,0.2,0.1))),
                       as.factor(sample(1:6, 500, replace=TRUE, prob=c(0.15,0.15,0.3,0.1,0.1,0.2))),
                       as.factor(sample(1:6, 500, replace=TRUE, prob=c(0.2,0.25,0.05,0.2,0.2,0.2))),
                       as.factor(sample(1:6, 500, replace=TRUE, prob=c(0.2,0.1,0.1,0.4,0.1,0.1))),
                       as.factor(sample(1:6, 500, replace=TRUE, prob=c(0.1,0.4,0.1,0.3,0.05,0.15))))
levels_6 <- list(c("Very strongly disagree", "Strongly disagree", "Disagree", "Agree", "Strongly Agree", "Very strongly agree"))
items <- list(c("Q1", "Q2", "Q3", "Q4", "Q5"))
# barColor selects a different color set; orderBy="pos" sorts by "positive" categories
sjp.likert(likert_6, legendLabels=levels_6, barColor="brown", axisLabels.x=items, orderBy="pos")

6-category Likert scale with a different color set, ordered by "positive" categories.
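
Ordering by "negative" or "positive" categories comes down to sorting the items by a share of answers. The base R sketch below illustrates that idea under the assumption that the lower half of the scale counts as "negative" and the upper half as "positive". It is not taken from the sjp.likert source, and the names items_df, neg_share and pos_share are made up for this example.

```r
# Simulated 4-category items; treating 1-2 as "negative" and 3-4 as
# "positive" is an assumption made for this sketch.
set.seed(42)
items_df <- data.frame(
  Q1 = sample(1:4, 500, replace = TRUE, prob = c(0.2, 0.3, 0.1, 0.4)),
  Q2 = sample(1:4, 500, replace = TRUE, prob = c(0.5, 0.25, 0.15, 0.1)),
  Q3 = sample(1:4, 500, replace = TRUE, prob = c(0.1, 0.4, 0.4, 0.1))
)
n_cat <- 4

# Share of answers in the lower ("negative") half of the scale, per item.
neg_share <- sapply(items_df, function(x) mean(x <= n_cat / 2))
pos_share <- 1 - neg_share

# Item names sorted from the highest to the lowest share.
order_neg <- names(sort(neg_share, decreasing = TRUE))
order_pos <- names(sort(pos_share, decreasing = TRUE))

print(round(100 * neg_share, 1))
print(order_neg)
print(order_pos)
```

With these probabilities Q2 has by far the largest "negative" share, so it comes first when ordering by "negative" categories and last when ordering by "positive" ones.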

If you need to plot stacked frequencies that have no "negative" and "positive" side but run in only one direction, you can use the sjp.stackfrq function instead. Assuming you still have the Likert data frames from the examples above, you can run the following code to plot stacked frequencies for scales that range from "low" to "high" rather than from "negative" to "positive".

levels_42 <- list(c("Independent", "Slightly dependent", "Dependent", "Severely dependent"))
levels_62 <- list(c("Independent", "Slightly dependent", "Dependent", "Very dependent", "Severely dependent", "Very severely dependent"))
sjp.stackfrq(likert_4, legendLabels=levels_42, axisLabels.x=items)
sjp.stackfrq(likert_6, legendLabels=levels_62, axisLabels.x=items)

This produces the following two plots:

Stacked frequencies of 4-category-items.

Stacked frequencies of 6-category-items.
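
To see which numbers such a stacked-frequency plot is built from, the percentages per item and category can be computed in base R. This is only a sketch of the underlying arithmetic, not the code sjp.stackfrq uses; items_df, pct and stacked are names chosen for this example.

```r
# Two simulated 4-category items; explicit levels keep all categories.
set.seed(1)
items_df <- data.frame(
  Q1 = factor(sample(1:4, 500, replace = TRUE, prob = c(0.2, 0.3, 0.1, 0.4)),
              levels = 1:4),
  Q2 = factor(sample(1:4, 500, replace = TRUE, prob = c(0.5, 0.25, 0.15, 0.1)),
              levels = 1:4)
)

# Percentage of each category per item: one row per category, one column per item.
pct <- sapply(items_df, function(x) 100 * prop.table(table(x)))

# Cumulative percentages mark where each stacked segment ends.
stacked <- apply(pct, 2, cumsum)

print(round(pct, 1))
print(round(stacked, 1))
```

Each column of pct sums to 100, so the last row of stacked is 100 for every item.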

That’s it!

Tagged with: ggplot, Likert-Scale, R, rstats

30 comments on "Plotting Likert-Scales (net stacked distributions) with ggplot #rstats"

Forrest R. Stevens says:

19 July 2013 at 07:56

Brilliant stuff.. thank you for sharing your code!

Jason bryer says:

19 July 2013 at 21:07

Nice post. We have put together a package to automate much of what you present here. Check out http://Jason.bryer.org/likert

Roman says:

28 November 2013 at 12:02

Hi. I really like the script. But two things:
a) Is it possible to order the last example, like the one with the pos/neg binary data?
b) I always get an error when I try to use axisLabels.x = items. It works fine when I omit that.

1. Daniel says:
  
  28 November 2013 at 14:50
  
  Hi Roman, thanks for your feedback. I assume you have used my latest package version? If so, please refer to the examples, in your case ?sjp.stackfrq. In the meantime I have changed axisLabels.x to axisLabels.y, which will work (so simply replace "x" with "y").
  Regarding your first question: No, it's not yet possible to order the items, but I'll try to include this asap.
  
2. Daniel says:
  
  29 November 2013 at 22:42
  
  Hi Roman, ordering items is included in package version 0.8, which was recently accepted and published on CRAN. I expect binaries to be available via CRAN at the beginning of next week.
  
Roman says:

29 November 2013 at 23:26

Fantastic Daniel, you’ve done a good job there! I will update the package accordingly.
Thanks
Roman

1. Forrest Stevens says:
  
  30 November 2013 at 00:25
  
  Looks really good, thanks much for the update!
  
Juan LP says:

21 January 2014 at 08:02

Thank you very much for your post. It is very informative and the examples are straightforward.

However, I ran into a small issue. I have 5 questions and a Likert scale from 1-5 (very bad to very good). In one of the questions, I had no response with 1, so that column only has values from 2-5. As a result, when it plots, the function gives the colours of 1-4 to the values 2-5 instead. I hope that makes sense. Can you please advise what can be done to improve such cases?

1. Daniel says:
  
  24 January 2014 at 20:27
  
  Thanks for your feedback! I'll look at the code and try to fix that bug, or, if "working as intended", give a workaround for this.
  
  1. Mahtab says:
    
    27 March 2014 at 03:39
    
    I have the exact same issue. I have 7-point Likert items and with only 46 participants, many items do not have instances of all 7 possible responses. Any suggestions for working around the arbitrary change in colors?
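
A common base R approach when a category never occurs is to declare the full set of levels explicitly, so the empty category keeps its position. The sketch below only shows that factor behaviour; whether sjp.likert then assigns the colours as expected is not confirmed anywhere in this thread, so treat it as something to try rather than a fix.

```r
# A 5-point item in which nobody chose category 1.
answers <- c(2, 3, 5, 4, 2, 5, 3, 4, 4, 2)

# as.factor() only knows the values that occur, so "2" becomes the first level.
f_implicit <- as.factor(answers)
print(levels(f_implicit))

# Declaring the full scale keeps the empty category and its position.
f_explicit <- factor(answers, levels = 1:5)
print(levels(f_explicit))
print(table(f_explicit))
```

In the second factor, category 1 is still listed with a count of zero, while in the first it is missing and every remaining category moves up one position.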
Maria Bonilla says:

12 February 2014 at 23:13

Hi, good afternoon. Thank you for your post, it was very useful for me. But I have a question: In one of the items that I am plotting, I am using a Likert scale from 0-4, but 0 has a very low frequency and its value label is overlapped by the rest of the plot. I would like to know how to fix this, please… I tried to reduce the valueLabelSize, but that does not work. Thank you for your time.

1. Daniel says:
  
  13 February 2014 at 15:25
  
  Hi Maria, thanks for your feedback! I added the parameter "jitterValueLabels" to the functions "sjp.likert", "sjp.xtab" and "sjp.stackfrq" to avoid overlapping of value labels on narrow segments. It will be included in the next package update…
  
  1. Maria Bonilla says:
    
    13 February 2014 at 15:34
    
    Thank you for your answer!
Prestone Adie (@AdiePrestone) says:

6 March 2014 at 13:37

Hi Daniel, is it possible to use your function when I have a different number of responses for different questions, i.e. n=c(x,x+3,x-7)? This applies to situations where some questions were left unanswered by the respondents.

1. Daniel says:
  
  6 March 2014 at 14:17
  
  If you mean missing values, that’s no problem. Each item (question) needs to have the same length, so you have to declare the missing values as NA. Since the items are passed as data frame, with each column representing one item, you need to have equal row length anyway…
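
A minimal base R sketch of what this means in practice: every item gets the same length, with NA standing in for unanswered questions. The helper pad_na is made up for this example; in real survey data each row is one respondent, so the NA would sit in the row of the person who skipped the question rather than at the end.

```r
q1 <- c(1, 2, 2, 4, 3, 1)   # six respondents answered
q2 <- c(4, 4, 3)            # only three answered

# Hypothetical helper: pad a vector with NA up to length n.
pad_na <- function(x, n) c(x, rep(NA, n - length(x)))

n <- max(length(q1), length(q2))
items_df <- data.frame(
  Q1 = factor(pad_na(q1, n), levels = 1:4),
  Q2 = factor(pad_na(q2, n), levels = 1:4)
)

# Number of missing answers per item.
print(colSums(is.na(items_df)))
```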
  
Mike says:

11 March 2014 at 12:36

Is there any way to have an "All survey responses" bar added to the top? Like in Naomi Robbins' example: http://www.amstat.org/sections/srms/proceedings/y2011/Files/300784_64164.pdf

1. Daniel says:
  
  11 March 2014 at 12:43
  
  I’m thinking about completely re-writing this function to get results similar to this graph. I guess I could add a total response category then.
  
albi says:

17 July 2014 at 18:04

Nice! In likert_2 perhaps you have to replace "axisLabels.x=items" with "axisLabels.y=items".

JABSONHPOLIVEIRA (@JABSONHERBER) says:

21 December 2016 at 01:11

Hello Daniel!
Could you please update the code?
It no longer works.
Thanks.

1. Daniel says:
  
  21 December 2016 at 17:20
  
  The current examples for the package can generally be found here:
  http://strengejacke.de/sjPlot/
  
  1. Oliveira says:
    
    4 January 2017 at 05:59
    
    Thank you very much!
Adam Zíka says:

26 January 2017 at 15:50

Hi Daniel!

I would like to ask whether it is possible to plot a Likert graph with groups of different sample sizes. I am struggling with it. What I did was create a data.frame with two groups and make the sizes equal by adding 'NA' to the group with the smaller sample size. The problem is that I then see 'NA' in the legend. Is there a solution for this?

Adam

1. Daniel says:
  
  26 January 2017 at 21:53
  
  Hi Adam, I’m not quite sure where your problem exactly lies… do you have a small reproducible example? You could also send it by email to me.
  
Adam Jauregui says:

18 July 2017 at 23:01

Hi Daniel,

I love the sjp.likert function you've developed here. I am currently trying to plot a five-point Likert scale, which includes a "don't know" option. I have also tried Jason's likert package, and his package has an option where we can plot the bars by a group (on his website, he grouped by country). I have not been able to find a similar parameter for sjp.likert. Can you tell me if it has one?

Thanks,
Adam

Rayssa says:

9 October 2018 at 20:37

Hi Daniel!
First of all thank you so much for your post, it was very important to me!!!
But I'm having trouble reproducing your examples. On your site all the percentages appear on the left side, but not in my plots. And when I try it with my own examples, the percentages on the left do not appear either. How did you make it work?

1. Rayssa says:
  
  13 November 2018 at 20:05
  
  I got it! I updated the packages
  
lichza says:

23 January 2019 at 06:02

Hi Daniel!
Nice package. I am trying to replicate your examples, but your package has been updated since this post.
Are you able to update the example codes to match the updated package?
Thanks!
